Tag: Handicap Formulas

  • More Than a Novelty – AI Score Predictions

    Every golfer has asked the same question on the way to the first tee:
    “What am I going to shoot today?”

    For decades, that question lived somewhere between hope, guesswork, and superstition. Today, AI offers something different — an answer grounded in data.

    If AI is good at anything, it’s making predictions. Nearly everything you experience with AI is a response to input shaped by training on large volumes of real-world information. Golf scores are no different. With enough scores, from enough golfers, across enough courses, tees, conditions, and days, AI can learn to predict what you’re likely to shoot today or tomorrow based on how you’ve played before.

    At first glance, score prediction might sound like a novelty. Interesting. Maybe even impressive. But once you look closer, it turns out to be something much more important.


    The Prediction Is Just the Beginning

    A predicted score, by itself, is just a number.

    The real value lies in what that prediction unlocks.

    Once you can reliably estimate how a golfer is expected to score — hole by hole and tee by tee — entirely new possibilities open up. Many of the biggest challenges in golf, especially around fairness, suddenly become solvable.


    Better Handicaps Start with Bias Elimination

    At the heart of GolfHandicap.ai is a simple belief:
    The best handicap isn’t the most complex — it’s the fairest.

    Traditional handicap systems are man-made formulas, designed with good intentions, but they struggle to balance accuracy, precision, and bias elimination at the same time. As most golfers eventually discover, a system can be accurate on average and still be unfair if it consistently favors certain golfers, tees, or playing conditions.

    AI score prediction changes the conversation. Instead of inferring ability indirectly, we can measure expected performance directly — and then test outcomes against par in a meaningful way. This makes it possible to detect and reduce bias across skill levels, tees, and playing environments.

    That’s why this entire site exists. Fairness doesn’t come from elegant math alone. It comes from understanding expectations — and AI makes that possible.


    Playing Against Yourself: AI Match

    One of the more fun applications of score prediction is AI Match, available as a feature of the Electronic Scorecard in the Golf Mobile Network application.

    In this mode, you’re not playing against par or another golfer — you’re playing against your AI self: a prediction of how you normally perform on that course and tee, under similar conditions.

    Did you beat expectations? Fall short? Match them exactly?

    Suddenly, a casual round becomes a personal challenge. It’s golf, gamified — but rooted in reality, not gimmicks.


    Course and Tee Difficulty: Choosing the Right Challenge

    Most golfers want to be challenged — just not embarrassed.

    Some days you want to stretch yourself on a tougher tee, or see how you’d handle a course you’ve never played. Other days you just want to enjoy the round and stay competitive. Traditionally, tee selection has been driven by ego, habit, or guesswork.

    AI score prediction removes that uncertainty. By showing predicted scores by tee, golfers can make informed choices:

    • Want to play to a target score?
    • Curious how moving back a tee will really affect your round?
    • Looking to balance challenge with enjoyment?

    Instead of guessing, you can choose the tee that fits your game — that day.

    This also gives leagues and courses a clearer picture of true tee difficulty, based on how golfers actually score, not just what’s printed on the scorecard.


    Confidence, Range, and Expectations

    Predictions aren’t just about averages — they’re about range.

    Golfers don’t just want to know what they’ll probably shoot. They want to know:

    • What’s my best-case round?
    • How bad could it get if things go sideways?
    • If I’m chasing a target score, what are my odds?

    Using statistical measures like standard deviation, absolute deviation, and average net score relative to par, AI can estimate not just a prediction but a confidence range: essentially, the likelihood of different outcomes.
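    As a rough illustration of how a range can be derived from past scores, here is a minimal sketch. It assumes a golfer's scores are approximately normally distributed, which is a simplification and not a description of the model behind the AI predictions. The function name, the 10%/90% cut-offs, and the sample scores are all hypothetical.

```python
from statistics import NormalDist, mean, stdev

def score_outlook(past_scores, target):
    """Summarize a likely scoring range, ASSUMING scores are roughly normal."""
    mu = mean(past_scores)
    sigma = stdev(past_scores)                    # sample standard deviation
    dist = NormalDist(mu, sigma)
    return {
        "expected": round(mu, 1),
        "good_day": round(dist.inv_cdf(0.10), 1),  # beaten in about 1 round of 10
        "bad_day": round(dist.inv_cdf(0.90), 1),   # exceeded in about 1 round of 10
        "p_target_or_better": round(dist.cdf(target), 2),
    }

recent = [88, 92, 85, 90, 94, 87, 91, 89]          # made-up scores for illustration
outlook = score_outlook(recent, target=85)
print(outlook)
```

    The "good day" and "bad day" figures answer the best-case and worst-case questions, and the last value is an approximate answer to "what are my odds of shooting my target or better?"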

    Understanding that range builds confidence and sets realistic expectations. It also provides a far better measure of improvement than a single great (or terrible) round ever could.


    AI Ghost (AI Sub): Solving the No-Show Problem

    As discussed in our most recent post, AI score prediction enables something leagues have struggled with for decades: handling no-shows fairly.

    With an AI Sub, a missing golfer is replaced by a prediction of their own hole-by-hole scores. The golfer who shows still plays the matchup they were scheduled for — not par, and not a random ghost.

    Handicaps remain intact. A/B positions don’t get confused. League rules still apply.

    This simply isn’t possible without AI score prediction — and it’s one of the clearest examples of prediction being far more than a novelty.


    Why This Is Different from “Just Another Formula”

    AI score prediction isn’t trying to sync with a particular handicap formula. It isn’t chasing averages or potential. It’s doing something simpler — and more powerful.

    It’s answering a single question:

    Given everything we know, what is this golfer most likely to shoot next?

    That distinction matters. It’s why AI adapts to trends, course and tee difficulty, and individual golfer behavior in ways static formulas never can.


    And This Is Only the Beginning

    Perhaps the most exciting part of AI score prediction is this:
    we don’t yet know all the ways it will be used.

    Every time we explore the data, new opportunities emerge — new features, new insights, new ways to make the game fairer, more engaging, and more fun. Many are already in development. Many more haven’t been imagined yet.

    That’s the difference between a static formula and a learning system.

    AI score predictions aren’t just another golf stat. They’re a foundation — one that supports fair handicaps, better decisions, smarter leagues, and a more honest understanding of how we actually play the game.

    And we’re just getting started.


    January 24, 2026

    Stu Healey, President

    Handicomp, Inc.

  • Understanding Bias in Handicapping

    When golfers talk about handicaps, the conversation usually centers on accuracy (how close the numbers are to reality) or precision (how consistently the results hold up). Both are valuable — but the real foundation of fairness lies in something deeper: bias elimination.

    A handicap system can be accurate on average and precise in its calculations, yet still unfair if it consistently favors some golfers over others. That tilt is bias. Unlike random error, bias is systematic — it shapes outcomes in one direction, rewarding some while penalizing others.


    What the Data Shows

    We analyzed scores from the same 16-round, 48-golfer, 18-hole league referenced in our previous blog post. The league played from three sets of tees — Green, White, and Blue — with golfers divided into two flights: Birdie and Bogey. For each, we compared four handicap formulas: Custom, HGHS, AI, and WHS™. (As a reminder, Custom averages the middle 3 of the last 5 scores.)
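    For readers who want to see the Custom formula in concrete terms, here is a minimal sketch. It assumes "middle 3 of the last 5" means dropping the best and worst of the five most recent scores, and that the handicap is that average minus par. Both are our reading for illustration; the function name is hypothetical.

```python
def custom_handicap(scores, par=72):
    """Average the middle 3 of the last 5 scores, then subtract par (assumed)."""
    if len(scores) < 5:
        raise ValueError("need at least 5 scores")
    last_five = sorted(scores[-5:])
    middle_three = last_five[1:4]          # drop the lowest and the highest
    return sum(middle_three) / 3 - par

# Made-up scores, oldest first; only the last five are used.
print(round(custom_handicap([95, 88, 91, 102, 86, 90]), 1))   # 17.7
```

    Note how the 102 has no effect on the result: trimming the extremes keeps one blow-up round from moving the handicap.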

    The test was straightforward: How do average net scores compare to par (72)?

    • If the average net is close to 72 → the system is fair.
    • If it consistently runs high or low → that’s bias.
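    The test above can be sketched in a few lines. The data layout (pairs of gross score and handicap strokes) and the sample numbers are assumptions for illustration only, not league data.

```python
from statistics import mean

PAR = 72

def net_bias(rounds, par=PAR):
    """Mean net score minus par; a positive result means nets run high."""
    nets = [gross - handicap for gross, handicap in rounds]
    return mean(nets) - par

# (gross score, handicap strokes) pairs -- made-up numbers
sample = [(90, 17), (85, 14), (99, 24), (92, 19)]
bias = net_bias(sample)
print(f"{bias:+.2f}")   # +1.00
```

    A result near zero suggests the formula is fair for that group of rounds. A result that stays positive or negative across many rounds is the bias discussed here.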

    Green Tee (43 scores, smallest dataset):

    • AI and Custom: close to par, sometimes a little under.
    • HGHS: low for good golfers, high for higher handicaps (–1.3 to +1.4).
    • WHS™: consistently high (+2.3 to +3.7).
      ⚠️ With only 43 scores, confidence is limited, but the trend matches other tees.

    White Tee (271 scores, largest dataset):

    • AI and Custom: nearly neutral (–0.5 to +0.4).
    • HGHS: slightly high (+1.5 to +1.9).
    • WHS™: heavily upward biased (+3.7 to +5.3).
      ✅ With 271 scores, this is the anchor evidence: WHS™ systematically tilts results upward, while AI and Custom remain closer to fair.

    Blue Tee (87 scores, mid-sized dataset):

    • AI: close to par (+0.2 to +0.6).
    • Custom: slightly high (+0.3 to +0.5).
    • HGHS: consistently high (+2.0 to +3.6).
    • WHS™: the most biased (+3.5 to +7.9).
      Results mirror the White Tee, reinforcing the conclusion.

    Tee and Flight Bias

    Bias doesn’t just show up across formulas — it also shows up across tees and golfer flights:

    • Tee Bias: On the tougher Blue tees, HGHS and WHS™ under-adjust, leaving average net scores well above par (as much as +3.6 and +7.9, respectively). On the easier Green tees, WHS™ in particular still inflates net scores (+2.3 to +3.7). By contrast, AI and Custom hold closest to par, and AI does so with lower standard deviation and absolute deviation, which indicates it is not just fairer but also more consistent.
    • Flight Bias: Higher-handicap golfers (Bogey Flight) suffered most under WHS™, with average net scores climbing as high as 7.9 strokes over par. That's a clear sign of systemic unfairness. Custom held closer to par, while AI not only kept both Birdie and Bogey flights balanced but also delivered tighter results round to round.

    This reinforces what we saw in the previous blog: AI not only leads on accuracy and precision, it also outperforms on fairness. In short, AI edges out Custom by combining balance with consistency, while WHS™ consistently fails both tests.


    Why Does Custom Fare Well?

    Custom, as an average-based system, performs well because a league is a contained environment. Golfers usually compete under the same structure, on the same course, and from consistent tees, which removes many of the outside variables that complicate handicapping. In that setting, a simple average of recent scores tracks reality closely and fairly, without overcorrecting.

    But AI goes further. By learning from historical patterns and factoring in variables such as scoring trends, course conditions, and golfer tendencies, it can anticipate shifts that a simple average misses. That’s why AI not only stays fair like Custom but also delivers tighter accuracy and precision.


    Score Usage Bias

    Another source of distortion is score usage bias.

    • WHS™ includes all rounds — both league and outside play.
    • Custom, HGHS, and AI use only league rounds.

    That difference matters. League rounds are structured and competitive, making them directly comparable across golfers. Casual rounds vary widely — away courses, easier setups, looser play, different intensity. By blending them in, WHS™ creates handicaps that don’t reflect league play, giving golfers an uneven match.


    Potential vs. Average

    Handicap systems don’t all measure the same thing:

    • WHS™ & HGHS (Potential-Based): Designed to reflect what you could shoot on a good day by dropping poor scores and weighting toward upside. In practice, this punishes inconsistent golfers and rewards steady ones, often inflating net scores — especially for higher-handicap players with more variability.
    • Custom & AI (Average-Based): These reflect what golfers actually score, good and bad included. By smoothing overall performance, handicaps stay closer to real scoring tendencies. In practice, this keeps net scores near par — the very definition of fairness.
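    To see why dropping poor scores affects inconsistent golfers more, consider this simplified sketch. It uses a plain "average of the lowest 8 of 20 raw scores" as a stand-in for a potential-based handicap. That is an assumption for illustration and is not the WHS™ or HGHS calculation, which involve more than raw scores. The two golfers are invented and share the same 90 average.

```python
from statistics import mean

def average_based(scores, par=72):
    return mean(scores) - par

def potential_based(scores, keep=8, par=72):
    """Average only the lowest `keep` scores (simplified stand-in, see above)."""
    return mean(sorted(scores)[:keep]) - par

def expected_net_over_par(scores):
    """How far over par this golfer nets, on average, with a potential-based handicap."""
    return average_based(scores) - potential_based(scores)

steady = [89, 90, 91] * 6 + [90, 90]      # 20 rounds, narrow spread
streaky = [80, 90, 100] * 6 + [90, 90]    # 20 rounds, same average, wide spread

print(expected_net_over_par(steady))      # 0.75
print(expected_net_over_par(streaky))     # 7.5
```

    Both golfers average 90, yet under this simplified potential-based handicap the streaky golfer would net about 7.5 over par on a typical day while the steady golfer nets under 1 over.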

    So does potential vs. average change the bias discussion? No — it sharpens it. Dropping “bad” rounds may sound fair in theory, but the data shows it creates more bias. Average-based systems track reality better, especially in league play where fairness matters most.


    Testing for Bias

    Bias isn’t always obvious, which is why testing is essential. A fair handicap system should pass a few core checks:

    1. Net vs. Par: Mean net scores should hover near par (±0.5).
    2. Group Comparisons: Results should be fair across men and women, low- and high-handicappers, steady and inconsistent golfers.
    3. League vs. Non-League: Adding outside scores shouldn’t dramatically shift handicaps.
    4. Error Direction: Errors shouldn’t consistently skew high or low.
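    Checks 1, 2, and 4 can be combined into one small report. This is a sketch under stated assumptions: the group names, the net scores, and the dictionary layout are hypothetical, and the ±0.5 tolerance is taken from check 1.

```python
from statistics import mean

def bias_report(nets_by_group, par=72, tolerance=0.5):
    """Per group: mean error vs. par, share of rounds over par, and a pass flag."""
    report = {}
    for group, nets in nets_by_group.items():
        errors = [n - par for n in nets]
        avg = mean(errors)
        report[group] = {
            "mean_error": round(avg, 2),
            "share_over_par": round(sum(e > 0 for e in errors) / len(errors), 2),
            "within_tolerance": abs(avg) <= tolerance,
        }
    return report

nets = {                                   # made-up net scores
    "Birdie flight": [71, 73, 72, 70, 74],
    "Bogey flight": [75, 78, 74, 77, 76],
}
report = bias_report(nets)
for group, row in report.items():
    print(group, row)
```

    In this invented example the Birdie flight passes, while the Bogey flight averages 4 over par with every round above par, which is the one-directional pattern that check 4 is meant to catch.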

    Correcting Bias:

    • Use comparable scores → base league handicaps on league rounds only.
    • Calibrate tees properly → always adjust for tee difficulty.
    • Don’t overweight potential → dropping too many rounds punishes inconsistent golfers.
    • Monitor outcomes → regularly test net averages across groups.
    • Leverage AI → machine learning detects subtle patterns of bias and adapts faster than static formulas.

    Why It Matters

    Golfers will forgive small misses in accuracy or precision. What they won’t forgive is the feeling that the system is rigged. Eliminating bias is what builds trust — and trust is what keeps golfers engaged, leagues healthy, and competition meaningful.


    ✅ Takeaway

    In our previous blog post, “What is the Best Handicap Formula for My Golf League?” the results showed that AI was the clear winner:

    • Lowest Standard Deviation → most consistent week to week.
    • Lowest Absolute Deviation → closest match to reality.
    • Net Scores Near Par → golfers consistently “played to their handicap.”

    In this post, we build on that foundation by showing that AI is also the least biased formula:

    • AI delivers the most balanced, unbiased results.
    • Custom is fair and average-based, but less accurate and precise.
    • HGHS trends high, though less extreme than WHS™.
    • WHS™ is consistently biased upward, especially for higher-handicap golfers.

    Across three tees, two flights, and hundreds of scores, the message is clear: bias — not accuracy or precision — is the real test of fairness. Filtering out “bad” scores may sound logical, but in practice it tilts the system. Average-based methods keep competition closer to par, and AI — trained on two decades of real league data — delivers the fairest, most trustworthy handicap of all.


    September 21, 2025

    Stu Healey, President

    Handicomp, Inc.

  • Average of Last X Scores

    Finding the Optimal Number of Scores for an Average-Based Handicap

    When golfers are first asked their handicap, the common response is to say they average a certain score — for example:

    “I usually shoot 90 for 18 holes, so subtract par 72 and that gives me a handicap of 18.”

    In their mind, that makes perfect sense. Golfers instinctively understand that handicaps should reflect what they typically score, not just what they shoot on their best day.

    But that leads to an important question:

    If handicaps are based on averages…

    how many scores should be included in the average?

    Too few scores and the handicap becomes overly reactive, swinging wildly from round to round.

    Too many scores and the handicap becomes diluted, slow to recognize improvement or decline.

    Somewhere in the middle lies an optimal balance between:

    • accuracy,
    • precision,
    • responsiveness,
    • and fairness.

    So we decided to test it.


    The Test

    Using a large volume of league scores, we generated handicap calculations based on averaging the golfer’s last:

    • 1 score
    • 2 scores
    • 3 scores
    • all the way to 20 scores

    We then measured:

    • Mean → bias relative to par
    • Standard Deviation → consistency and precision
    • Absolute Difference → closeness to actual outcomes (accuracy)

    The lower the Standard Deviation and Absolute Difference, the better the formula performs.
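    For readers who want to try this on their own scores, here is a minimal sketch of the sweep for a single golfer. It assumes the handicap is simply the average of the last X scores minus par, so that net score minus par equals the actual score minus that average. The function names are hypothetical, and the actual test pooled many golfers rather than one.

```python
from statistics import mean, pstdev

def evaluate_window(history, window):
    """Score each round against the average of the previous `window` rounds."""
    errors = []
    for i in range(window, len(history)):
        predicted = mean(history[i - window:i])
        errors.append(history[i] - predicted)      # same as net score minus par
    return {
        "mean": mean(errors),                       # bias relative to par
        "std_dev": pstdev(errors),                  # consistency and precision
        "abs_diff": mean([abs(e) for e in errors]), # accuracy
    }

def sweep(history, max_window=20):
    """Evaluate every window from 1 to max_window that the history can support."""
    return {w: evaluate_window(history, w)
            for w in range(1, max_window + 1) if len(history) > w}

scores = [88, 92] * 5                               # made-up, oldest first
for w, stats in sweep(scores, max_window=3).items():
    print(w, stats)
```

    With real data, the window whose standard deviation and absolute difference stop improving is the candidate for the optimal number of scores.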

    The testing was separated into:

    • Men’s 18-hole
    • Men’s 9-hole
    • Women’s 18-hole
    • Women’s 9-hole

    The results were remarkably consistent across all four datasets.



    The Results

    Across all four groups, the same overall pattern emerged:

    • Very small score samples (1–3 rounds) produced highly volatile handicaps.
    • As additional scores were included, variability dropped and predictions tightened.
    • Around the middle range, performance stabilized.
    • Beyond that point, additional scores produced diminishing returns while responsiveness began to decline.

    The strongest overall performance consistently clustered around:

    13 scores

    That consistency across genders, hole formats, and golfer populations is important.

    It strongly suggests this is not coincidence, but a natural balance point between randomness and stability.

    By the time the calculation reached roughly 13 rounds:

    • Mean remained close to neutral,
    • Standard deviation flattened,
    • Absolute difference stabilized,
    • and additional scores added very little predictive benefit.

    In short:

    13 scores appeared to provide the best balance between stability and responsiveness.


    Why 13 Makes Sense

    What makes this especially interesting is that windows of roughly 13 observations appear repeatedly in other fields that rely on prediction.

    Not because 13 is magical, but because many systems naturally converge around an optimal balance between signal, noise, recency, and stability when using roughly that amount of historical information.

    Examples include:

    Finance

    Financial trend models frequently rely on 12–14 period moving averages, 13-week trend windows, and quarterly rolling models to smooth volatility while remaining responsive to change.

    Insurance & Actuarial Modeling

    Risk models seek enough historical observations to stabilize prediction without overweighting outdated behavior.

    Sports Analytics

    Player projection systems often use rolling windows in this range because:

    • too little data overreacts,
    • too much data ignores trend.

    Machine Learning & Time-Series Forecasting

    Predictive systems repeatedly discover that:

    • recent behavior matters most,
    • but enough history is needed to separate randomness from true pattern.

    Golf appears to behave similarly.


    Why Not Use More Scores?

    This is where many handicap systems begin to struggle.

    Using too many scores creates what is better described as:

    score dilution.

    As more and more rounds are added into the calculation, each individual score carries less weight. The result is a handicap that becomes increasingly anchored to long-term history instead of current reality.

    Older rounds continue influencing the handicap long after they stop representing the golfer:

    • old injuries,
    • old swing mechanics,
    • previous skill levels,
    • different playing frequency,
    • or even entirely different competitive environments.

    The problem isn’t merely slow reaction.

    The problem is dilution of meaningful change.

    A golfer improves…
    but the handicap remains inflated by older poor rounds.

    Or the golfer declines…
    and the handicap stays artificially low because stronger historical scores continue pulling it down.

    Either way, the handicap stops reflecting who the golfer is now.
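    Dilution is easy to quantify. The sketch below counts how many rounds at a new level it takes before a plain average of the last N scores lands within one stroke of that new level. The golfer (improving from 95 to 85), the one-stroke tolerance, and the function name are all hypothetical.

```python
def rounds_to_catch_up(old_level, new_level, window, tolerance=1.0):
    """Rounds at the new level before the last-`window` average is within tolerance."""
    history = [old_level] * window
    rounds = 0
    while abs(sum(history[-window:]) / window - new_level) > tolerance:
        history.append(new_level)
        rounds += 1
    return rounds

for window in (5, 13, 20):
    print(window, rounds_to_catch_up(95, 85, window))
# 5 5
# 13 12
# 20 18
```

    In this simplified case, a 20-score average needs 18 rounds before it reflects a 10-stroke improvement, compared with 12 rounds for a 13-score average.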

    That’s why the optimal number of scores matters:

    • too few creates volatility,
    • too many creates dilution,
    • and somewhere in the middle lies the balance between responsiveness and stability.

    In our testing, that balance consistently landed around:

    13 scores.


    Why This Matters

    Most golfers instinctively think about handicaps as averages — and they’re not wrong.

    But averages only work well when:

    • the playing environment remains reasonably consistent,
    • enough rounds smooth randomness,
    • while not including so many that current ability becomes diluted.

    That’s especially true in contained environments like golf leagues, where:

    • golfers often play the same course,
    • rotate between familiar tees,
    • compete under similar conditions,
    • and maintain consistent frequency of play.

    In those settings, average-based handicapping can work surprisingly well — provided the number of scores used is optimized correctly.

    Our testing suggests that balance point is:

    13 scores.

    Not 3.
    Not 20.
    Not “best 8 of last 20.”

    About 13.


    Final Thought

    Golf handicapping has spent decades trying to estimate future performance from past scores.

    The interesting part is this:

    Even before AI enters the discussion, the data itself quietly points toward balance, trend, and prediction — not simply raw averaging.

    And in our testing, that balance consistently landed on:

    13 scores.

    Not because 13 is special, but because it appears to represent the point where:

    • randomness becomes smoothed,
    • trends still remain visible,
    • and the handicap best reflects who the golfer actually is right now.

    September 5, 2025

    Stu Healey, President

    Handicomp, Inc.