The Best Box Score

The Five-Year Verdict: Which Umpires Got Better, Which Got Worse, and Who Should Be Worried

Before ABS, we had five years of Statcast umpire data. Here's who was consistently elite, who was consistently below their peers, and what the pitch clock actually changed.


We have five years of Statcast-verified umpire performance data: 2021 through 2025, covering 12,449 games and roughly 486 umpire-seasons. The pitch clock arrived in 2023, and debates over umpire evaluation intensified. Now, with the Automated Ball-Strike (ABS) challenge system live in 2026, human umpiring enters a new chapter.

Before that chapter begins, here's the definitive look at the one that just ended.

The League-Wide Picture

Umpire accuracy has been remarkably stable across the five-year window:

| Season | Avg Accuracy | Best Ump | Worst Ump | Accuracy Range |
|--------|--------------|----------|-----------|----------------|
| 2021 | ~93.3% | — | — | ~89-96% |
| 2022 | ~93.4% | — | — | ~89-96% |
| 2023 | ~93.3% | — | — | ~89-96% |
| 2024 | ~93.5% | — | — | ~90-96% |
| 2025 | ~93.6% | James Jean (95.0%) | Roberto Ortiz (93.1%) | ~93-95% |

The league average hasn't moved meaningfully, and the pitch clock didn't make umpires more or less accurate in aggregate. What may have changed is the spread: the gap between the best and worst umpires appears slightly narrower in the post-clock era (2023-2025), which could reflect some standardization from the faster pace.

The 2025 Leaderboard

The best and worst umpires in our final pre-ABS season:

Top 5 (2025):

| Umpire | Accuracy | Games | Missed/Game |
|--------|----------|-------|-------------|
| James Jean | 95.0% | 23 | 7.6 |
| Derek Thomas | 95.0% | 25 | 7.5 |
| Junior Valentine | 94.8% | 23 | 8.1 |
| Mark Carlson | 94.6% | 5 | 7.8 |
| Shane Livensparger | 94.5% | 30 | 8.2 |

Bottom 5 (2025):

| Umpire | Accuracy | Games | Missed/Game |
|--------|----------|-------|-------------|
| D.J. Reyburn | 93.2% | 32 | 10.0 |
| Malachi Moore | 93.2% | 30 | 10.7 |
| Ben May | 93.1% | 31 | 10.4 |
| Jordan Baker | 93.1% | 32 | 10.3 |
| Roberto Ortiz | 93.1% | 28 | 10.5 |

The gap between the best and worst is roughly 2 percentage points — about 95% vs. 93%. That sounds small, but it translates to 2-3 extra missed calls per game for the worst umpires, compounded over a full season. At 30+ games behind the plate, those extra misses add up to real run value and real win probability impact.
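As a rough check on that arithmetic, the sketch below averages the missed-calls-per-game figures from the two 2025 tables and scales the difference to a 30-game workload. The variable names are illustrative, and the 30-game figure is just the round number used in the text, not a measured workload.

```python
# Missed calls per game, copied from the 2025 top-5 and bottom-5 tables.
top5_missed = [7.6, 7.5, 8.1, 7.8, 8.2]
bottom5_missed = [10.0, 10.7, 10.4, 10.3, 10.5]


def mean(values):
    """Plain arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Average gap in missed calls per game between the two groups.
extra_per_game = mean(bottom5_missed) - mean(top5_missed)

games = 30  # assumption: the round "30+ games" workload mentioned in the text
extra_per_season = extra_per_game * games

print(f"Extra misses per game: {extra_per_game:.2f}")
print(f"Extra misses over {games} games: {extra_per_season:.0f}")
```

On these ten umpires the gap comes out to about 2.5 extra misses per game, or roughly 76 over 30 games, which sits inside the 2-3 range quoted above. This is an unweighted average of group members, so it ignores differences in games worked.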

The Names That Keep Showing Up

Some umpires appear on our worst-calls leaderboards year after year. Not because they're unlucky — because they consistently miss more calls in bigger moments.

Brian O'Nora had the single worst call of 2025 (a 43.4% swing in win probability added, or WPA, against Jakob Marsee) and also appeared on the 2021 worst-calls list. Over five seasons, his name surfaces repeatedly in high-leverage miss compilations.

Laz Diaz appeared on both the 2021 and 2022 worst-calls top 10. His zone has been consistently wider than the rulebook across multiple seasons.

Tripp Gibson appeared on the 2024 worst-calls list and on at least one other season's list. Multiple high-WPA misses suggest a pattern, not an aberration.

On the positive side, some umpires demonstrate consistent elite performance:

Pat Hoberg has been widely recognized as one of the best umpires in baseball, and our data supports the reputation. His accuracy consistently places him in the top tier across multiple seasons, with an era-relative grade of A or above.

Shane Livensparger appeared in the top 5 in 2025 with 94.5% accuracy across a full 30-game workload. His consistency at high volume is what separates genuine elite performance from small-sample luck.

The Pitch Clock Effect

The pitch clock arrived in 2023. Our data covers two full seasons before it (2021-2022) and three after (2023-2025). Did it change umpire performance?

Aggregate accuracy: no meaningful change. The league average moved from roughly 93.3-93.4% to 93.3-93.6%. If the clock affected accuracy, the effect is smaller than year-to-year noise.

Variance: possibly compressed. The gap between the best and worst umpires appears slightly narrower in the post-clock era. Whether this reflects the pace change, general umpire development, roster turnover, or statistical noise is hard to isolate.

Handedness gap: persistent. The systematic difference in how umpires call the zone for left-handed vs. right-handed batters has not narrowed post-clock. This bias appears structural, not pace-dependent.
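The aggregate comparison above can be reproduced from the approximate season averages in the league-wide table. The sketch below is a simplification: it takes an unweighted mean of the rounded season figures rather than weighting by games or pitches, and it uses the largest single year-to-year move as a crude stand-in for "noise". Both choices are assumptions for illustration, not the method behind the published numbers.

```python
# Approximate league-average accuracy (%) by season, from the league-wide table.
season_avg = {2021: 93.3, 2022: 93.4, 2023: 93.3, 2024: 93.5, 2025: 93.6}
CLOCK_YEAR = 2023  # first season with the pitch clock

# Split seasons into pre-clock and post-clock groups.
pre = [acc for yr, acc in season_avg.items() if yr < CLOCK_YEAR]
post = [acc for yr, acc in season_avg.items() if yr >= CLOCK_YEAR]

pre_mean = sum(pre) / len(pre)
post_mean = sum(post) / len(post)
shift = post_mean - pre_mean

# Largest move between consecutive seasons, as a rough yardstick for noise.
years = sorted(season_avg)
max_step = max(abs(season_avg[b] - season_avg[a]) for a, b in zip(years, years[1:]))

print(f"Pre-clock mean:  {pre_mean:.2f}%")
print(f"Post-clock mean: {post_mean:.2f}%")
print(f"Shift: {shift:+.2f} points (largest yearly move: {max_step:.1f})")
```

With these rounded inputs the post-clock shift is a little over a tenth of a percentage point, smaller than the largest season-to-season move in the same table, which is consistent with the "no meaningful change" reading.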

What the Era-Relative Grades Show

Our grading system normalizes each umpire's accuracy to the season mean. An A+ in 2021 (when the league was slightly less accurate) represents the same relative performance as an A+ in 2025. This lets us compare across eras.

The distribution is designed to be bell-shaped:

  • A+ and F grades are rare (roughly 5-10% of umpires each)
  • Most umpires cluster in the B range (roughly average)
  • The tails reveal the genuine outliers
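One way to picture the normalization described above is a per-season z-score mapped to letter grades. The sketch below is hypothetical: the post only says accuracy is normalized to the season mean, so dividing by the season's standard deviation, the `GRADE_CUTS` thresholds, the inclusion of a C grade, and the sample accuracies are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical z-score cut points, chosen so that A+ and F each capture
# roughly the outer 5-10% of a bell-shaped distribution.
# These are NOT the site's actual thresholds.
GRADE_CUTS = [(1.5, "A+"), (0.75, "A"), (-0.75, "B"), (-1.1, "C"), (-1.5, "D")]


def era_relative_grades(accuracy_by_umpire):
    """Grade each umpire against the mean and spread of a single season."""
    values = list(accuracy_by_umpire.values())
    mu, sigma = mean(values), pstdev(values)
    grades = {}
    for umpire, acc in accuracy_by_umpire.items():
        # z-score: how many standard deviations above or below the season mean.
        z = (acc - mu) / sigma if sigma else 0.0
        # First cut point the z-score clears wins; anything below all cuts is F.
        grades[umpire] = next((g for cut, g in GRADE_CUTS if z >= cut), "F")
    return grades


# Made-up accuracies for one season, then the same field shifted up 0.3 points
# to mimic a season with a higher league average.
season_a = {"U1": 95.5, "U2": 93.5, "U3": 93.3, "U4": 93.1, "U5": 93.0, "U6": 91.0}
season_b = {name: acc + 0.3 for name, acc in season_a.items()}

grades_a = era_relative_grades(season_a)
grades_b = era_relative_grades(season_b)
print(grades_a)
print(grades_a == grades_b)
```

Shifting every umpire by the same amount leaves the grades unchanged, which is the property that makes grades comparable across seasons with slightly different league averages.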

What stands out across five seasons: the umpires who earn A+ grades tend to sustain them. Elite accuracy is a skill, not luck. Similarly, umpires who earn D or F grades tend to stay in the bottom tier. Below-average accuracy is also a persistent trait.

The middle (B range) is more volatile — umpires move in and out of the average band season to season. But the extremes are sticky.

The ABS Baseline

Everything in this five-year dataset is now historical. The 2026 ABS challenge system means that the worst calls — the ones at the top of our worst-calls leaderboards, the 3-2 count phantom strikeouts in the 9th inning — are now challengeable.

Our pre-ABS data provides the baseline for measuring ABS's impact:

  • How much does umpire accuracy improve when umpires know their calls can be overturned?
  • Does the handedness gap narrow when the outside corner to lefties can be challenged?
  • Do high-leverage misses decrease because players save challenges for the biggest moments?
  • Which umpires improve the most — and does improvement correlate with how badly they needed it?

We'll be answering these questions throughout the 2026 season, with five years of historical context to compare against.


Explore the full umpire leaderboard with era-relative grades on our umpire ratings page, or read about how our grading system works.