[Photo: Jacoby Ellsbury diving catch. Caption: Typical Ellsbury catch]

(I see it’s been a long time since my last baseball-related post. Here’s a long one to make up, so I’m good for another year, baseball-post-wise.)

Jacoby Ellsbury, the centerfielder for the Boston Red Sox, was just given the “Defensive Player of the Year” award, as voted by baseball fans in the “This Year In Baseball Awards.”  This is interesting because, statistically, Ellsbury in 2009 was actually one of the worst defensive players in all of baseball. Does this mean that baseball fans got it wrong, or did the statistics lie to us?

The answer isn’t immediately obvious, because it’s generally agreed that statistical analysis of baseball defense¹ still lags well behind most offensive and even pitching measurements. If a defender makes a catch, is that because everyone, including me, would have caught it? Was it a difficult catch, but one that most major-league players should have made? Or was it a really difficult catch that almost every major-leaguer would have flubbed? Fans tend to judge these by the spectacularity of the catch, and there’s no doubt that Ellsbury made his share of spectacular, diving, all-out catches in 2009. But did he make them harder than they were?

In this case, actually, defensive stats and a lot of astute observers agree. Ellsbury didn’t get “good jumps” on a lot of hits this year — he hesitated before running, and he may not have run to the best spot for a catch. As a result, he had to make spectacular catches, on hits that a better defender would have caught quite routinely.

Easy, hard, or making it harder

[Photo: J.D. Drew catch. Caption: Typical J.D. Drew catch]

An interesting contrast to Ellsbury is the guy to his left, J.D. Drew, who plays right field for the Red Sox. Drew makes few spectacular catches, rarely diving or getting mussed up. His catches almost always look routine and easy. But statistically, he was a much better fielder than Ellsbury in 2009. In UZR/150 games, Ellsbury was -18.3, while Drew was at +15.7, second-highest among right-fielders² in the majors last year. Again, many careful observers agree with the stats; Drew doesn’t have to make spectacular catches, because he instantly sees where the ball will go, moves to the right place immediately, and makes the catch easily. Casual fans don’t think of Drew as a high-quality defender, because he doesn’t seem to make difficult plays. In fact, he is making the difficult plays; he’s just making them look easy.

So UZR seems to agree fairly well with careful observers’ analysis of Ellsbury’s and Drew’s respective abilities. But that’s not really the interesting question. It’s mildly entertaining to say, “Yeah, baseball fans got it wrong”, but it’s much more interesting to ask what this tells us about the future. In other words: UZR seems to be a reasonable descriptive statistic. Is it a useful predictive statistic as well? For example, should Ellsbury play CF for the Sox next year? What are the chances that he’ll be a good centerfielder next year? What are the chances that Drew was just lucky this year, and next year he’ll be a lousy right fielder?

The numbers and the future

And here the waters get much more muddy. One puzzling point is that in 2008, Ellsbury was an excellent fielder, by UZR/150. He was pretty good at CF (+6.9) and superb in right field (+18.6, although in limited time: 36 games). Do players often show this kind of 20-odd point swing in UZR? And what does it tell us about that player’s future? (My answer, for those with tl;dr disease: Based on history, Ellsbury has about a 40% chance of being an average or better fielder next year, and a 13% chance of being good or very good.)

Let’s ask a number of questions about UZR/150 and its ability to predict future defense.

(1) How well does a player’s UZR/150 correlate, year to year? That is, if a player has a particular UZR/150 this year, how similar will his next year’s UZR/150 be?

(2) How many players have the kind of huge drop in UZR/150 that Ellsbury showed? What happened to their defense in the years following that drop?

(3) If a defender is “very bad” this year,³ how likely is it that he’ll be a decent or good defender next year?

I’ve scraped FanGraphs’ fielding ratings and dumped them into an SQLite database on my own computer, so that I can look at these questions.⁴ Here are my attempts at answers.
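
For the curious, here is a minimal sketch of how this kind of database and year-to-year pairing might look in Python with the standard sqlite3 module. The table name, column names, and sample rows are all made up for illustration (they are not the actual database or real players), and requiring both seasons to clear the games cutoff is an assumption.

```python
import sqlite3

# Hypothetical schema: one row per player, season, and position.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fielding ("
    "player TEXT, year INTEGER, pos TEXT, games INTEGER, uzr150 REAL)"
)

# Invented rows, purely to exercise the query.
rows = [
    ("Player A", 2007, "CF", 140, 12.0),
    ("Player A", 2008, "CF", 150, -9.5),
    ("Player B", 2007, "RF", 40, 20.0),   # too few games to qualify
    ("Player B", 2008, "RF", 120, 3.0),
    ("Player C", 2007, "LF", 100, -4.0),
    ("Player C", 2008, "LF", 90, 1.5),
]
conn.executemany("INSERT INTO fielding VALUES (?, ?, ?, ?, ?)", rows)


def season_pairs(conn, min_games=65):
    """Pair each qualifying season with the same player's next season
    at the same position. Both seasons must exceed min_games (assumed)."""
    query = """
        SELECT a.player, a.pos, a.year, a.uzr150, b.uzr150
        FROM fielding AS a
        JOIN fielding AS b
          ON b.player = a.player
         AND b.pos = a.pos
         AND b.year = a.year + 1
        WHERE a.games > ? AND b.games > ?
        ORDER BY a.player, a.year
    """
    return conn.execute(query, (min_games, min_games)).fetchall()


pairs = season_pairs(conn)
for player, pos, year, this_year, next_year in pairs:
    print(player, pos, year, this_year, next_year)
```

Each returned row holds one season's UZR/150 next to the following season's, which is the shape needed for all three questions.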

[Figure: Correlations between years for UZR/150]

(1) UZR/150 from one year does correlate with the next, but not very well. If we limit our analysis to outfielders who spent more than 65 games at a particular position in a year,⁵ and plot each year vs. the following year in a scatter graph, the R² is just 0.1823, and it only gets a little better (0.2505) if we limit it to players with at least 150 games at the position (see the figure). In fact, I tried all kinds of variations, and the only R² over 0.5 came when I limited the analysis to the very best and the very worst outfielders⁶ who played over 130 games at a position; there the R² with their next year was 0.5494.

So yes, there’s some correlation, on the bulk level.  But not much.  On an individual basis — which is, of course, what we’re interested in — you couldn’t be at all confident that next year’s UZR/150 will be very similar to this year’s.
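
As a rough illustration of what that R² means, here is a small Python sketch that computes the squared Pearson correlation between paired seasons. The function name and the sample numbers are invented for the example; the figures quoted above may well have come from a spreadsheet trendline rather than code like this.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Made-up values: UZR/150 in one year vs. the next for four players.
this_year = [1, 2, 3, 4]
next_year = [1, 3, 2, 4]
print(r_squared(this_year, next_year))
```

An R² of 0.18 says that only about 18% of the variation in next year's UZR/150 is accounted for by a straight-line fit to this year's value.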

(2) 20-point drops in UZR/150 aren’t unheard of, and players can bounce back from them. This kind of swing isn’t all that common, but it’s happened. I turned up 24 outfielders since 2002⁷ who had at least a 20-point change in UZR/150 from one year to the next. Of the 21 players with at least a 20-point drop, 7 of them stopped playing that position. Six of the rest had a drop in 2009, and some of those won’t be back. There were 12 who had a 20-point drop and continued at the position,⁸ and of these, at least half bounced back, at least temporarily:

  • Andruw Jones (from 34.7 in 2005 to 13.1 in 2006; then 22.2, then 0.2, and then out of the position)
  • Kenny Lofton (19.9 in 2005; -17 in 2006; 8.3 in 2007)
  • Corey Patterson (-11 in 2002; then 14.8, 33.8, 11.3, 14.2, 1, 0.7)
  • Willy Taveras (22.6 in 2006; then -7.1, -3, and back up to 14.1)
  • Jeff Francoeur (30.1 in 2005, 7.4 in 2006, 16.9 in 2007; but then -4.9, -19 in 2008 and 2009)
  • Juan Encarnacion (all over the place. From 2003: -11.4, 13.5, -11.1, 7.1, -26.7)
  • Reggie Sanders (12.9 in 2002; then -7.2, 4.3, and -9.2)
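
Spotting these swings is simple arithmetic on consecutive seasons. Here is a minimal Python sketch, using Lofton's numbers from the list above and Ellsbury's CF numbers from earlier in the post; the function name and the dictionary layout are just assumptions for the example.

```python
def find_swings(seasons, threshold=20.0):
    """Return (year_from, year_to, change) for each pair of consecutive
    seasons where UZR/150 moved by at least `threshold` points."""
    years = sorted(seasons)
    swings = []
    for prev, curr in zip(years, years[1:]):
        if curr != prev + 1:
            continue  # skip gaps; only compare back-to-back seasons
        change = seasons[curr] - seasons[prev]
        if abs(change) >= threshold:
            swings.append((prev, curr, round(change, 1)))
    return swings


# UZR/150 by year, as quoted in the post.
lofton = {2005: 19.9, 2006: -17.0, 2007: 8.3}
ellsbury_cf = {2008: 6.9, 2009: -18.3}

print(find_swings(lofton))
print(find_swings(ellsbury_cf))
```

A negative change is a drop and a positive one is a rebound, so Lofton shows up twice: once for the fall in 2006 and once for the bounce back in 2007.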

The complete table⁹ is here.

1st-year grade     n     Percent at least OK    Percent at least good
Very good        105     85.72                  65.72
Good             271     81.56                  44.29
OK               396     68.69                  29.8
Bad              229     54.14                  17.9
Very bad          70     38.57                  12.86

(3) Very good and very bad defenders tend to be consistently good or bad. Although the fine correlation just isn’t there, can we draw a more general conclusion? If we have a player who (based on UZR/150) is very good, good, just about average, bad, or very bad this year,¹⁰ what are his chances of being at least average, or of being at least good, next year? A summary of those chances, based on historical analysis of the 1071 players who qualified, is shown in the table above; a more detailed breakdown is here.

What we see is that a player who was very good one year has an 86% chance of being at least decent the following year, and a 66% chance of being good or very good. But a player who was very bad one year (for example, Jacoby Ellsbury this year) has about a 40% chance of being at least decent the following year, and a 13% chance of being good or very good. So, again, there is reasonable predictive value when we look in this rather coarse-grained way, but there’s a ton of year-to-year variability.
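
The grading and tallying behind a summary like this can be sketched in a few lines of Python. The cutoffs follow footnote 10, but how exact boundary values (say, a UZR/150 of exactly +5) are assigned is an assumption, as are the function names and the made-up sample pairs.

```python
from collections import defaultdict

# Grades ordered from worst to best, so list position works as a rank.
GRADES = ["very bad", "bad", "OK", "good", "very good"]


def grade(uzr150):
    """Map a UZR/150 value to a coarse grade (boundary handling assumed)."""
    if uzr150 > 15:
        return "very good"
    if uzr150 > 5:
        return "good"
    if uzr150 >= -5:
        return "OK"
    if uzr150 >= -15:
        return "bad"
    return "very bad"


def next_year_summary(pairs):
    """pairs: (UZR/150 this year, UZR/150 next year) tuples.
    Returns, per first-year grade, n and the percent of players who were
    at least OK, and at least good, the following year."""
    counts = defaultdict(lambda: {"n": 0, "ok": 0, "good": 0})
    for first, second in pairs:
        tally = counts[grade(first)]
        rank = GRADES.index(grade(second))
        tally["n"] += 1
        tally["ok"] += rank >= GRADES.index("OK")
        tally["good"] += rank >= GRADES.index("good")
    return {
        g: {
            "n": t["n"],
            "pct_ok": round(100 * t["ok"] / t["n"], 2),
            "pct_good": round(100 * t["good"] / t["n"], 2),
        }
        for g, t in counts.items()
    }


# Invented pairs: four "very bad" first years and what followed.
sample = [(-18.3, 2.0), (-20.0, -10.0), (-16.0, 8.0), (-25.0, -30.0)]
summary = next_year_summary(sample)
print(summary)
```

Run over real season pairs instead of the invented ones, the same tally would produce rows of the form shown in the table.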

I won’t show the data here, but increasing the minimum number of games played to 130 or 150 per year doesn’t help very much; the percentages remain surprisingly similar, although the number of players drops.

The bottom line

So what can we expect from Ellsbury next year (assuming he plays centerfield in 2010)? Well, looking at the history of outfielders with that kind of drop in UZR/150, maybe there’s around a 20-50% chance that he’ll bounce back to be a decent CF (from question 2, above). Looking at all players (question 3 above) there’s about a 40% chance that he’ll be decent, and a 13% chance that he’ll be good or very good next year.

Not great numbers, but that’s what we see.  My own suspicion is that Ellsbury will be a pretty good defender next year, but I wouldn’t put a lot of money on it.


  1. The most widely used is probably “UZR”, the “ultimate zone rating”; see here, here, and here for explanations. The other contender is the plus/minus rating system. UZR is freely available from the FanGraphs web site, while plus/minus requires a subscription to Bill James Online, so I’m only using UZR (actually UZR/150, which is UZR normalized to 150 games).
  2. Those who played more than 100 games in RF
  3. I.e. has a low UZR/150
  4. Not that other, better-qualified, people haven’t already asked the questions. But poking at data is how I try to understand it, so here it is.
  5. Which, not entirely coincidentally, includes Ellsbury’s 2008, when he played 66 games at CF
  6. UZR/150 of < -15 or > +15
  7. When UZR was introduced
  8. Several players had more than one swing year, so these numbers don’t add up all that nicely
  9. Reminder: this is for outfielders only, not all players
  10. I used UZR/150 cutoffs of > +15, +5 to +15, -5 to +5, -15 to -5, and < -15 for the different grades