Post by ericmvan on Sept 20, 2013 15:59:53 GMT -5
The Abreu thread quotes a scouting report that says Abreu will struggle against quality pitchers but feast on bad ones, and hence have plenty of MLB value.
That caught my eye, because for years I've regarded opponent quality splits as an underutilized tool, especially for projecting post-season play.
Truly elite pitchers tend to level the field, so that great hitters don't hit them much better than bad hitters do (what I call the "Enrique Wilson effect," after the utility infielder who hit .264 / .382 / .485 in 35 career PA against Pedro). You can use a pitcher's splits by batting order position as a quick-and-dirty proxy for that. A good staff with ordinary or big splits, 3 through 6 hitters versus 7 through 9, projects much less well in the post-season than a staff with small splits, since it will be facing lineups full of virtual 3 through 6 hitters. I've used that to predict the Sox' domination of the '04 Series, Melancon's struggles in the AL East, etc. (I'll run those splits before each post-season series, but I've already noticed that Lester has very flat splits, while Scherzer's are ordinary.)
Hitters are just as interesting. Vince Gennaro's award-winning presentation at this year's SABR conference included hitters' opponent quality splits in a new methodology for predicting batter-pitcher matchups (new to the public -- it's essentially a high-tech version of the model I used for the Sox, and the underlying logic is straightforward enough that I'm sure other teams have worked on it in private). His examples were Chris Davis (steep split, feasting on lousy pitching) versus Derek Jeter (flat split, hitting elite pitchers better than average but failing to exploit lousy ones).
So, what kind of opponent quality splits does Mike Napoli have? Does he have more or less value than he might seem to have, once we take them into consideration? Do the splits tell us anything about him as a player?
The Methodology.
NB: I'll follow this up late tonight with a geekage post that will go more deeply into the methodology, including various caveats / sources of noise. Please don't bother questioning me about the methodology until that goes up!
First, I grabbed the ERA- for the years 2006-2013 of every pitcher Napoli has faced in his career. I sorted them by ERA- and used common sense and my knowledge of how to divide single-season ERA- into #1 starter, #2 starter, etc. buckets, to come up with the groupings you'll see below.
I picked Paul Konerko as a first comp / baseline. He has a .369 wOBA from 2006-13, versus Napoli's .370, and strikes me as a guy with a similar reputation as a hitter. (He's also a converted catcher, although I only remembered that much later, since he made the conversion after his first two years in the minors.) For a second comp, I picked Prince Fielder, a better hitter from 2006-13 (.390 wOBA) but a guy we are likely to face in the post-season.
When I looked at the initial data, I saw some funkiness, and the reason soon became clear: unlike the other buckets, PA versus my supposed #1 pitchers (ERA- less than 85) were actually mostly against relievers -- closers and good set-up guys (it's much easier to post a low ERA coming out of the pen). So I divided that group into starters and relievers, classifying guys who have done both (Feliz, Bard, Duchscherer, etc.) by the role they actually filled against each of the hitters.
Finally, my resulting #1 starter group was mostly guys who seemed to be legitimate #1's, but about 30% were guys for whom that classification seemed to be sketchy at best. Dividing them by whether they had pitched 300 innings from 2006-2013 did an almost perfect job of matching my subjective take; all I had to do was move Roger Clemens and Matt Harvey from the extra, sketchy group (hereafter SS for Small Sample, instead of SP) to the real one. (I'll list the pitchers in the followup post.)
If you have a problem with this final tweak to the methodology, I have two responses: first, I have no doubt that I could come up with a rule that would include Clemens and Harvey objectively, but doing so wasn't worth my trouble; and second, if you still have a problem, please don't respond here -- go upstairs and complain to your mother.
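For readers who want to reproduce the groupings, here is a minimal sketch of the bucketing rules as described above. Several things in it are assumptions rather than anything stated in the post: the function name bucket, treating the 300-inning cutoff as "300 or more," treating the range edges as whole-number ERA- values, and the sample ERA- and innings figures in the usage lines, which are placeholders and not real stats.

```python
from bisect import bisect_right

# Lower edges of every bucket after the first, taken from the table labels.
CUTS = [85, 93, 100, 110, 130]
LABELS = ["< 85", "85-92", "93-99", "100-109", "110-129", "130+"]

# The two pitchers moved by hand from the small-sample group to the real one.
MANUAL_SP = {"Roger Clemens", "Matt Harvey"}

def bucket(name, era_minus, role, innings):
    """Return the bucket label for one pitcher.

    role is "SP" or "RP", meaning the role the pitcher filled against the hitter.
    innings is the pitcher's 2006-2013 total (cutoff assumed to be >= 300).
    """
    label = LABELS[bisect_right(CUTS, era_minus)]
    if label != "< 85":
        return label
    if role == "RP":
        return "< 85 RP"
    if innings >= 300 or name in MANUAL_SP:
        return "< 85 SP"
    return "< 85 SS"  # small-sample starter

# Placeholder inputs, for illustration only.
print(bucket("Pitcher A", 78, "RP", 450))
print(bucket("Pitcher B", 80, "SP", 120))
print(bucket("Matt Harvey", 80, "SP", 120))
print(bucket("Pitcher C", 104, "SP", 900))
```

The hand-maintained MANUAL_SP set mirrors the two manual moves described above instead of trying to invent an objective rule that would capture them.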
The Results.
Paul Konerko 2006-2013:
ERA-         PA    BA   OBP    SA   EqA
< 85 RP     534  .264  .354  .496  .290
< 85 SS      28  .423  .464  .769  .402
< 85 SP     422  .267  .329  .455  .269
85-92       654  .249  .324  .460  .268
93-99       616  .283  .357  .511  .295
100-109    1158  .293  .376  .461  .292
110-129    1048  .287  .378  .515  .305
130+        326  .343  .426  .632  .354
Prince Fielder:
ERA-         PA    BA   OBP    SA   EqA
< 85 RP     549  .259  .383  .412  .284
< 85 SS      51  .422  .490  .600  .376
< 85 SP     388  .272  .358  .447  .281
85-92       936  .267  .376  .477  .296
93-99       830  .302  .395  .581  .328
100-109    1216  .275  .379  .496  .302
110-129    1233  .285  .399  .553  .324
130+        423  .327  .411  .730  .369
These are as expected, I think. The slopes of the trendline through my six main buckets are .015 and .014 points of EqA per bucket, respectively, but this is largely driven by the feasting they've done against crap. The below-average pitchers are hit harder than average ones, although for these two guys, there's no difference between just above or just below average (essentially #3 versus #4 starters). Obviously, for all hitters taken together, there would be. And they both do less well against good pitchers.
Now, here's Mike Napoli.
ERA-         PA    BA   OBP    SA   EqA
< 85 RP     370  .188  .319  .295  .229
< 85 SS      23  .182  .217  .500  .228
< 85 SP     328  .267  .360  .488  .291
85-92       488  .251  .375  .474  .295
93-99       487  .259  .349  .528  .295
100-109     774  .280  .359  .521  .298
110-129     615  .265  .364  .556  .308
130+        226  .301  .385  .622  .333
Now, this is not normal. First of all, he's been terrible against good and elite relievers. But compared to his baseline against average pitchers, he's been remarkably good against aces and #2's, and he has benefited much less than expected from feasting on lousy pitchers. The slope of the trendline through my six buckets is just .007. He has been much better than Konerko vs. #1 and #2 starters, and better than Fielder versus #1 and just as good versus #2. IOW, this is a guy who can be expected to be a significantly better post-season player than a regular-season one. Against #1 and #2 starters, he has been an elite hitter.
What might be driving this? Three thoughts. First, as a catcher, he is very likely to have a better understanding of pitch sequencing and how pitchers work hitters than the average player does. Second, hitters definitely look at video of all their PA's versus a day's SP, but I doubt they do the same for all the good guys in the pen. Third, you accumulate many more PA's versus starters than you do versus relievers, so that video library is much less helpful for relievers. This would explain the starter / reliever split, and it also suggests that knowledge of how a pitcher works you may be something that helps flatten the opponent quality split. Why that might be true is something I need to think about.
Next, I'll test the looking-at-video hypothesis by breaking down Napoli's EqA by both opponent quality and the number of times he's faced each pitcher.
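The three slopes can be checked from the EqA columns in the tables. The sketch below assumes the trendline is a plain unweighted least-squares fit of EqA on bucket number (0 through 5, starting with < 85 SP and leaving out the RP and SS rows). That fitting choice is an assumption, since the post doesn't say how the line was drawn, but it does come out to .015, .014 and .007.

```python
def ols_slope(ys):
    """Least-squares slope of ys against evenly spaced x = 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

# EqA by bucket, best pitchers first:
# < 85 SP, 85-92, 93-99, 100-109, 110-129, 130+
EQA = {
    "Konerko": [.269, .268, .295, .292, .305, .354],
    "Fielder": [.281, .296, .328, .302, .324, .369],
    "Napoli":  [.291, .295, .295, .298, .308, .333],
}

SLOPES = {name: ols_slope(ys) for name, ys in EQA.items()}

for name, slope in SLOPES.items():
    print(f"{name}: {slope:.3f} EqA per bucket")
```

Because the fit is unweighted, the 130+ bucket counts as much as the far larger 100-109 bucket, which fits the observation that the steeper slopes are largely driven by feasting on the worst pitchers. Weighting by PA would be a reasonable alternative and would give somewhat different numbers.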
Finally, there may be a wealth of insight to be garnered from a thorough study of these splits. All three guys, for instance, have a lower BA but much higher walk rate against #2 starters than #1's. That makes sense if the #2's tend to pitch around / nibble against good hitters more than #1's do.