Post by ericmvan on Mar 31, 2021 18:44:31 GMT -5
This reminded me that I had written a whole nearly-finished argument for replacing the current grade system with 25 to 80, back in late October! And forgot about it.
Rereading it ... I also forgot that I used WAR to tweak Kiley McDaniel's definitions of each grade!
It's quite long, so here's the tl;dr version.
1) A system that forces you to put Bobby Dalbec and Tanner Houck either in the same bucket as Triston Casas, or in the same one as Jay Groome, is plainly not fine-grained enough to do its job.
2) The 20-80 scale, redefined via the actual distribution of roles in MLB from 2017 to 2019. The WAR/150 figures for 45 and above are empirical; they match Kiley's at 60, but the observed spread is 1.1 WAR per grade, not 1.0. Obviously, you'd stick with Kiley's in most cases. Kiley has 45 as 1.5 WAR and Bench at 1.0; I have to revise the WAR scale for 45 and below, and I just figured out what to do differently.
Gr | Kiley | Me (Objective) | WAR/150
80 | Top 1-2 Player | Best player in MLB |
75 | Top 2 to 3 player | Top 2 to 3 player | 7.3
70 | Top 5 player | Top 5 player | 6.2
65 | All-Star | Perennial All-Star | 5.1
60 | "Plus" | All-Star Candidate | 4.0
55 | Above-Average Regular | First-Division Regular | 2.9
50 | Average Regular | Average Regular | 1.8
45 | Platoon / Utility | Second Division Regular | 0.7
40 | Bench Player | Bench Player |
35 | Emergency Player | Up-and-Down Player |
30 | | Solid Org Player |
25 | | Ordinary Org Player |
20 | | Fringe Org Player |
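For what it's worth, the WAR/150 column for 45 through 75 is a straight line, so it can be written as a tiny function. This is only a sketch of the table above, not anything Kiley published; the function name is my own, and it refuses to extrapolate to grades that have no figure in the table.

```python
def war_per_150(grade: int) -> float:
    """Approximate WAR/150 for a grade, following the table's linear spread.

    Anchored at 60 = 4.0 WAR, with 1.1 WAR per 5-point grade step.
    Only grades 45 through 75 carry a figure in the table, so anything
    outside that range is rejected rather than extrapolated.
    """
    if grade % 5 != 0 or not 45 <= grade <= 75:
        raise ValueError("table only gives WAR/150 for grades 45 to 75, in steps of 5")
    steps_from_60 = (grade - 60) // 5
    return round(4.0 + 1.1 * steps_from_60, 1)


# Reproduce the WAR/150 column from 75 down to 45.
for g in range(75, 40, -5):
    print(g, war_per_150(g))
```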
And now the full argument and analysis.
(A good chunk of this belongs in the Meta forum, e.g., under a new thread called "Grading System," and mods can feel free to move it there once folks have had a chance to get a look at the stuff of general interest).
I would love to see you replace the 2 through 8 grade scale with a 20 through 80, which would make your evaluations directly comparable to BA, MLB, and FB (among others, no doubt). That would be hugely useful.
(BA and MLB grades are a realistic ceiling plus the risk of not reaching it, rather than an Overall Future Production (OFP), but you can convert their system to OFP fairly well by subtracting 5 points for High risk and 10 for Very High or Extreme).
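As a rough sketch, that conversion is just a lookup. The names here are mine, and I'm assuming any risk label other than the three mentioned gets no deduction, which is my guess rather than anything BA or MLB specify.

```python
# Deductions as described above. Labels not listed here are ASSUMED
# to need no adjustment; that part is a guess, not a published rule.
RISK_DEDUCTION = {"High": 5, "Very High": 10, "Extreme": 10}


def ceiling_to_ofp(ceiling_grade: int, risk: str) -> int:
    """Convert a ceiling-plus-risk grade to an approximate OFP grade."""
    return ceiling_grade - RISK_DEDUCTION.get(risk, 0)


print(ceiling_to_ofp(60, "High"))      # a 60 ceiling with High risk
print(ceiling_to_ofp(60, "Extreme"))   # a 60 ceiling with Extreme risk
```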
The short version of this argument: a grading system that puts Bobby Dalbec and Tanner Houck, who already have had significant MLB success and shown serious tools, in the same grade as Jay Groome and Also Ramirez (that's a typo, but it's also a good nickname!) is clearly too coarse-grained to do the job it's designed for. I just have to say those names and the distinction between the two pairs of prospects is clear and anything but negligible.
So, why not promote Dalbec and Houck to 5’s? Because now they have the same grade as Triston Casas, which is just as obviously wrong, and for the same reason. Our minds can easily handle distinctions among groups of prospects that have double the discriminatory power of a simple 8-grade system, which in practice is really 4 grades, given the rarity of 7’s here (has it ever happened?).
Let’s start by checking whether there’s a WAR-based statistical rationale for the current descriptions of all the grades, and perhaps fine-tune the definitions and/or improve the descriptions along the way. I did hitters first and will get to pitchers soon [or never!].
I used fWAR from 2017-2019 as my data. I chose Kiley McDaniel’s descriptions of the grades. What I discovered was that if you define 80 as 4 standard deviations above average, rather than the 3 used for tools, it matches beautifully. And this makes sense, because when you add up a set of tools, you get extra variance.
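To make the 4-SD idea concrete, here's a minimal sketch of how a standard-deviation score would map to a grade. It assumes the scale is linear with 50 sitting exactly at average, which is my reading of the definition and not something I've checked against the raw data; the function and parameter names are made up for illustration.

```python
import math


def grade_from_sd(z: float, sd_at_80: float = 4.0) -> int:
    """Map a score in standard deviations above average to a 5-point grade.

    ASSUMES a linear scale with 50 at average and 80 at `sd_at_80` SDs.
    With the default of 4, each SD is worth 7.5 grade points; with 3
    (the tools convention), each SD is worth 10.
    """
    points_per_sd = 30 / sd_at_80          # 80 sits 30 points above 50
    raw = 50 + points_per_sd * z
    return 5 * math.floor(raw / 5 + 0.5)   # nearest multiple of 5, ties up


print(grade_from_sd(4.0))                  # overall grade at 4 SD
print(grade_from_sd(3.0, sd_at_80=3.0))    # tool grade at 3 SD
```

Under that assumption, the same player who is 3 SD above average would be an 80 on a tools scale but lands lower on the overall scale, which is the "extra variance" point.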
75 is described as top 2-3 players. Mookie and Aaron Judge are the only guys who score 75—and they rank second and third.
70 is a top 5 player. Rendon, Yelich, and Bregman are 70.
But here’s the best part. 80 means best player in MLB, but Mike Trout’s an 85. And obviously an 85 score is meaningless for scouting … but just as obviously, if an 80 means "best guy in baseball, future HOF," you need an 85 for maybe the best player in MLB history.
A 65 is described as All-Star. There are 20 guys with that grade or better, which works out to 2.4 players per position (20 spread over 8.5 positions). So if you refine that to “Perennial All-Star,” this is spot-on. It’s 1.2 guys per position per league, which means that in an off year you’re likely a reserve rather than starting.
A 60 is described as “Plus,” which is just the definition of the grade and not really helpful! But there are 5 guys per position this good, one more than you can fit on an All-Star team. So this can be described as “All-Star Candidate.”
A 55 is “Above Average Regular.” There are 9.6 players this good per position. Call that 10 and it’s perfect; it’s the top third of regulars. Since “Above Average” technically describes the 14th best player, what we’re talking about here is “First Division Regular.” If you’ve got one of the 10 best players at a position, there’s little thought of upgrading him.
A 50 is Average, which you knew. There are 149 players this good, 5 per team and 18 per position. What we can now say is that the 11th through 13th best players at a position are “Solid Average,” the 14th through 16th are just plain Average, and the 17th and 18th are “Fringe Average.”
A 45 is described as “Platoon / Utility.” Really? What happened to all the below average regulars? This grade has to be “Second Division Regular.” These are the 19th through 25th best players at a position. Beyond that, you’re starting a guy who should be on the bench of a good team.
A 40 is a Bench Player, a 35 is better described as “Up-and-Down Player” than Kiley’s “Emergency Player,” and 30, 25, and 20 would be Solid, Ordinary, and Fringe Organizational players.
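The per-position and per-team numbers in the rundown above are simple division, sketched here so anyone can check them. The 8.5 positions and the player counts are the figures quoted in this post; the constant and function names are my own.

```python
POSITIONS = 8.5   # position count used in this post
TEAMS = 30        # MLB teams
LEAGUES = 2       # AL and NL


def per_position(player_count: float) -> float:
    """Players at or above a grade, averaged over the positions."""
    return player_count / POSITIONS


# 65 or better: 20 players
print(round(per_position(20), 1))             # per position
print(round(per_position(20) / LEAGUES, 1))   # per position, per league

# 50 or better: 149 players
print(round(per_position(149), 1))            # per position
print(round(149 / TEAMS, 1))                  # per team
```

The 149 figure divides out to about 17.5 per position, which the rundown rounds up to 18.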
Your 5 grade includes both 50 and 55 -- Casas is clearly the latter now -- and I believe it’s one of the reasons the 4.5 bucket, itself already the sole intermediate grade in the system, is bloated in range (I’ll get to the other in a moment).
By definition 50 is an average regular. That would mean you rank 11th to 20th among players at your position, with the reminder that not much separates the top and bottom of that group. It's "solid" to "fringe" first-division starters. It's a guy you're thinking about upgrading, especially in the lower stretch, but even then it's not quite perceived as a "hole." It's hard for me to say that Dalbec now projects to be the 21st to 30th best 3B in MLB, in the sense of median outcome, which is what putting a 45 on him means. He has the one weakness, on which he has continually improved, and several strengths.
And the same argument goes for Houck. Justin Masterson had a 4.3 bWAR season for the Indians throwing 99.6% fastballs and sliders, so the whole "needs a third pitch" argument is dubious. And Masterson's stuff doesn't grade out nearly as well as Houck's. Yeah, command is huge, but it's hard to regress Houck's projection to being in the bottom 1/3 of MLB pitchers after what we've seen. With both him and Dalbec I can certainly see an outcome that's just below average as the median -- Dalbec as the 17th or 18th best 3B -- but putting a 45 on a guy is saying he's more likely to be in the bottom 17% of MLB players than dead-average. That doesn't seem right for either of these guys. If you were forced to pick one, you'd pick the latter.
The current problem, of course, is that if you put a 5 on Dalbec and Houck, that puts them in the same bucket as Casas, and that's obviously wrong.
(As I said in the short version) ... A system that forces you to put Dalbec and Houck either in the same bucket as Casas, or in the same one as Groome, is plainly not fine-grained enough to do its job.
But if you had a 55 for Casas, then Dalbec and Houck can be 50 where they belong.
And the system will really come in useful when we need to differentiate the 65's from the 60's.