Post by Colin Loftin on Feb 4, 2017 0:57:11 GMT -5
So I've been playing around with some data, trying to use it to model this coming draft's prospects. I've been working on it for a little while and figured it'd be interesting to look back at recent drafts and see how it would have done. The 2012 draft class's NBA stats were not used in the sample data, since the model would otherwise be predicting those players from their own results. I've included a table with the results; if you want a sortable version, here's one in Google Sheets: docs.google.com/spreadsheets/d/1nMxiGYTlHLiFN2BhIN0dg71xGgic_v7-ZLlNcpt7vzE/edit?usp=sharing
I set the model up to predict VORP per minute, which is why VORP is included in the table. I chose VORP because it's essentially BPM scaled by minutes played: you can't have a high VORP without playing a significant number of minutes. While it's rare and regresses fairly quickly, it is possible to post a high BPM without having played many minutes. I could have accounted for this by setting a minimum-minutes threshold on the sample data, but I think there is value in including players who haven't played many minutes.
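To make the BPM vs. VORP point concrete, here's a rough sketch of the VORP formula as Basketball-Reference defines it (BPM above a -2.0 replacement level, scaled by the share of team minutes played). The two players and their numbers are made up for illustration, and the team-minutes figure assumes an 82-game season with no overtime.

```python
REPLACEMENT_BPM = -2.0          # replacement level in Basketball-Reference's VORP definition
TEAM_MINUTES = 82 * 48 * 5      # assumption: full 82-game season, no overtime


def vorp(bpm, minutes, team_minutes=TEAM_MINUTES, team_games=82):
    """VORP = (BPM - replacement) * share of available minutes * team_games / 82."""
    share_of_minutes = minutes / (team_minutes / 5)
    return (bpm - REPLACEMENT_BPM) * share_of_minutes * team_games / 82


# Hypothetical players, not real data.
small_sample = vorp(bpm=6.0, minutes=200)    # great BPM, barely played
heavy_minutes = vorp(bpm=2.0, minutes=2800)  # decent BPM, played a lot

print(f"high BPM, 200 min:   {small_sample:.2f}")
print(f"solid BPM, 2800 min: {heavy_minutes:.2f}")
```

The low-minute player ends up with much less VORP despite the better BPM, which is the effect described above.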
The DX Rank column is where DraftExpress had the player ranked in their final pre-draft rankings. The two Model Rank columns are the ranks from the model output. The DX Incl. version takes the DraftExpress rankings into account as a sort of scouting proxy. The Stats only version ignores the rankings and uses only stats from the players' college days. All the college and NBA stats were pulled from Basketball-Reference.
One obvious note: there are no international players included. It's been more difficult to get consistent stats for foreign players, and they'll likely need to be incorporated differently (weighting seasons in a different way, etc.). I've been looking into the stats at DraftExpress, as they seem to have a more complete set of both college and international stats.
On to some observations...
First the good! AD being #1 is a good look. He was by far the highest-rated player in the draft due to his high 2pt%, free throw rate, and rebound rate, all as a 19-year-old. I don't know how much credit can be taken for getting this one right, but hey, it's just as important, if not more, to hit on the obvious ones as it is to get a steal later on.
Draymond and Jae being rated so highly is excellent, especially relative to both their DraftExpress rank and their actual draft slots. They're somewhat similar prospects as they were both college seniors entering the draft, and both posted pretty good steal rates over their college careers.
Middleton being as high as he was surprised me. While he wasn't a standout in any one category, he did shoot a high rate of threes at a decent percentage and was a solid all-around player in most other categories.
Alright, on to the bad. The first and biggest swing and miss was Lillard. Just an enormous failure. He's actually accumulated the most VORP in the 2012 draft class, surpassing even AD (though with the benefit of almost 3000 extra minutes). The ranking-agnostic version pegged him at 27, while including the rankings only bumped him up to 14. I think the likely reason is that I haven't yet set up the model to handle positions. So while his season-weighted FG% and rebounding numbers would rate well for a guard, they rank pretty low once big men are included. He was also an older prospect from a small school who played weak schedules. And while the track record for those guys (I'm looking at you, Jimmer) isn't great, putting him at 27 is a gigantic miss.
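For what it's worth, one way the position issue could be handled is to standardize each stat within a position group before it goes into the model. This is only a sketch of that idea, not what the model currently does: the players, the "G"/"B" position labels, and the rebound rates are all made-up toy values, and `zscores` is a hypothetical helper.

```python
from statistics import mean, pstdev

# Hypothetical toy values for illustration only; not real player data.
prospects = [
    {"name": "Guard A", "pos": "G", "reb_rate": 8.0},
    {"name": "Guard B", "pos": "G", "reb_rate": 5.0},
    {"name": "Guard C", "pos": "G", "reb_rate": 6.0},
    {"name": "Big A", "pos": "B", "reb_rate": 18.0},
    {"name": "Big B", "pos": "B", "reb_rate": 14.0},
    {"name": "Big C", "pos": "B", "reb_rate": 16.0},
]


def zscores(rows, stat, by_position):
    """Z-score a stat against the whole pool or within each position group."""
    groups = {}
    for row in rows:
        key = row["pos"] if by_position else "all"
        groups.setdefault(key, []).append(row[stat])
    scores = {}
    for row in rows:
        values = groups[row["pos"] if by_position else "all"]
        spread = pstdev(values)
        scores[row["name"]] = (row[stat] - mean(values)) / spread if spread else 0.0
    return scores


pooled = zscores(prospects, "reb_rate", by_position=False)
within = zscores(prospects, "reb_rate", by_position=True)
print(f"Guard A pooled: {pooled['Guard A']:+.2f}, within guards: {within['Guard A']:+.2f}")
```

In this toy pool the best-rebounding guard grades out below average against everyone, but above average against other guards.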
Lillard was really the only big miss in terms of rating a player far too low. On the flip side, having Tony Wroten and Kris Joseph in the top 12 when guys like Barnes and Henson are ranked in the 20s would have led to some definite regrets, especially if Barnes can show he's capable of being the man in Dallas. Thomas Robinson is another bust who was rated too highly, though at least it was good to have him a bit lower than his pre-draft ranking.
I was a little surprised to see MKG has accrued so little VORP. He definitely hasn't played nearly as many minutes as some of the top guys, but with his proficiency in rebounds, steals, and blocks I'd have figured he would have produced more. However, his OBPM is downright putrid so I guess it makes sense.
If not for Lillard it would've been a roaring success, but overall it was still a pretty decent result: it returned some pretty good value in the first 6 picks and found the two biggest steals in the draft.
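A quick way to sanity-check the "first 6 picks" part is to add up the VORP of each ranking's top 6. Here's a rough sketch; the rows are copied from the results table (only the players who land in some column's top 6), while `top_n_vorp` and the column labels are just names made up for this example.

```python
# (player, VORP, DX rank, model rank with DX, model rank stats only)
rows = [
    ("Anthony Davis", 16.3, 1, 1, 1),
    ("Michael Kidd-Gilchrist", 1.3, 4, 2, 2),
    ("Jared Sullinger", 4.1, 20, 5, 3),
    ("Draymond Green", 15.0, 26, 8, 4),
    ("Jae Crowder", 6.6, 45, 16, 5),
    ("Bradley Beal", 4.7, 3, 4, 6),
    ("Maurice Harkless", 2.6, 19, 6, 8),
    ("Thomas Robinson", -1.2, 2, 3, 13),
    ("Harrison Barnes", 3.1, 5, 18, 20),
    ("Damian Lillard", 16.5, 6, 14, 27),
]

# Tuple index of each ranking column in `rows`.
COLUMNS = {"DX Rank": 2, "Model (DX Incl.)": 3, "Model (Stats only)": 4}


def top_n_vorp(rows, col, n=6):
    """Total VORP of the players ranked n or better in the given column."""
    return round(sum(row[1] for row in rows if row[col] <= n), 1)


totals = {label: top_n_vorp(rows, idx) for label, idx in COLUMNS.items()}
for label, total in totals.items():
    print(f"{label}: {total}")
```

It's a blunt measure since it only looks at six names per ranking and ignores how many minutes each guy has played, so take it as a rough check and not a real evaluation.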
Player | VORP | DX Rank | Model Rank (DX Incl.) | Model Rank (Stats only) |
Anthony Davis | 16.3 | 1 | 1 | 1 |
Michael Kidd-Gilchrist | 1.3 | 4 | 2 | 2 |
Jared Sullinger | 4.1 | 20 | 5 | 3 |
Draymond Green | 15 | 26 | 8 | 4 |
Jae Crowder | 6.6 | 45 | 16 | 5 |
Bradley Beal | 4.7 | 3 | 4 | 6 |
Tony Wroten | -1.6 | 30 | 17 | 7 |
Maurice Harkless | 2.6 | 19 | 6 | 8 |
Will Barton | 1.1 | 29 | 9 | 9 |
Andre Drummond | 7.2 | 9 | 7 | 10 |
Khris Middleton | 4.1 | 49 | 32 | 11 |
Kris Joseph | -0.1 | 57 | 23 | 12 |
Thomas Robinson | -1.2 | 2 | 3 | 13 |
Terrence Jones | 3.1 | 17 | 11 | 14 |
Jeremy Lamb | 0.6 | 13 | 10 | 15 |
Quincy Miller | -0.5 | 31 | 28 | 16 |
Perry Jones | -1.1 | 18 | 13 | 17 |
Royce White | 0 | 21 | 21 | 18 |
Marquis Teague | -1.3 | 25 | 29 | 19 |
Harrison Barnes | 3.1 | 5 | 18 | 20 |
Dion Waiters | -1.5 | 7 | 12 | 21 |
Jeff Taylor | -1.2 | 28 | 30 | 22 |
Jared Cunningham | -0.6 | 42 | 37 | 23 |
Doron Lamb | -1.2 | 36 | 34 | 24 |
John Henson | 3.5 | 11 | 20 | 25 |
Tyshawn Taylor | -0.8 | 35 | 42 | 26 |
Damian Lillard | 16.5 | 6 | 14 | 27 |
Arnett Moultrie | -0.2 | 15 | 15 | 28 |
Fab Melo | 0 | 24 | 25 | 29 |
Kendall Marshall | -2.8 | 16 | 22 | 30 |
Quincy Acy | 0.5 | 37 | 39 | 31 |
Miles Plumlee | 0.6 | 34 | 38 | 32 |
Kim English | -0.2 | 38 | 43 | 33 |
Kyle O'Quinn | 2.7 | 44 | 41 | 34 |
Meyers Leonard | -0.1 | 12 | 19 | 35 |
Robbie Hummel | -0.3 | 56 | 44 | 36 |
Darius Miller | -0.2 | 39 | 47 | 37 |
Austin Rivers | -2.1 | 8 | 24 | 38 |
Darius Johnson-Odom | -0.1 | 50 | 33 | 39 |
John Jenkins | -0.7 | 32 | 40 | 40 |
Andrew Nicholson | -3.2 | 22 | 27 | 41 |
Bernard James | 0.2 | 33 | 35 | 42 |
Terrence Ross | 2.3 | 14 | 26 | 43 |
Orlando Johnson | -0.5 | 41 | 46 | 44 |
Mike Scott | -0.4 | 47 | 45 | 45 |
Festus Ezeli | 0.6 | 27 | 36 | 46 |
Tyler Zeller | 0.8 | 10 | 31 | 47 |
Kevin Murphy | -0.2 | 52 | 48 | 48 |