Sunday, June 7, 2009

Do-It-Yourself - Understanding Performance Evaluation

Later this evening I will be posting the Leaderboards from our 2009 Division I Performance Ratings, so I thought I’d use this opportunity to discuss the ratings in a little more detail. At Diamond Futures, our Performance Ratings form the cornerstone of all of our analysis. To forecast what will happen in the future, we first need a solid understanding of the current state. Performance Ratings are the tool that we use to assess that current state.

Essentially, we have determined that there are five statistical factors of performance that correlate strongly with future success as a player moves up the chain. They do not all correlate equally, but each contributes to our ‘one number’ assessment of past performance. With them, we have demonstrated that we can correlate current/past performance with future performance at a coefficient of just under .50. For some of you that means absolutely nothing, and for others the first reaction may be “that isn’t extremely high”, but the reality is that this level of correlation against future performance blows away anything else currently available. More importantly, we haven’t just used this method for a couple of years; it has been a continual refinement process that now stretches more than a decade and has been tested on tens of thousands of data points. And the beauty is that it is nearly as accurate using college data as it is using minor league data, in spite of the wood vs. aluminum bat issues.
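For readers to whom a correlation coefficient means nothing, here is a minimal sketch of how the standard Pearson coefficient is computed. This is only the textbook formula, not our rating method; the function name `pearson_r` is made up for illustration, and no real player data is used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Textbook Pearson correlation between two equal-length sequences.

    Illustrative only: xs could be past-performance ratings and ys the
    later performance of the same players (an assumption, not real data).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Co-movement of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Spread of each series around its own mean.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")

    return cov / sqrt(var_x * var_y)
```

The result always falls between -1 and 1. As a rough guide, squaring the coefficient gives the share of variance a straight-line fit would account for, so a coefficient just under .50 corresponds to a little under a quarter of the variation in future performance.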

Pitchers and hitters each have five characteristics of their statistical profile that we look at. Age vs. level of competition is a component of both. For hitters, the other four are 1) Ability to hit for power, measured by a proprietary power calculation; 2) Ability to hit for average, using the standard batting average calculation; 3) Ability to exercise good strike zone control and judgment, measured by the propensity to both strike out and walk; and 4) Speed, which is again measured by a proprietary calculation based on readily available stats. For pitchers, the four additional characteristics are 1) Dominance, measured by strikeout rates; 2) Control, measured by the rate at which walks are issued; 3) HR rate, which is merely a measure of how often HRs are allowed; and 4) Stamina, measured by the average outing length.

Each of these factors is calculated using data that is normalized for both parks and levels of play. While we work with them in units of standard deviation, we will always express them in a fashion similar to the 20-80 range that is used in scouting profiles. If you see a Control rating of 50 for a pitcher, this means that his normalized Control performance is roughly minor league average; 50 is the mean. A score of 80 or a score of 20 roughly equates to three standard deviations above or below the mean.

Once we have all of the individual components, we combine them into a single numerical calculation that typically ranges between 2.00 on the high side and negative 3.00 on the low side. Players who are generally considered to have strong chances as prospects have a positive overall score.
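To make the 20-80 scale concrete, here is a minimal sketch of the conversion implied by the description above: 50 is the mean, and 20 and 80 sit roughly three standard deviations away, which works out to about 10 points per standard deviation. The function names, the `higher_is_better` flag, and the clipping at 20 and 80 are assumptions made for illustration; the actual normalization for parks and levels, and the way the components are weighted into the overall score, are not shown here.

```python
def z_score(value, mean, std_dev, higher_is_better=True):
    """Express a stat in standard deviations from the comparison-group mean.

    `mean` and `std_dev` are assumed to be already normalized for park and
    level. For stats where a lower number is better (walks issued, HRs
    allowed), the sign is flipped so that a positive result is always good.
    """
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    z = (value - mean) / std_dev
    return z if higher_is_better else -z


def to_scouting_scale(z, low=20.0, high=80.0):
    """Map a z-score onto a 20-80 style scale.

    Assumes 50 = mean and 10 points per standard deviation, so that
    +3 SD lands on 80 and -3 SD lands on 20. Values beyond that are
    clipped (an assumption; the source only says 'roughly').
    """
    score = 50.0 + 10.0 * z
    return max(low, min(high, score))
```

For example, a pitcher whose walk rate is one standard deviation lower than the mean would get `z_score(..., higher_is_better=False)` of 1.0, and `to_scouting_scale(1.0)` puts his Control at 60 on this sketch of the scale.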

I have tried to keep this as simple as possible. Our goal is not to bury the reader in statistics, but to shine a little light on how we use statistics to develop very real, very positive results. While the very nature of what we do requires us to constantly express things in numerical terms, you will find that, whenever feasible, we try not to cloud the picture with heavy statistical analysis. The best way to understand what we do is to actually see the results, and the first results you will see are the Division I Performance Ratings. If there are things that require more explanation, feel free to email us or leave a comment in the ‘Mailbag’.
