So I figured that the actual calculations behind batting eye and selectivity would divide people into two groups:
1) Those who weren’t that interested
2) Those who were and would google signal detection theory
But I’ve always meant to do a little write-up, and I figure now is as good a time as any since people have been asking.
Let’s look back at the theory write-up I did earlier. There we saw that we need to find two measures. The first is the distance between the two distributions (batting eye) and the second is our criterion point, where we decide whether to swing or not (selectivity).
Thanks to the wonders of science we can actually calculate both of these using only two pieces of information.
- Hit Rate (% of strikes swung at)
- False Alarm Rate (% of balls swung at)
So let’s create two fictional batters. The first batter we’ll call MS and the second batter we’ll call JU.
We’ve observed a year’s worth of two-strike pitches and found that MS has a hit rate of 84% and a false alarm rate of 16%, while JU has a hit rate of 98% and a false alarm rate of 50%.
In order to calculate each batter’s batting eye score, what we want to do is find the point on the normal curve (the z-score) that corresponds to each of these percentages. We do this by running each percentage through the inverse cumulative distribution function (icdf) of the normal distribution.
Since we’re going to assume that both distributions are normal with a variance of 1, we can use the standard normal distribution (mean 0, variance 1) for this, which is pretty easy in Excel. All you have to do is type “=NORMSINV(x)”, where x is the probability, to get this number.
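If you don’t have Excel handy, the same lookup can be sketched in Python. This is just my rough equivalent of NORMSINV, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `icdf` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def icdf(p):
    """Return the z-score below which a fraction p of the standard normal lies."""
    # inv_cdf only accepts probabilities strictly between 0 and 1.
    return STANDARD_NORMAL.inv_cdf(p)

# The four percentages used for the two fictional batters.
for p in (0.16, 0.50, 0.84, 0.98):
    print(f"icdf({p:.2f}) = {icdf(p):+.3f}")
```

The values come out close to -1, 0, 1 and 2, which is why those round numbers work well for a hand calculation.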
From here it’s pretty simple. Batting eye is defined as the distance between the two distributions. We can find this by taking the icdf of the batter’s hit rate and subtracting the icdf of the batter’s false alarm rate from it.
Let’s take a moment and do this for the two batters.
MS’s batting eye = icdf(.84) – icdf(.16), which is roughly 1 – (-1), so MS’s batting eye is very high with a value of about 2.
JU’s batting eye = icdf(.98) – icdf(.5), which is roughly 2 – (0), so JU’s batting eye is also about 2.
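Here is the same batting eye calculation as a small Python sketch. The function name `batting_eye` is my own invention for illustration, and the rates are the fictional ones from above. One thing to watch for: a rate of exactly 0% or 100% has no finite z-score, so `inv_cdf` will raise an error on those inputs.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def batting_eye(hit_rate, false_alarm_rate):
    """Distance between the two distributions: icdf(hit rate) - icdf(false alarm rate)."""
    z_hit = STANDARD_NORMAL.inv_cdf(hit_rate)
    z_false_alarm = STANDARD_NORMAL.inv_cdf(false_alarm_rate)
    return z_hit - z_false_alarm

# Fictional batters from the text: (hit rate, false alarm rate).
batters = {"MS": (0.84, 0.16), "JU": (0.98, 0.50)}

for name, (hit_rate, false_alarm_rate) in batters.items():
    print(f"{name}: batting eye = {batting_eye(hit_rate, false_alarm_rate):.2f}")
```

Without the rounding, the two scores are not exactly 2, but both land within a few hundredths of it.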
They have the exact same ability to judge balls from strikes! So what explains their completely different patterns of swinging?
Well, batting eye is only half the story. The other half lies in selectivity. Selectivity is a measure of how biased you are in making your errors. When it’s 0, your false alarm and miss rates are identical. A negative value means you lean toward swinging (more false alarms than misses), and a positive value means you lean toward taking. Selectivity is calculated using the same information.
All you have to do to calculate selectivity is take the two icdf scores, add them together, and multiply by -.5. Basically you are taking the negative average of the two scores.
So for MS we have -.5 * (icdf(.84) + icdf(.16)), which equals 0.
For JU we have -.5 * (icdf(.98) + icdf(.5)), which equals about -1.
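The selectivity arithmetic can be sketched the same way in Python. Again, the function name `selectivity` is just something I picked for the example, and the inputs are the fictional rates for MS and JU.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def selectivity(hit_rate, false_alarm_rate):
    """Negative average of the two icdf scores."""
    z_hit = STANDARD_NORMAL.inv_cdf(hit_rate)
    z_false_alarm = STANDARD_NORMAL.inv_cdf(false_alarm_rate)
    return -0.5 * (z_hit + z_false_alarm)

# Fictional batters from the text: (hit rate, false alarm rate).
batters = {"MS": (0.84, 0.16), "JU": (0.98, 0.50)}

for name, (hit_rate, false_alarm_rate) in batters.items():
    print(f"{name}: selectivity = {selectivity(hit_rate, false_alarm_rate):.2f}")
```

MS comes out at essentially 0 and JU at a little under -1.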
So we see that while MS is not biased and will have an equal rate of false alarms and misses, JU is hugely biased and will have a false alarm rate of over 20 times his miss rate (he swings at 50% of balls but takes only 2% of strikes).
While these two sets of numbers have been changed a little, they are actually almost identical to those put up by two extremely different players: Marco Scutaro and Juan Uribe.
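As a sanity check, you can also run the calculation backwards and ask what swing rates a given batting eye and selectivity imply. This step isn’t something I spelled out above; it just inverts the two formulas under the same assumption of two normal distributions with a variance of 1. The function name `predicted_rates` is made up for this sketch, and the inputs are the rounded scores of 2, 0 and -1.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def predicted_rates(batting_eye, selectivity):
    """Return (hit rate, false alarm rate) implied by the two scores.

    Inverts: batting_eye = z_hit - z_fa and selectivity = -0.5 * (z_hit + z_fa).
    """
    z_hit = batting_eye / 2 - selectivity
    z_false_alarm = -batting_eye / 2 - selectivity
    return STANDARD_NORMAL.cdf(z_hit), STANDARD_NORMAL.cdf(z_false_alarm)

# Rounded scores from the worked example: (batting eye, selectivity).
for name, eye, sel in (("MS", 2.0, 0.0), ("JU", 2.0, -1.0)):
    hit_rate, false_alarm_rate = predicted_rates(eye, sel)
    miss_rate = 1.0 - hit_rate  # strikes the batter takes
    print(f"{name}: hit {hit_rate:.1%}, false alarm {false_alarm_rate:.1%}, "
          f"miss {miss_rate:.1%}, false alarms per miss {false_alarm_rate / miss_rate:.1f}")
```

With a selectivity of 0 the miss rate and false alarm rate match, and with a selectivity of -1 the false alarm rate is more than 20 times the miss rate, even though the batting eye is the same in both cases.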
I wrote a post about Scutaro here demonstrating how successful his approach has been on two strike counts. Uribe on the other hand is basically done once you get two strikes on him – despite the fact that they have similar abilities to tell balls from strikes.
Uribe is so biased against missing a strike (which on two strikes would result in a strikeout looking) that he becomes almost completely ineffective with two strikes on him, because he’s willing to swing at so many bad pitches. Scutaro, on the other hand, might take a few more called strike 3s, but he leverages his phenomenal batting eye much more effectively and suffers very little decline on two strikes.
I believe this will be an invaluable tool in evaluating batters and could even be useful in teaching them to perform better.
I hope that this write-up answers the questions of those interested. As always please feel free to contact me with any questions or constructive feedback. Sorry once again that it took so long for me to write up the actual calculations. It’s been a crazy month and this stuff appears online in other places (just not in a baseball context) and so I had it on the back burner.