So I’m finally going to describe a little bit of the science behind these measures. When I first thought about this stuff ages ago, I did a search to see if anyone had applied it to pitch recognition. A really old academic article saying it might be possible came up, but there was nothing very recent. I recently did another search and found a handful of articles along similar lines. A couple of those can be found at http://www.insidethebook.com/ee/index.php/site/comments/measuring_plate_discipline/ and http://www.baseballprospectus.com/article.php?articleid=9744, so other very smart people have thought along the same lines, although I’m not sure anyone has investigated it to the extent I hope to.
I’ve been throwing out the terms Batting Eye and Selectivity without truly explaining them or where they come from. The basic premise of signal detection theory is that pitches fall into two distributions. One is the noise distribution, which distracts us from the other distribution, the signal. In baseball terms the noise is the bad pitches to swing at (i.e. balls, unless you are Pablo Sandoval). The signal is the pitches you want to swing at (strikes). Since the two types of pitches can be difficult to tell apart, we are sometimes going to swing at bad pitches or not swing at good pitches. Signal detection theory measures our ability to tell the good from the bad (sensitivity, or in our terms Batting Eye) and it also tells us in which direction we are biased (Selectivity).
This diagram is one of the basic diagrams which sums up signal detection theory. In this diagram the distribution of balls is the blue distribution to our left. The distribution of strikes is the red distribution to the right. Notice that they overlap. The overlap is the region where a ball and a strike look alike to the batter. We are just going to hold the sigma value constant at 1 for simplicity, which means that d’ is basically the distance between the centers of the two distributions. This becomes our Batting Eye score. As the two distributions get further apart (higher Batting Eye) they overlap less and it becomes easier to tell balls from strikes. This means that batters will not have to trade off as much between the two kinds of mistake when deciding how much to swing.
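To make the "distance between the two distributions" idea concrete, here is a minimal sketch of the textbook equal-variance calculation of d’. It assumes Batting Eye is computed the standard way, from the share of strikes swung at (the hit rate) and the share of balls swung at (the false-alarm rate). The function name `batting_eye` and the example rates are mine for illustration, not taken from the actual calculations behind these ratings.

```python
from statistics import NormalDist


def batting_eye(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance d' (sigma = 1): z(hit rate) - z(false-alarm rate).

    hit_rate         = swings at strikes / all strikes seen
    false_alarm_rate = swings at balls   / all balls seen
    Both rates must be strictly between 0 and 1, otherwise inv_cdf raises.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical batter: swings at 70% of strikes and 30% of balls.
print(round(batting_eye(0.70, 0.30), 2))  # 1.05

# A batter who swings at strikes and balls equally often cannot tell them apart.
print(round(batting_eye(0.50, 0.50), 2))  # 0.0
```

The further apart the two swing rates are, the bigger the gap between the two curves and the higher the score.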
Selectivity is related to (but is not actually) the Beta which we see in the diagram. The line drawn represents a criterion: the batter makes a judgement as to where on the x axis each pitch falls (based on how much evidence there is that it will be a strike versus a ball) and swings if it falls to the right of the line. If the batter’s bias (Selectivity) is exactly 0 then the criterion line would be a little to the right of where it is, exactly where the two curves cross. This will ensure an equal rate of both types of error (swinging at balls and not swinging at strikes).
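Since the exact formula for Selectivity isn’t spelled out here, the sketch below uses the standard textbook bias measures instead: the criterion location c, and Beta (the likelihood ratio at the criterion). Treat it as an illustration of how a bias score of this kind behaves, not as the actual Selectivity calculation. The function name and example rates are hypothetical.

```python
from math import exp
from statistics import NormalDist


def criterion_and_beta(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Return (c, beta) for the equal-variance model.

    c    = -(z(H) + z(FA)) / 2   -> 0 when the criterion sits where the curves cross
    beta = exp(d' * c)           -> 1 when the criterion sits where the curves cross
    Positive c (beta > 1) means a conservative batter who swings less often.
    """
    z = NormalDist().inv_cdf
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa
    c = -0.5 * (z_hit + z_fa)
    return c, exp(d_prime * c)


# Hypothetical patient batter: swings at 60% of strikes but only 20% of balls.
c, beta = criterion_and_beta(0.60, 0.20)
print(c > 0, beta > 1)  # True True

# Hypothetical unbiased batter: error rates are equal (30% misses, 30% false alarms).
c, beta = criterion_and_beta(0.70, 0.30)
print(abs(c) < 1e-9, abs(beta - 1) < 1e-9)  # True True
```

The sign convention matches the one used in this post: a positive score means the batter leans toward not swinging.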
In fact there are four different outcomes to each pitch – 2 positive and 2 negative.
Successes are when you swing at a strike (a “Hit,” which is somewhat confusing in baseball) or when you don’t swing at a ball (a “Correct Rejection”). The two errors are to swing at a ball (a “False Alarm”) or to not swing at a strike (a “Miss”).
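As a rough sketch of how the four outcomes could be tallied from pitch data, the snippet below sorts each pitch by whether it was in the zone and whether the batter swung, using the usual signal detection names (the two errors are a "false alarm" for swinging at a ball and a "miss" for taking a strike). The `(in_zone, swung)` pair format and the sample pitches are made up for illustration and are not the layout of the real pitch data.

```python
from collections import Counter


def classify(in_zone: bool, swung: bool) -> str:
    """Name the signal detection outcome for a single pitch."""
    if in_zone:
        return "hit" if swung else "miss"
    return "false_alarm" if swung else "correct_rejection"


def swing_rates(pitches: list[tuple[bool, bool]]) -> tuple[float, float]:
    """Return (hit rate, false-alarm rate) from (in_zone, swung) pairs."""
    counts = Counter(classify(in_zone, swung) for in_zone, swung in pitches)
    strikes = counts["hit"] + counts["miss"]
    balls = counts["false_alarm"] + counts["correct_rejection"]
    return counts["hit"] / strikes, counts["false_alarm"] / balls


# Made-up sample: 4 strikes (3 swings) and 4 balls (1 swing).
sample = (
    [(True, True)] * 3 + [(True, False)]
    + [(False, True)] + [(False, False)] * 3
)
print(swing_rates(sample))  # (0.75, 0.25)
```

These two rates are the only inputs the sensitivity and bias measures need.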
Note here that this is the reason why I think the true measure of a batter’s eye is their Batting Eye on 2 strike counts. Before two strikes batters may choose to have a smaller zone of pitches where they are more productive which they want to swing at. Not swinging at borderline strikes may not really be a ‘miss’ in these cases – it may actually just be their higher level judgement at play.
A positive Selectivity rating means that you are more concerned with ‘false alarms’ than ‘misses.’ You are willing to not swing at some pitches which are in the strike zone – in order to wait for a pitch you can be more productive on. The higher it is the more you are trying to avoid swinging at bad pitches. On the other hand a negative rating means that you are biased in the other direction. You want to swing at the maximum number of good pitches without much regard for bad pitches.
As we’ve seen, OPS seems to go up the higher your Selectivity rating is, within batter. I believe this is because when a batter is more selective they are forgoing pitcher’s strikes and waiting for pitches in a location they know they can drive.
I hope I’ve explained this theory in a way which isn’t too complex and which anyone interested can understand. I also hope I’ve explained some of the conclusions which I’ve come to based on it to this point. I intend to do more work with this because I feel like it has a lot to tell us about a number of things: which approaches at the plate are most productive, when each one should be used, when each one is being used and how each hitter views the art of hitting. I also think it can be flipped to tell us which pitchers are pitching most strategically, which pitches are the hardest to see properly and more. Pitch f/x is an extremely rich data set and I believe this is one of the tools which will help us interpret the data and the game.