The Mysterious NFL Passer Rating

So WTF is the NFL's passer rating, really?
It's one of those stats that announcers and talking heads on TV like to use to make a point, and it usually gets mentioned only when it's very high or very low. I knew only a few things about passer ratings (and I think that's the case for most people): greater values are "better", it ranges from 0 to 158.3, and sometimes a quarterback would be better off throwing all of their attempts into the ground, or so the announcer would say.

This prompted several questions. How is this rating calculated? Are greater values always better? Why is the max 158.3? And what is with that weird scenario I hear about where a QB would be better off with all incompletions? I'll hit on a few of the key concepts and background of passer ratings, but if you want to read more on this, the NFL goes into more detail on its passer rating description site.

Quick Background of Passer Ratings

The point of this passer rating system is to compare passers' performances from season to season. The way the NFL describes the metric implies that comparing passer ratings for individual games, or partway through games, isn't really a great idea. Why will become clear as we dive into how the rating is calculated. Passer rating is also meant to evaluate purely passing and not necessarily quarterback play. There are other aspects of playing quarterback that go beyond pure passing, and the passer rating is not meant to capture those. The NFL states as much in their description.

ESPN has developed a metric called Total QBR that has gained some traction since its debut in 2011. Its value is supposed to be a representative measure of the quarterback's total performance (hence the name, I guess). The exact formula for calculating Total QBR isn't published, but there is some information out there in the wild (i.e. the internet). Total QBR mainly relies on the Expected Points Added (EPA) metric. We won't dive into that...at least not yet.

The NFL's passer rating uses four common stats as the basis for its components: 1) completion percent, 2) yards per attempt, 3) interception percent, 4) touchdown percent. Each component can contribute a minimum of 0 and a maximum of 39.583* to the total rating. Why is this important to point out? As an example, once a passer throws an interception (which guarantees less than the max for that component), it is impossible to get a "perfect" rating, meaning super-exceptional performances in other components cannot make up for the interception. Don't worry, of course I'll provide an interactive graphic to mess around with this.
*The NFL's passer rating description site says the max value for each component is 2.375. However, at the end of the passer rating calculation the four values are added together, divided by 6, and multiplied by 100 to rescale the rating value. Saying the max value is 39.583 just means I rescaled first to break the rating into its addend components. This doesn't change anything, trust me.
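
If you'd rather not take my word for it, here's a quick Python sketch of that rescaling. The function name to_points and the sample raw values are mine (purely for illustration). The only things assumed are what's stated above: each raw component is limited to the range 0 to 2.375, and the total is the sum divided by 6 and multiplied by 100.

```python
import math

MAX_RAW = 2.375  # per-component cap from the NFL's description


def to_points(raw):
    """Clamp a raw component to [0, MAX_RAW] and rescale it to rating points."""
    clamped = min(max(raw, 0.0), MAX_RAW)
    return clamped / 6 * 100


# Hypothetical raw component values, just for illustration.
raw_components = [1.75, 1.125, 1.0, 1.875]

# NFL order of operations: add, then divide by 6, then multiply by 100.
sum_then_rescale = sum(raw_components) / 6 * 100
# My order: rescale each component first, then add.
rescale_then_sum = sum(to_points(c) for c in raw_components)

print(math.isclose(sum_then_rescale, rescale_then_sum))  # True
print(round(to_points(MAX_RAW), 3))      # 39.583
print(round(4 * to_points(MAX_RAW), 1))  # 158.3
```

Four maxed-out components at 39.583 each is where the 158.3 ceiling comes from.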

Breaking Down Passer Rating by Components

How are these minimums and maximums for each component determined? The NFL website I mentioned above references some of the better season-long stats as a basis for each component. For example, per the NFL's passer rating description site and validated using Pro Football Reference, the record for yards per attempt (YPA) for a single season (min 100 att.) is 11.17 by Tommy O'Connell (Cleveland, 1957). For whatever reason, the passer rating creators decided that a YPA of 12.5 should max out that component. I assume it's not a completely arbitrary choice and that there's more behind it. Anyway, any completion that increases a passer's YPA past 12.5 will no longer increase the rating. What about the minimum? Any YPA below a baseline of 3.0 (again, no reason given for why the threshold is 3.0) will zero out that component.
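
To make the floor and ceiling concrete, here's a minimal Python sketch of just the YPA component. The function name is made up, and the 0.25 multiplier is my inference from the numbers above (it's the slope that takes a YPA of 3.0 to 0 and a YPA of 12.5 to the 2.375 cap), so treat it as an assumption rather than a quote from the NFL.

```python
def ypa_component(yards_per_attempt):
    """Raw YPA component: 0 at a YPA of 3.0, capped at 2.375 for a YPA of 12.5."""
    raw = (yards_per_attempt - 3.0) * 0.25  # 0.25 is an inferred multiplier
    return min(max(raw, 0.0), 2.375)


# Below the floor, at the floor, O'Connell's record, at the cap, past the cap.
for ypa in (2.0, 3.0, 11.17, 12.5, 15.0):
    print(ypa, round(ypa_component(ypa), 4))
```

Even O'Connell's record 11.17 falls short of the cap, while 12.5 and 15.0 land on the same 2.375.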

Use the interactive visualization below by customizing passer stats to see how each component contributes to the total passer rating. Test out what happens when a passer attempts a bunch of passes with no completions, no yards, but no interceptions.

In the scenario described above, a passer with no completions, no yards, and no interceptions not only gets a non-zero rating but also maxes out the interception component. So we see that while three of the passer rating components reward passers for very good performances, the component based on interception percentage rewards passers for not being bad: as long as a passer has yet to throw an interception, that component is maxed out.
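
If the interactive isn't handy, the same scenario can be sketched in a few lines of Python. This assumes the standard published form of the formula as I understand it (the multipliers of 5, 0.25, 20 and 25 are not quoted anywhere in this post), and both stat lines are made up.

```python
def clamp(raw):
    """Limit a raw component to the allowed 0 to 2.375 range."""
    return min(max(raw, 0.0), 2.375)


def passer_rating(att, comp, yards, td, ints):
    """Passer rating from raw counting stats (assumed standard multipliers)."""
    a = clamp((comp / att - 0.3) * 5)       # completion percent
    b = clamp((yards / att - 3.0) * 0.25)   # yards per attempt
    c = clamp(td / att * 20)                # touchdown percent
    d = clamp(2.375 - ints / att * 25)      # interception percent
    return (a + b + c + d) / 6 * 100


# Hypothetical line 1: 20 passes thrown into the ground.
print(round(passer_rating(20, 0, 0, 0, 0), 1))   # 39.6
# Hypothetical line 2: 8 of 20 for 60 yards, no touchdowns, 3 interceptions.
print(round(passer_rating(20, 8, 60, 0, 3), 1))  # 8.3
```

The second passer actually completed some passes, but the three interceptions zero out the one component that the first passer gets for free. That's the announcer's "better off throwing it into the ground" case.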

We've seen how passer ratings can be decomposed into four components. Is each of these components created equal? That is to say, does each component tend to contribute the same to overall passer ratings? In short, they used to, but not so much anymore. Thanks again to Pro Football Reference's Play Index tool, I pulled three data sets:

Top 100 Season-Long Passer Ratings (Since 1978, min 300 attempts)
2019 Passer Ratings (through week 12, min 100 attempts)
1978 Passer Ratings (min 100 attempts)

Let's see if there are any patterns within the passer rating component values. Use the interactive visual below to take a closer look at the three data sets. Dragging a box and double clicking will zoom in. Another double click will reset the scatterplot.

It's quite clear that a pattern emerges when looking at more modern passer ratings. The completion and interception percentage components contribute to total passer ratings much more than the others, but this is not the case for the data from 1978. What causes this disparity? Over a long enough span of games the min and max restrictions won't play a role in the total rating. The gap in component values instead comes from the "baselines" mentioned earlier in this post. A baseline value serves as the starting point from which a passer's component value increases. In the YPA example above, 3.0 is the baseline. For completion percent it's 30%. Essentially, as the NFL passing game modernized, completion percentages increased across the board. A quarterback completing under 60% of his passes is well below the norm now, but that wasn't the case 30 or 40 years ago. At the time of the passer rating's implementation the baselines put each component on a similar enough level, but now that is far from the case.
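
Here's a rough Python sketch of how the baselines drive that gap. The two stat lines are hypothetical (they are not pulled from the data sets above): one uses a 50% completion rate and the other a 65% completion rate. The multipliers are the standard published ones as I understand them, so treat them as an assumption.

```python
def component_points(comp_pct, ypa, td_pct, int_pct):
    """Rating points contributed by each component (rates given as fractions)."""
    def points(raw):
        # Clamp to the 0 to 2.375 range, then rescale to rating points.
        return min(max(raw, 0.0), 2.375) / 6 * 100

    return {
        "completion": points((comp_pct - 0.30) * 5),
        "yards": points((ypa - 3.0) * 0.25),
        "touchdown": points(td_pct * 20),
        "interception": points(2.375 - int_pct * 25),
    }


# Both lines are hypothetical, chosen only to illustrate the effect.
older_style = component_points(comp_pct=0.50, ypa=7.0, td_pct=0.05, int_pct=0.055)
modern_style = component_points(comp_pct=0.65, ypa=7.5, td_pct=0.05, int_pct=0.02)

# Print each component's points for the older-style and modern-style lines.
for name in older_style:
    print(f"{name:13s} {older_style[name]:6.1f} {modern_style[name]:6.1f}")
```

In the older-style line all four components land at about the same value, while in the modern-style line the completion and interception components pull well ahead of the other two.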

So What Do We Make of Passer Rating?

As with any stat (and this isn't just in sports), a singular value does not tell a complete story. The goal of the NFL's passer rating is to piece together more parts of that story purely as a passer. On its own, however, is the passer rating a useful metric? It has its strengths in that it is simple to calculate and tries to provide one number with which to make comparisons. However, it should be used in the right context. We saw some instances that show how using passer rating on individual games (or even partway through games) is not a great idea. Additionally, the rating's greatest deficiency is how two components now seem to contribute more to the passer rating than they were meant to. My opinion: it's not useless and can be generally useful, but it should be taken with a rather large grain of salt.

That ends it for this post. I hope everyone now has a deeper understanding of the widely used passer rating metric. As always, feel free to reach out with any feedback via email or through twitter!

