Friday, December 5, 2008

Simplifying the QB Rating System

This graph compares the QB Rating with the sum of just its Completions and Interceptions components; the Yards per Attempt and Touchdowns components are ignored.  The data show a correlation of 0.970.  Interestingly, the Yards per Attempt and Touchdowns components, when added together, have a slight negative correlation (-0.17) with the QB Rating.  In a previous post, I showed that the correlation with the QB Rating was 0.956 for the Completions component alone and 0.965 for the Interceptions component alone.  What this shows, at least to me, is that we could simplify the QB Rating to include only 2 components: Completion Rate and Interception Rate.
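For anyone who wants to reproduce this kind of comparison on their own data, here is a minimal sketch of a correlation calculation. It assumes the figures quoted above are ordinary Pearson correlations, which this post does not state explicitly; the function name `pearson` and the toy numbers are made up for illustration and are not NFL data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy values only (not real ratings): a perfectly linear relationship.
two_component_sum = [1.0, 2.0, 3.0]
full_rating = [2.0, 4.0, 6.0]
print(pearson(two_component_sum, full_rating))
```

In practice, `xs` would hold the Completions plus Interceptions component sum for each season (or each player) and `ys` the full QB Rating for the same rows.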

The QB Rating formula is as follows:

Q = (J+K+L+M)*100/6

where,
Q = QB Rating
J = max(min(C, 2.375), 0)
K = max(min(Y, 2.375), 0)
L = max(min(T, 2.375), 0)
M = max(min(I, 2.375), 0)

and where,
C = ([Completions/Attempts]*100 - 30)/20
Y = ([Yards/Attempts] - 3)/4
T = [Touchdowns/Attempts]*20
I = (2.375 - [Interceptions/Attempts]*25)
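As a rough sketch, the formula above can be written as a short Python function. The names `clamp` and `qb_rating` and the sample stat line are hypothetical, chosen only for illustration; the arithmetic follows the definitions of J, K, L, M and C, Y, T, I given above.

```python
def clamp(x, lo=0.0, hi=2.375):
    """Apply the max(min(x, 2.375), 0) limit used for each component."""
    return max(min(x, hi), lo)


def qb_rating(completions, attempts, yards, touchdowns, interceptions):
    """QB Rating Q = (J + K + L + M) * 100 / 6, as defined in the post."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    c = (completions / attempts * 100 - 30) / 20
    y = (yards / attempts - 3) / 4
    t = touchdowns / attempts * 20
    i = 2.375 - interceptions / attempts * 25
    j, k, l, m = clamp(c), clamp(y), clamp(t), clamp(i)
    return (j + k + l + m) * 100 / 6


# Hypothetical stat line: 20 of 30 for 250 yards, 2 TDs, 1 INT.
print(round(qb_rating(20, 30, 250, 2, 1), 1))
```

When all four components hit the 2.375 limit, the result is 9.5 * 100 / 6, which is where the familiar maximum of about 158.3 comes from.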

Here's how we will go about simplifying the QB Rating calculation:

As discussed above, we can eliminate the K and L terms, since the J and M terms have an almost perfect positive correlation with the QB Rating.

We get, 

Q ≈ k*(J + M)*100/6

where,
k = non-zero scalar

Let's further simplify the equation by doing the following:

Let's remove the max and min limits on each term (in my opinion, these limits are arbitrary, and in annual league average data they haven't come into play since 1943).

And, by removing the approximation (taking k = 1) and by substituting back for the J and M terms, we get

Q*6/100 = ([Completions/Attempts]*100 - 30)/20 + 2.375 - [Interceptions/Attempts]*25

Re-arranging terms, we get

(Q*6/100 - 2.375 + 1.5)/5 = [Completions/Attempts] - [Interceptions/Attempts]*5

or, treating the right-hand side as the new rating and generalizing the factor of 5 to b,

Q(new) = C/A - b*I/A

where,

Q(new) = New QB Rating
C/A = Completions/Attempts
I/A = Interceptions/Attempts
b = factor to be determined (in my preliminary research, it appears that b would be a number close to 3)
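Here is a minimal sketch of the new rating in Python, together with a numerical check of the re-arrangement. The function names and the sample stat line are hypothetical, and the default `b=3.0` is only the preliminary value mentioned above, not a settled constant. The check confirms that, with the max/min limits removed, the sum J + M shifted by its constants and divided by 5 equals the new rating with b = 5.

```python
def new_qb_rating(completions, attempts, interceptions, b=3.0):
    """Q(new) = C/A - b*I/A; b=3.0 is an assumed, preliminary default."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return completions / attempts - b * interceptions / attempts


def unclamped_j_plus_m(completions, attempts, interceptions):
    """Sum of the Completions and Interceptions terms without max/min limits."""
    j = (completions / attempts * 100 - 30) / 20
    m = 2.375 - interceptions / attempts * 25
    return j + m


# Hypothetical stat line: 20 completions and 1 interception in 30 attempts.
comp, att, ints = 20, 30, 1
print(f"new rating with b=3: {new_qb_rating(comp, att, ints):.3f}")

# Re-arranged form: (J + M - 2.375 + 1.5) / 5 should match C/A - 5*I/A.
rearranged = (unclamped_j_plus_m(comp, att, ints) - 2.375 + 1.5) / 5
direct = new_qb_rating(comp, att, ints, b=5.0)
print(abs(rearranged - direct) < 1e-9)
```

Because C/A can never exceed 1 and the interception term is never negative, the new rating tops out at 1.0 (100%) for any b >= 0.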

So, there we have it: a (much) simplified new QB Rating System.  It looks only at completion rate and interception rate, with no fancy mathematical gyrations.  It is easy to understand, easy to explain, not arbitrary, simple to calculate, and it also has the amazing beauty of never exceeding 100.0%!  It still does have the issue of not being comparable over time (i.e., players in the 1940s versus players in the 1990s).  I will address this issue in an upcoming post.

For simplicity, I'd like to call this new rating system CMI, short for Completions Minus Interceptions (sort of like the OPS statistic in baseball: on-base plus slugging).

8 comments:

Anonymous said...

Your statistic is going to favor (not punish) those QBs who are simply told "don't blow it". It is much easier to be efficient if you have a great defense and a good running game. Those QBs may end up with a decent rating in your system, but they won't have impressive numbers and probably aren't even the leaders of their teams. You should consider factoring in something for volume (total yards or % of the team's total yards). You don't want the Garcias and the Penningtons beating out Brees and Manning every year... do you? My two cents... out!

Anonymous said...

My biggest question is the relevance one. Simplifying the NFL's formula is good, but is there any gain in output quality?

Your metric correlates well with the NFL formula over the years, meaning that it captures the way the game has changed in a similar fashion. It will fail at rating across eras, as it doesn't match up with the important factors that have stayed the same. Similarly, it might not be that good at intra-year comparisons.

If it doesn't capture the meaningful portions of the game that have stayed the same, does it say much about relative value inside a given year?

KiranR said...

Dan,

You have a good point. My point in this article, though, is that if the NFL isn't going to change the current Passer Rating system, then why not use something simpler, such as my suggestion? It has the same flaws, but does the same thing. It has a 98%+ correlation with the current stat. The new stat has the elegance of being simple, accurate (in terms of the current Passer Rating system), easy to explain, and has the beauty of never being greater than 100%.

I am in the process of further refining it, and coming up with an efficient way of using such a stat to compare QBs over time.

Thanks for the comments.....

Cheers,
Kiran

Anonymous said...

This has been something I have been fascinated with for a couple years now, how to simplify the QB rating. I couldn't really figure something out so bravo to you. I am impressed and jealous.
But I thought of something different. What if you compared a QB's stats to winning a game? Well, that is what I did. So my rating is more about the QB increasing the team's chances of winning a game. My buddy has a blog and I am going to post in the next week.

KiranR said...

It's difficult, at best, to assign win probability solely to how a quarterback performs. There are so many elements of the game that the quarterback does not control, such as, the running game, defense, place-kicking, punting, and kick-off/punt returns. That being said, there are elements that a quarterback can control that do affect the outcome of the game, such as interceptions.

There are a few sites out there trying to develop quarterback rating systems that are associated with win probability. I, however, will concentrate on looking at quarterbacks independent of win probabilities.

Anonymous said...

I understand that there are other factors in winning a football game. The number I got is something you can describe as a quarterback putting his team in the best position to win a game.

It's really amazing how strong some of the graphs are in determining the outcome of a game. I've used game stats from the past 10 years and found that TDs, turnovers, yards/play, sacks and completion percentage all have correlations of .970 and above with winning a game.

So all I did was average out the formulas for each stat and that shows if a QB is putting his team in a position to win or lose.

I just thought it's hard to describe what the QB rating actually means and this formula actually had something easy to explain.

I really want to keep going back more years to refine my formula, but it takes a while to get the data.

KiranR said...

I'd be very interested in learning more about your formula. Let me know when you publish it.

Anonymous said...

Alright it's posted, tell me what you think.
http://kotitescorner.blogspot.com/2009/03/its-time-for-new-qb-rating.html