Saturday, January 31, 2009

Super Bowl Quarterbacks

Super Bowl XLIII is tomorrow.  So, I thought it might be a good exercise to review quarterback performances in each of the past 42 Super Bowls.  The first table below shows how each of the starting quarterbacks did.  Here are a few observations.

Quarterbacks in the Super Bowl make lots of mistakes.  Look at the quarterbacks on the losing team.  Until Super Bowl XXV between the Giants and Bills, every starting quarterback on a losing team had thrown at least one interception.  As a matter of fact, there have been only 4 times - Super Bowl XXV between the NY Giants and Buffalo Bills, Super Bowl XXXIV between the St Louis Rams and Tennessee Titans, Super Bowl XXXVIII between the New England Patriots and Carolina Panthers and last year's Super Bowl XLII between the NY Giants and New England Patriots, where the losing team's quarterback did not throw an interception.  When you think about each of these four games, one could make the argument that they all rank as the best played, most exciting and closest Super Bowls of all time.  

Starting QBs on the losing teams have thrown a total of 83 interceptions in the 42 games.  This compares to only 22 for starting QBs on the winning teams.  Don't throw an interception, and you give your team an excellent chance of winning the game.  Conversely, throw a pick, and you significantly reduce your team's chances of winning.  

Nine times the losing team's QB threw 3 picks; 4 times they threw 4 picks, and in Super Bowl XXXVII between the Tampa Bay Buccaneers and Oakland Raiders, Rich Gannon threw 5 picks.  That's 14 times in 42 games where a QB threw 3 or more INTs.  There's a 1/3 chance that tomorrow one of the two QBs will throw at least 3 picks!  

Three times the winning team's QB has thrown 2 or more INTs.  In Super Bowl XIV, Pittsburgh's Terry Bradshaw threw 3; in Super Bowl XVII, Washington's Joe Theismann threw 2; and in Super Bowl XL, Ben Roethlisberger threw 2.  Note that Johnny Unitas threw 2 INTs in Super Bowl V as a back-up in a win over Dallas.

There have been 9 occasions where the winning team's QB threw 3 or more TDs: 4 times they threw 3, 3 times they threw 4, once they threw 5 (San Francisco's Joe Montana in Super Bowl XXIV), and once they threw 6 (San Francisco's Steve Young in Super Bowl XXIX against the San Diego Chargers).  It appears that not throwing INTs is probably more important than throwing TDs.  

The classic example of this is Super Bowl III, where the NY Jets' Joe Namath guaranteed a victory against the Baltimore Colts.  He delivered.  He was named the game's MVP.  Most people don't realize that he didn't throw a single TD in that game.  Most importantly, he didn't throw an INT either.  

The table below shows (in my opinion) the 10 best and 10 worst performances by a starting QB over the past 42 Super Bowls.  It's difficult to compare QBs over time.  This is because the league average for the most commonly used measure of a QB's passing performance, the NFL Passer Rating system, has been increasing over time (see here for the details).  

I have come up with a measure, CMI (Completions Minus Interceptions, calculated as Completion % - 3 * Interception %), which I believe is a better measure of a QB's passing performance.  However, this measure has the same problem as the Passer Rating formula, in that its league average has also been increasing over time (see here).  One way to adjust for this trend is to relate each QB's performance in a Super Bowl to the overall league average for that year.  Then, one can compare that particular measure across Super Bowls.  If I had game-by-game data, I would not only relate the measure to the mean, but would also take into account the standard deviation.  However, in the absence of game-by-game data going back 42 years, the measure relative to the mean shall suffice.  Also, I eliminated from consideration any QB who attempted fewer than 15 passes during the game.
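As a rough sketch of the CMI arithmetic described above, here it is in Python, applied to two stat lines quoted elsewhere in this post.  The function names are my own, and expressing "relative to the league average" as a simple difference is an assumption, since the exact adjustment isn't spelled out here.

```python
def cmi(completions, attempts, interceptions):
    """Completions Minus Interceptions: completion % minus 3 x interception %."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    completion_pct = 100.0 * completions / attempts
    interception_pct = 100.0 * interceptions / attempts
    return completion_pct - 3.0 * interception_pct


def cmi_vs_league(game_cmi, league_avg_cmi):
    """Game CMI relative to that season's league average.

    Assumption: 'relative to the mean' is taken as a plain difference.
    """
    return game_cmi - league_avg_cmi


# Phil Simms, Super Bowl XXI: 22 of 25, no INTs.
simms_cmi = cmi(22, 25, 0)
# Craig Morton, Super Bowl XII: 4 of 15, 4 INTs.
morton_cmi = cmi(4, 15, 4)

print(f"Simms CMI:  {simms_cmi:.1f}")
print(f"Morton CMI: {morton_cmi:.1f}")
```

To compare across eras, each value would then be passed through cmi_vs_league with that season's league-average CMI, which is not reproduced here.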

With the technical stuff all out of the way, the single best performance by a QB in a Super Bowl is Phil Simms for the NY Giants in Super Bowl XXI, when he went a near-perfect 22 of 25 with no INTs.  Interestingly, his QB Rating for that game was 150.9, close to the perfect rating of 158.3.  Steve Young's 6 TD effort doesn't show up in the Top 10, although Joe Montana's 5 TD effort does.

At the other end of the spectrum, Craig Morton's 4 for 15 effort with 4 INTs in Super Bowl XII is the absolute worst performance by a QB in a Super Bowl by any measure.  He also has the dubious distinction of showing up twice on the 10 Worst Performances list.  Ben Roethlisberger, Pittsburgh's QB in tomorrow's Super Bowl XLIII, is the only QB on the 10 Worst list whose team actually won the game - Super Bowl XL against Seattle.


Wednesday, January 28, 2009

Super Bowl Squares - Part 2

This post is in response to a comment made regarding my previous post on the same subject.  The suggestion was made that I only consider games since 1994 since that is when the two-point conversion was introduced.  The new distributions are shown in the charts below.  There is also a chart at the bottom that compares the previous chart with the changes to the probabilities.

The distributions did change.  Some of it, I'm sure, is due to the introduction of the two-point conversion.  Some of it is random chance.  And the uncertainty has increased.  The # of games over which this data is analyzed is 3,884, a reduction from the 9,509 in the previous sample.  All else being equal, and they are rarely if ever equal, the reduction in the sample size alone creates a larger uncertainty in the estimated probabilities (technically speaking, if the sample size decreases by a factor of n, then the standard error of the estimate increases by a factor of the square root of n).
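To put a rough number on that, here is a small sketch using the usual standard error of an estimated proportion, sqrt(p(1-p)/n), which shrinks as the sample grows.  The 2.94% probability is the 4-3 permutation figure from the original Super Bowl Squares post, used purely as an illustration; the function name is mine.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent games."""
    return math.sqrt(p * (1.0 - p) / n)


N_SUPER_BOWL_ERA = 9509   # games in the original sample
N_SINCE_1994 = 3884       # games in the two-point-conversion sample
p = 0.0294                # illustrative square probability

se_full = standard_error(p, N_SUPER_BOWL_ERA)
se_recent = standard_error(p, N_SINCE_1994)
ratio = se_recent / se_full   # equals sqrt(9509 / 3884), whatever p is

print(f"SE with 9,509 games: {se_full:.4%}")
print(f"SE with 3,884 games: {se_recent:.4%}")
print(f"Ratio: {ratio:.2f}")
```

The ratio does not depend on the probability chosen: cutting the sample from 9,509 to 3,884 games widens the uncertainty by a bit more than half.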

Therefore, I'm not sure that these are better estimates of the true probabilities, but here they are anyway.  What we may have gained in relevance, we may have given up in the uncertainty around the estimate.

Saturday, January 24, 2009

Super Bowl Squares

It's that time of the year, when seemingly uninterested people run around feigning interest in what is now a true American celebration - the Super Bowl.  This year's Super Bowl features the Pittsburgh Steelers and the Arizona Cardinals.  Not that anyone outside of Phoenix and Pittsburgh actually cares who is in it.  That however won't prevent Super Bowl parties from cropping up across the nation on Sunday, February 1st.  And, at most of those parties, someone will usually run around trying to get the last few stragglers to participate in the second greatest tradition of the Super Bowl - the "Super Bowl Squares" (the first has to be watching the Super Bowl commercials).  

Basically, for a small donation, you get to place your name in a 10x10 square grid.  For multiple donations, you may be able to place your name on multiple squares.  Before the actual game starts, #s from 0-9 are drawn randomly and placed across each of the 10 columns.  The same process is repeated to the left of each row.  Then the name of team 1 is drawn and placed at the top, and the second team is placed on the left.  Now you have a grid that has each of the possible last-digits of the scores of each team.  When the game ends, you look at the score, then look at the last digits of each team's score, and see whose name corresponds to that permutation; that individual wins a pre-determined amount of the accumulated donations.  This process doesn't have to be limited to the game-ending score.  Many variations exist.  For example, smaller amounts can frequently be won based on the digit permutations at the end of each quarter.

This year, as I almost always do, I participated in one of these Super Bowl Squares.  I donated for 2 squares.  After all the squares were filled out, the #s randomly drawn, and the teams assigned, the coordinator of the game handed me my sheet.  I had drawn the following 2 permutations:

Arizona 2 - Pittsburgh 5
Arizona 8 - Pittsburgh 2

I promptly tossed the sheet in the recycling bin.

Later on, I decided to see for myself the likelihood of my winning.  

The analysis below shows the aggregated game-ending digit permutations and combinations for every NFL game played in the Super Bowl era, including playoff games.  That's 9,509 games!  That also means that there's a reasonable likelihood that the probabilities shown are close to the true probabilities.  As a matter of fact, every single permutation has been "hit" at least once.  A game ending in the 2 2 permutation has only happened once - on Sunday, December 5th, 2004 the Buffalo Bills beat the Miami Dolphins in Miami 42-32.

First, a little math.  I refer to both permutations and combinations.  There is a difference between the two.  A combination refers to a sequence or collection without regard to order.  A permutation is a combination with a specific order.  Here's an example.  Take what we commonly (and mistakenly) refer to as a "combination" lock.  We say to unlock the lock, "use combination 472".  Well, that's actually only mildly helpful.  Knowing those three #s alone we wouldn't be able to open the lock.  We need to know the specific order of that combination of #s.  In other words, is it 274, 247, 427, 472, 724, or 742.  So in this example, there is one combination.  There are six permutations.
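The lock example can be sketched in a few lines of Python using the standard library's itertools.permutations.  Representing the combination as a sorted tuple is just one convenient way to ignore order.

```python
from itertools import permutations

digits = (4, 7, 2)

# One combination: the same three digits, order ignored.
combination = tuple(sorted(digits))

# Six permutations: every distinct ordering of those digits.
orderings = sorted("".join(str(d) for d in p) for p in permutations(digits))

print(combination)   # (2, 4, 7)
print(orderings)     # ['247', '274', '427', '472', '724', '742']
```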

Ok, now on to the tables and charts below.

In TABLE 1, I show all 100 permutations of game-ending scores.  So for example, one can see that the likelihood of the game ending with the winning team's score ending in a 4, and the losing team's score ending in a 3 is 2.94% (to see this, in TABLE 1, go down to the row with the digit 4, then across to the column with the digit 3).  So, this specific permutation has a 2.94% likelihood.  

If you didn't care about whether it was the winning or losing team that had the 3 or the 4 in the last digit, as long as there was a 3 and a 4, then look to TABLE 3.  TABLE 3 shows the probabilities of each combination.  As such, the 3 4 or 4 3 combination has about a 3.67% likelihood of occurring.  CHART 2 graphically illustrates what's in TABLE 3.

TABLE 2 is not meaningful in and of itself, but simply shows the probability of any given digit occurring (note that in this table, the percentages add up to greater than 100.00% since "any 7" will include for example, the "1 7" combination, that will also show up under "any 1").  CHART 1 simply illustrates what's in TABLE 2.
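For anyone who wants to build these three tables from their own list of final scores, here is a minimal sketch.  It is not the spreadsheet I actually used.  The function name is mine; the first two sample games are scores mentioned on this blog (42-32 and 33-13), and the last two are made-up scores included only to show a permutation and its mirror image.

```python
from collections import Counter


def last_digit_tables(games):
    """games: iterable of (winner_score, loser_score) pairs.

    Returns three dicts of shares of games:
      table1: (winner digit, loser digit) permutations
      table2: 'any' digit appearing in either score
      table3: unordered digit combinations
    """
    games = list(games)
    if not games:
        raise ValueError("need at least one game")
    n = len(games)
    perm_counts, any_counts, combo_counts = Counter(), Counter(), Counter()
    for winner_score, loser_score in games:
        w, l = winner_score % 10, loser_score % 10
        perm_counts[(w, l)] += 1
        combo_counts[tuple(sorted((w, l)))] += 1
        for digit in {w, l}:          # count each digit once per game
            any_counts[digit] += 1

    def to_share(counts):
        return {key: count / n for key, count in counts.items()}

    return to_share(perm_counts), to_share(any_counts), to_share(combo_counts)


sample_games = [
    (42, 32),   # Bills over Dolphins, Dec 5, 2004
    (33, 13),   # Arizona over Carolina
    (24, 17),   # hypothetical score, for illustration only
    (17, 14),   # hypothetical score, for illustration only
]
table1, table2, table3 = last_digit_tables(sample_games)

print(table1[(4, 7)], table1[(7, 4)])   # each permutation separately
print(table3[(4, 7)])                   # the combination pools both
print(sum(table2.values()))             # exceeds 1, as noted for TABLE 2
```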

Let's take a look at what my chances are.  It's a little complicated so bear with me.  Remember, I have 2 specific permutations.  Ari 2/Pit 5 and Ari 8/Pit 2.  However, since I don't know ahead of time who will win the game, I need to average the 2 permutations that yield the 2 5 combination for the first scenario and the two that yield the 8 2 combination for the second.  I can do this by either going to TABLE 1, and adding the respective likelihoods of each of those permutations, or I can simply go to TABLE 3.  From TABLE 3 I can easily see that the 2 5 combination shows a likelihood of 0.36% (this is the 5 2 permutation likelihood of 0.23% plus the 2 5 permutation likelihood of 0.13%).  Therefore, the average expectation for the specific 2 5 permutation is 0.18%*.  For the second scenario, from TABLE 3, I can see that the likelihood of this combination is 0.28%, and hence the average expectation for the specific permutation is 0.14%.

* (Technically, I shouldn't be averaging the permutations' expectations.  What I should be doing is weighting each permutation by the likelihood of Arizona (or Pittsburgh) winning or losing the game.  So for example, if the likelihood of Pittsburgh winning the game is estimated to be 70%, then a truer expectation for my specific 2 5 permutation might be 0.7*0.23% + 0.3*0.13% = 0.20%.  However, if you assume that each team's likelihood of winning the game is close to 50%, then averaging is fine).
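The weighting in the footnote can be sketched as a tiny function.  The name and arguments are my own; the 0.23% and 0.13% figures are the TABLE 1 values quoted above, and the 70% is the footnote's example, not a real estimate.

```python
def square_win_probability(p_pit_wins, p_perm_if_pit_wins, p_perm_if_ari_wins):
    """Chance my Ari 2 / Pit 5 square hits, weighted by who wins the game."""
    return (p_pit_wins * p_perm_if_pit_wins
            + (1.0 - p_pit_wins) * p_perm_if_ari_wins)


P_5_2 = 0.0023   # winner ends in 5, loser in 2 (Pittsburgh wins)
P_2_5 = 0.0013   # winner ends in 2, loser in 5 (Arizona wins)

weighted = square_win_probability(0.70, P_5_2, P_2_5)
even_odds = square_win_probability(0.50, P_5_2, P_2_5)

print(f"Pittsburgh 70% favorite: {weighted:.2%}")
print(f"Coin-flip game:          {even_odds:.2%}")
```

With a 50-50 game the weighting collapses to the simple average used in the main text.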

So there you have it, the combined likelihood that I would win ANYTHING is about 0.32% (0.18%+0.14%).  

Hence why I tossed the sheet.  Hope you have better permutations!  Good Luck!

Saturday, January 17, 2009

AFC and NFC Championships - Predictions for Sunday

Like I said last week, I can't stay away from building predictive models, so here it is - last week's model applied to this week's championship games - Philadelphia Eagles at Arizona Cardinals, and Baltimore Ravens at Pittsburgh Steelers. I have included other predictions - these are from 3 sites/blogs I frequently follow. I have a lot of respect for the work these guys do. Their models of course, are much more sophisticated than mine - but give me a year, and I'll catch up! Also, I show you what Vegas thinks, just for sake of comparison.

Now, don't get fooled.  Even though we all independently seem to agree on predicting who will win (lone exception being advancednflstats prediction re: Eagles), this does not mean that these teams will win.  There is a big difference between a predicted outcome and actual outcome.  Outcomes in football games are very difficult to predict.  There are few scoring opportunities for each team during a game, and each score has huge variation (generally 3 pts or 7 pts).  There are numerous factors that are simply unpredictable, or, if they are predictable, have a very high degree of uncertainty around them.  In any case, the predictions are below.

Monday, January 12, 2009

A Logical Look at What Happened in the Carolina vs Arizona Game

The Carolina Panthers played in the toughest division in the NFC, the NFC South.  The Arizona Cardinals played in the easiest, the NFC West.  The Panthers completed a 12-4 season.  The Cardinals were 9-7.  The Panthers were 8-0 at home.  The Cardinals were 3-5 on the road.  The oddsmakers in Vegas decided that this was good enough to favor the Panthers by 9.5 to 10.0 points.

Result - Arizona beats Carolina 33-13

What Happened?
Jake Delhomme threw a career-high 5 INTs.  That was unexpected.  A shocker.  But, how unexpected was it, really?

Let's take a look at some #s.

First, what are the chances of winning a playoff game, when a QB throws X interceptions?  Here are the stats:

0 INTs - 184-51, a winning percentage of 0.783
1 INT - 140-112, a winning percentage of 0.556
2 INTs - 54-117, a winning percentage of 0.314
3 INTs - 17-76, a winning percentage of 0.183
4 INTs - 1-27, a winning percentage of 0.036
5 INTs - 0-11, a winning percentage of 0.000
6 INTs - 0-3, a winning percentage of 0.000

These are based on all playoff games in the Super Bowl era.  The data is courtesy of Cold Hard Football Facts.

As I have shown before, a QB's likelihood of throwing interceptions has been decreasing over time (based on the increasing value of that component in the QB Rating).  So, if one were to look at this season's regular season stats, I'm sure this table would look even more skewed.  (I don't have 2008 game by game data handy).  The point is still the same - throw INTs, and you dramatically reduce the likelihood that your team wins.  The fact that someone threw 5 picks is actually quite irrelevant.  Once you throw 2 picks, your likelihood of winning is dramatically reduced, and it's less than 1 in 5 if you throw 3.

Ok fine, so let's look at both QBs in this game: Jake Delhomme and Kurt Warner.  According to the table below, in Jake's career (including prior post season games), he had never thrown 5 INTs in a game.  However, he had thrown at least 2 INTs in a game 22% of the time, and at least 3 INTs in a game 6% of the time.  What's perhaps most interesting is that Kurt Warner had thrown at least 2 INTs in a game 28% of the time, and at least 3 INTs in a game 12% of the time.  

So, what does this all mean?  We need to look at the second table (see below the first table for the analysis).

According to this table, based on the likelihoods shown in the previous table, there was an 18% chance that at least one of the QBs in this game was going to throw at least 3 INTs (see the last column in the table below).  That's about 1 in 5.  That's much higher than I would have thought.  That's actually a higher likelihood than both QBs finishing the game having thrown 0 INTs.  Now admittedly, the likelihood that it was Delhomme and not Warner that was going to throw the 3 or more INTs was 2-to-1 against (given at least 1 QB would throw at least 3 INTs).  But the fact is, there was a pretty good chance that this was going to be a game where at least 1 QB threw multiple INTs.  Heck, there was about a 1 in 14 chance that at least 1 of them was going to throw 4 or more INTs - again, much higher likelihood than I would have thought (and you too, I'm guessing).
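Here is a sketch of the "at least one QB" arithmetic, assuming the two QBs' interception counts are independent of each other (my simplifying assumption).  It uses the rounded career rates quoted above, 6% for Delhomme and 12% for Warner, so it lands a little under the table's 18%, which presumably comes from unrounded inputs.

```python
def prob_at_least_one(p_a, p_b):
    """P(at least one of two independent events happens)."""
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)


P_DELHOMME_3PLUS = 0.06   # rounded career rate of 3+ INT games
P_WARNER_3PLUS = 0.12     # rounded career rate of 3+ INT games

p_either = prob_at_least_one(P_DELHOMME_3PLUS, P_WARNER_3PLUS)

# Which QB is the likelier culprit if exactly one of them does it?
p_only_delhomme = P_DELHOMME_3PLUS * (1.0 - P_WARNER_3PLUS)
p_only_warner = (1.0 - P_DELHOMME_3PLUS) * P_WARNER_3PLUS

print(f"At least one QB with 3+ INTs: {p_either:.1%}")
print(f"Warner-only vs Delhomme-only odds: {p_only_warner / p_only_delhomme:.1f} to 1")
```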

So there you have it.  An unexpected result.  Certainly.  But not that unexpected.  What should also be clear is that if you throw a pick, your chances of winning reduce quite significantly.  And for each subsequent one, the odds of winning decrease exponentially.  After 3 picks, it's lights out.


Saturday, January 10, 2009

NFL DIVISIONAL PLAYOFFS - Predicting the Winners

My intention with this blog was not to get into the prediction business, but I can't help myself. It is what I am trained to do. I like building models, and of course, "predicting" is a natural outcome. Although my focus with this blog is quarterbacks specifically, I thought I'd go out on a limb and expose a new game predicting model that I'm working on. Admittedly, my model doesn't have the sophistication of the models built by guys like Brian Burke over at AdvancedNFLStats.com, or Aaron Schatz and his team at the Football Outsiders (see their AFC predictions here, and their NFC predictions here). That all being said, I think I will continue to work on it over time, and, why not, let's put it to the test to see how mine stacks up.


The table below shows my predicted outcomes, assuming each team behaves 'as expected'. Of course, the football isn't round, and funny things happen when the ball bounces. The weather is another element that is not incorporated explicitly into the model.


Arizona at Carolina - All three of us agree that this is the 'easiest' game to predict - with Carolina winning handily. Vegas oddsmakers say a spread of 9.5/10 with the over/under at 48.5/49.0. My model concurs.


Baltimore at Tennessee - Should be a close game. FO calls it for Baltimore, and ANS calls it for Tennessee. My model suggests Tennessee in a close game. Vegas oddsmakers say a spread of 3.0, with the over/under at 34.0/35.0. I agree with the spread, but my model suggests more points (perhaps I need more work on my model)!


Philadelphia at NY Giants - Awfully close game - closest of all the games this weekend - FO on the one hand says Philly, but then he hedges and says it's going to be really, really close (a pick 'em). ANS is a little more bullish on the Giants (he also hedges his bet and says it's a 50-50 outcome if analyzed based on the whole season). My model agrees with both of these models/predictors, and says Philly in a squeaker. Vegas says G-men by 4, with the over/under at 40.0. My model says closer game than that, and a few more points (remember, Vegas needs to take 'perception' into account, as all they care about is getting half the money on one side and half on the other).


San Diego at Pittsburgh - FO says Pittsburgh, and so does ANS. I agree, with my model suggesting Pittsburgh by more than a field goal. The oddsmakers in Vegas suggest a 5.5/6.0 point game, with the over/under at 37.5/38.0. I think it's a bit closer, but also a few more points.


This is the first public exposure of my model. I'll see how it does this weekend, and continue to build/improve it. Hopefully, at some point in the 2009 season, I'll be comfortable enough to share more details as to the inputs. Suffice it to say that I feel comfortable enough to expose it today.


Let's play the games!


We can evaluate how we did on Monday.

Tuesday, January 6, 2009

Stat of the Week - #1

I'm introducing a new feature - Stat of the Week - little or no commentary - just a stat - mostly because I'm not sure I'll be able to post a blog every week because a) I love my day job, and that takes a lot of my time, and b) posting, especially an interesting, well-researched post, takes time, and I'm just not sure I have the time to do a detailed post every week.  So enjoy....

Sunday, January 4, 2009

Jets-Dolphins - You Could See it Coming

I wanted to post this the day before the Jets-Dolphins season ending game.  However, I was not able to, but the analysis still has merit.  Going into the game, looking at the way the 2008 NFL season had progressed for both the New York Jets and Miami Dolphins, you could see this coming.  The actual result was not as much of a shocker as you might otherwise have been led to believe.  Take a look at the table below - in particular, look at both Brett Favre's and Chad Pennington's last 4 games, and then look at the projected performance.  I really wasn't too far off.  There's no projection that would have suggested that Favre was going to throw 3 picks.  However, you could see at least 2 picks fairly easily, given that in 80% of the games this season he had thrown at least 1 pick, and in those games, he was about 50-50 to throw more than 1 pick.
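Chaining the two rough figures quoted above gives a back-of-the-envelope number for "at least 2 picks".  The variable names are mine, and both inputs are the approximate rates from the paragraph, not exact counts.

```python
p_at_least_one_pick = 0.80          # share of 2008 games with 1+ INT
p_more_given_at_least_one = 0.50    # "about 50-50" to throw more than 1

# P(2+ picks) = P(1+ picks) * P(2+ picks | 1+ picks)
p_two_or_more = p_at_least_one_pick * p_more_given_at_least_one

print(f"Chance of 2 or more picks: {p_two_or_more:.0%}")
```

That works out to roughly 2 games in 5, which is why a multi-pick game for Favre was hardly a long shot.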

Happy New Year!

Happy 2009 Everyone!

It's been a while since I last posted.  I was on vacation.  It made me realize how much I missed my computer, and my EXCEL database.  Anyways, I'm back, and I'll have a post tonight.  I was going to post something before the Jets-Dolphins game the last week of the season, showing how I had 'projected' a bad game for Brett Favre and a good game for Chad Pennington.  Well, my 'prediction' did come true, although my posting tonight on the subject will seem a bit hollow.  That's ok, it's the analysis that matters anyway.

Cheers.