Let’s say that you are one of the panelists on the Harris Poll or Coaches Poll, and your job is to rank the Top 25 Division I college football teams each week. How are you going to find time to watch all 50+ Division I games this week to make an informed vote?
The answer is, you probably won’t:
“It is a common allegation that some poll voters simply look at box scores before punching in their votes. These votes have a huge impact on who goes to BCS games, including the national championship. Only by watching the game or game tape (or by careful box-score scrutiny) can a coach determine if a 49-21 score was because of a fairly one-sided game, or because the winning team tried to make the score look more impressive when the game’s outcome was no longer in doubt.”
I hate seeing a team’s hard work put in the hands of a bunch of judges, as if college football were actually figure skating or American Idol. But when the judges don’t even watch the games? Maybe we can help these panelists visualize the box scores so they can see at a glance how close the games actually were. Charts and graphs to the rescue!
Charting the “Game Flow” — the lead over time
A first glance at the NCAA Top 25 scores from week 12 (11/20/2010) shows that there were some big wins, like Alabama defeating Georgia State by 56. There were also some squeakers, like Ohio State beating Iowa by only 3. But were the final scores indicative of how close the games were? Did Alabama run up the score at the end? Did Ohio State cruise with a comfortable lead, only to give up cheap points at the end?
Let’s visualize the lead over the course of these games to see if we can get a sense of the extent of the victory. We’ll begin with the Oklahoma State defeat of Kansas, 48-14:
This chart shows the size of Oklahoma State’s lead at every point in the game. The final margin of victory was 34 points, a very sound defeat if you are looking at the box score. But if you look more closely at the lead over time, the game was very close at halftime: just a 6-point game (20-14). Oklahoma State started running up points midway through the 3rd quarter, but the outcome was anything but certain up until then.
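For anyone who wants to build a chart like this, here is a minimal Python sketch of how the lead could be tracked from a list of scoring plays. The play list, the team names "A" and "B", and the function name `lead_over_time` are all made up for illustration. They are not the play-by-play from any game discussed here, and this is not necessarily how the charts in this post were produced.

```python
from typing import List, Tuple

# Hypothetical scoring log: (seconds elapsed, team, points scored on the play).
ScoringPlay = Tuple[int, str, int]


def lead_over_time(plays: List[ScoringPlay], winner: str) -> List[Tuple[int, int]]:
    """Return (seconds elapsed, winner's lead) after each scoring play.

    The series starts at (0, 0) because every game begins tied.
    A negative lead means the eventual winner was trailing.
    """
    lead = 0
    series = [(0, 0)]
    for seconds, team, points in sorted(plays):
        # Points for the winner widen the lead; points against shrink it.
        lead += points if team == winner else -points
        series.append((seconds, lead))
    return series


# Made-up game: team A wins by 7 after trailing in the third quarter.
plays = [
    (300, "A", 7),
    (1200, "B", 3),
    (2000, "B", 7),
    (3000, "A", 7),
    (3500, "A", 3),
]
series = lead_over_time(plays, winner="A")
for seconds, lead in series:
    print(f"{seconds:>5}s  lead {lead:+d}")
```

Plotting this series as a step chart (the lead holds steady between scores) gives the gameflow shape.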
Calculating the “Average Lead”
As a contrast, let’s consider another game that also had a 34-point margin of victory: Stanford over California (also with a final score of 48-14). In this game, Stanford actually had a huge 31-point lead at halftime, and cruised easily in the second half. In fact, without a score by California with 15 seconds left, the game would not have even been this close. So two games with the same final margin of victory, but clearly Stanford dominated more than Oklahoma State.
Is there a way we can describe how different these two games were? What if we tried to calculate the “average” lead for the game, that is, the average of the winning team’s lead over every moment of the game? Mathematically, that amounts to the “area under the curve” of the plot of the lead, divided by the length of the game. Here is what those two games look like side-by-side, with the average lead plotted on top of each:
Even though the margin of victory was 34 points in both cases, Stanford actually had an average lead of 24.8, whereas Oklahoma State had an average lead of only 10.44. Stanford’s larger lead over a larger portion of the game is reflected in its average lead being larger than Oklahoma State’s.
Here is what the game flow and average leads would look like for all games in the Top 25 from week 12:
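Here is a rough Python sketch of the average lead calculation as a time-weighted mean: add up lead times duration for each stretch between scores, then divide by the length of the game. The function name `average_lead`, the 3600-second regulation length (no overtime), and both sample games are assumptions for illustration only, not data from the games above.

```python
def average_lead(series, game_seconds=3600):
    """Time-weighted average of a step-function lead.

    `series` is a list of (seconds elapsed, lead) pairs sorted by time,
    starting at (0, 0). The lead is assumed to hold constant until the
    next scoring play, and the last value holds until the final whistle.
    """
    # Close the final interval at the end of regulation.
    points = list(series) + [(game_seconds, series[-1][1])]
    area = 0
    for (t0, lead), (t1, _) in zip(points, points[1:]):
        area += lead * (t1 - t0)  # rectangle: lead held for (t1 - t0) seconds
    return area / game_seconds


# Two made-up games that both end with a 21-point margin.
steady_build = [(0, 0), (900, 7), (1800, 14), (2700, 21)]  # one score per quarter
early_blowout = [(0, 0), (300, 21)]                        # 21-point lead after 5 minutes

print(average_lead(steady_build))   # 10.5
print(average_lead(early_blowout))  # 19.25
```

Same final margin, very different average leads, which is the Stanford vs. Oklahoma State situation in miniature.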
So, back to the Alabama game: yes, it looks like they did dominate. And as for Ohio State: yes, it was a squeaker (in fact, their average lead was below 0!). But let’s imagine we are a panelist for the Harris Poll. Do we really want to analyze 50 gameflows to determine whether a game was more or less dominant than the final score would indicate? Isn’t there an easier way to scan these scores?
Margin of Victory vs. Average Lead
Here is a scatter plot of the margin of victory vs. the average lead for each of these games, with a linear regression fit line added. The farther above that line a game sits, the more dominant the win was for its margin of victory. The farther below the line, the more the winner just squeaked by. We can see that the Stanford and Oklahoma State games are both aligned vertically at a margin of victory of 34, but Stanford had a much higher average lead. Similarly, Missouri’s (MIZZ) defeat of Iowa State aligns with Virginia Tech’s (VT) win over Miami at a margin of victory of 14, but Missouri had a much higher average lead (7.1 to Virginia Tech’s 1.0). Indeed, if you look at the gameflow graphs for those two games, you see that the VT game was much closer: it was tied as late as the 4th quarter.
So, looking at both margin of victory and average lead appears to have quickly identified a game that was closer than its final score suggests. But the system isn’t perfect: compare South Carolina’s (SCAR) 45-point defeat of Troy with Nevada’s 46-point defeat of New Mexico State. The games had similar margins of victory, yet South Carolina had a larger average lead. But looking back at the two gameflows, I couldn’t really say that one game was more dominant than the other.
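The “above or below the fit line” reading can be sketched in Python as a residual: the actual average lead minus the average lead the line predicts for that margin of victory. The sketch below fits a least-squares line to only the four games whose numbers are quoted in this post, so its line will not match the one in the chart, which was fitted to the full week of games. The function name `fit_line` is made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# (margin of victory, average lead) for the four games quoted in the text.
games = {
    "Stanford": (34, 24.8),
    "Oklahoma State": (34, 10.44),
    "Missouri": (14, 7.1),
    "Virginia Tech": (14, 1.0),
}

margins = [m for m, _ in games.values()]
avg_leads = [a for _, a in games.values()]
slope, intercept = fit_line(margins, avg_leads)

# Positive residual: more dominant than the margin suggests.
# Negative residual: closer than the margin suggests.
residuals = {
    team: avg - (slope * margin + intercept)
    for team, (margin, avg) in games.items()
}
for team, r in sorted(residuals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team:<15} {r:+.2f}")
```

Sorting a whole week of games by residual would give a panelist a single ranked list to scan instead of 50 charts.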
Comparison to the NFL
For fun, let’s look at the NFL to see how the relationship between average lead and margin of victory compares to the NCAA. A high correlation between average lead and margin of victory would mean that the final score typically indicates how close the game was; a low correlation would indicate that it does not, and that the game was probably more interesting than the final score would suggest.
So here are the gameflows for the NFL, Week 11:
And here is the scatter plot of the margins of victory vs. the average lead:
It seems that there is much more “scatter” in the NFL games; that is, average lead is less correlated with margin of victory. A linear regression analysis seems to confirm that:
NFL, Week 11:
Residual standard error: 4.229 on 15 degrees of freedom
Multiple R-squared: 0.3068, Adjusted R-squared: 0.2606
F-statistic: 6.639 on 1 and 15 DF, p-value: 0.02105
NCAA, Week 12:
Residual standard error: 5.535 on 15 degrees of freedom
Multiple R-squared: 0.834, Adjusted R-squared: 0.8229
F-statistic: 75.35 on 1 and 15 DF, p-value: 3.108e-07
The NFL had an adjusted R-squared of only 0.2606 to the NCAA’s 0.8229. So, it appears that at least for this week, scanning the final box scores in the NCAA actually gives a fairer indication of game dominance than it does in the NFL.
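As a quick sanity check on those numbers, the adjusted R-squared can be recomputed from the multiple R-squared with the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below assumes 17 games in each sample, which is what 15 residual degrees of freedom implies for a line with one slope and one intercept. The function name is made up for this example.

```python
def adjusted_r_squared(r2, n_obs, n_predictors=1):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# 15 residual degrees of freedom + 2 fitted parameters = 17 observations.
N_GAMES = 17

nfl = adjusted_r_squared(0.3068, N_GAMES)
ncaa = adjusted_r_squared(0.834, N_GAMES)

print(f"NFL:  {nfl:.4f}")   # 0.2606
print(f"NCAA: {ncaa:.4f}")  # 0.8229
```

With a single predictor and 17 games, the adjustment is small, so the comparison between the two leagues comes out the same whether you use the multiple or the adjusted figure.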
Gameflow is a quick and easy way to see how dominant a team’s victory was. You can also quantify a team’s “average lead”, which may be useful in conjunction with margin of victory to determine the extent of the victory. Demonstrating this statistically would require more work, including looking at a wider range of scores, comparing regression models, and “testing” the results against a human assessment of how close each game was.
And full disclosure — yes, my bitterness about college football may come partly because I am a fan of the Ohio State Buckeyes, perennially overrated and most likely to embarrass on a national stage. But still, it irks me that one of the finest sports ever created forgoes a system where winners advance and losers go home, and instead becomes all about style points. Would using gameflow charts or average lead make me any happier about the current system? Not really, but at least it helps address one of the million reasons we all hate the BCS system.
But the gameflow charts are a fun way to see at a glance what happened in football last week. As long as I don’t look too closely at my Cincinnati Bengals’ collapse against Buffalo.