Wednesday, February 4, 2015

The Best in Research Narratives: Let's Talk Football

I’ve written in the past about journalists who get data storytelling so, so wrong. But every now and then, they get it so, so right. As we reflect on the recent Super Bowl hoopla, from the Patriots’ Deflategate to cries for Pete Carroll’s resignation, it seems a great time to shed light on two sports pieces that caught my attention as brilliant examples of data storytelling.

First up: that New England team with the deflated balls. Slate did a spectacular job of building a narrative that makes you think, “Dear lord, have they been doing this for 8 seasons? And is Tom Brady behind it, even if unwittingly so?”

You can click here to read the article. Here’s what works so nicely about it.
  • It starts with a hypothesis – that if the Patriots cheated, we might see a difference in their fumble performance before vs. after the 2006 rule change on which team provides the game balls. Yes, it’s fundamental hypothesis testing! In a sports news article!
  • It builds the story around a clear metric – fumbles (turnovers) – and explains logically why this metric is the most valuable one to look at.
  • It offers compelling data-driven evidence that controls for potential covariates – including overall changes in league trends, the impact of individual players, and the confound of indoor stadium settings. It’s like this guy actually took a stats class, or five.
  • It visualizes that data in a way that makes the pattern alarmingly obvious.
  • It offers comparisons – actually educating readers by explaining probability, and what distribution we’d expect to see if there weren’t something fishy taking place.
  • It explains the outlier data – in this case in a way that serves to support the thesis that, well, the Patriots may have cheated.
  • It uses classic storytelling tactics: much like a movie, we have a hero (in this case, a champion quarterback) who might be a martyr and a catalyst for this conflict. Thank you, Tom Brady.
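
For the stats-curious, here is a minimal sketch of the kind of before-vs-after hypothesis test that first bullet describes. To be clear, this is not the method from the Slate article, and the numbers are invented purely for illustration; the function name and the "plays per fumble" framing are my own assumptions. It uses a simple permutation test: shuffle the season labels many times and see how often chance alone produces a gap as large as the one observed.

```python
import random


def permutation_test(before, after, n_iter=10_000, seed=0):
    """One-sided permutation test for mean(after) > mean(before).

    Returns the observed difference in means and an approximate p-value:
    the share of random relabelings whose difference is at least as large.
    """
    rng = random.Random(seed)
    observed = sum(after) / len(after) - sum(before) / len(before)
    pooled = list(before) + list(after)
    n_before = len(before)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # randomly reassign seasons to "before"/"after"
        fake_before = pooled[:n_before]
        fake_after = pooled[n_before:]
        diff = sum(fake_after) / len(fake_after) - sum(fake_before) / len(fake_before)
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_iter + 1)


# HYPOTHETICAL plays-per-fumble figures, one per season. Not real NFL data.
before_rule_change = [40, 42, 38, 45, 41, 39]
after_rule_change = [70, 65, 80, 75, 68, 72, 77, 74]

observed, p_value = permutation_test(before_rule_change, after_rule_change)
print(f"Observed difference in means: {observed:.2f}")
print(f"Approximate one-sided p-value: {p_value:.4f}")
```

A small p-value only says the gap is unlikely to be a fluke of which seasons landed in which group. It says nothing about why the gap exists, which is exactly why the article's work on covariates matters.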

I’m not saying the Patriots cheated, and frankly, I’m not enough of an NFL fan to care. But I will say that whether or not they did, this analysis of the situation provides a compelling argument that they did something different that they’re not admitting. And that’s something investigators would be negligent to ignore. Bravo, Slate.

Next up is Pete Carroll’s 1-yard-line call, AKA the call that every Monday-morning quarterback protested. Except my pal Rob Pait (read his clever take here) and FiveThirtyEight.com.

If you like data storytelling, you should probably just go ahead and bookmark FiveThirtyEight.com right now. Lots of people found it necessary to pontificate about the risk of Carroll’s pass play, without actually providing any data to back up that risk assessment. FiveThirtyEight actually bothered to do some math. [insert slow clap here.]

Here’s what I love about this analysis, which concludes that the perceived risk of the ultimately wayward call is far greater than the actual risk.
  1. It starts with a great, contentious, catchy headline. I know, clickbait, blah blah. But they actually come back to the headline and address it in the end. And even though questioning Belichick is really a secondary point, they do eventually make that point, while making an even better one along the way.
  2. The article starts off by painting a gripping picture of that last minute in the game, making you feel the urgency and pressure of the situation. It’s great writing. And can we, for just a moment, congratulate statisticians on being capable writers? Thank you, data analysts of FiveThirtyEight, for taking the time to learn communication skills. And thank you, journalists, for learning how to analyze and interpret data. May others follow in your footsteps.
  3. It provides a hilariously clever villain: Harvard. The article tees up two data tweets from Harvard Sports as darn compelling. And then slowly, meticulously, the author unravels how misguided the tweets really are when you don’t look at the entire context. I should confess that I went to Cornell, and thus embrace any chance to snort in the general direction of Harvard sports. But even if this had been my own alma mater, I would have been impressed by the way the writer set the stage and then tore it down.
  4. It addresses what most other analyses did not, namely the actual success stats of running vs. passing, and the objective of not only scoring, but running down the clock. Yes, there’s some estimating that needs to be done here, but the estimations are better than just blissfully ignoring these contextual confounds.
  5. It mixes qualitative and quantitative assessment, with a touch of game theory. In this world of data analytics, we sometimes forget that the most powerful research insights are often gleaned from combining methodological approaches. What makes this article so powerful is that it doesn’t choose one type of method – it uses them all. It constructs an argument that thinks through the “if-then” logic that a coach would utilize. It qualitatively assesses the pros and cons of those alternatives. And then it puts some data behind those arguments.
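
To make that “if-then” reasoning concrete, here is a toy sketch of how a run-vs-pass decision might be framed. This is not FiveThirtyEight's model. Every probability below is a placeholder I made up, and the class and function names are hypothetical. The one football fact it leans on is that an incomplete pass stops the clock, so a failed pass can leave more room for another try than a failed run does.

```python
from dataclasses import dataclass


@dataclass
class PlayOption:
    name: str
    p_touchdown: float       # chance this play scores
    p_turnover: float        # chance this play gives the ball away
    p_score_on_retry: float  # chance of scoring later if the play simply fails


def chance_of_scoring(option: PlayOption) -> float:
    """If the play scores, done; if it fails without a turnover, try again."""
    p_retry = 1.0 - option.p_touchdown - option.p_turnover
    return option.p_touchdown + p_retry * option.p_score_on_retry


# PLACEHOLDER inputs for illustration only. Not real NFL statistics.
run = PlayOption("run", p_touchdown=0.55, p_turnover=0.02, p_score_on_retry=0.40)
throw = PlayOption("pass", p_touchdown=0.50, p_turnover=0.02, p_score_on_retry=0.60)

for option in (run, throw):
    print(f"{option.name}: {chance_of_scoring(option):.3f}")
```

The output depends entirely on the made-up inputs, so it proves nothing about the actual call. The point is the structure: a play that looks worse on its own can come out ahead once you account for what happens after it fails.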

Back in college, my academic adviser talked me out of taking a media research class because it was “beneath me.” I thought it was terrible advice even at the time, so I took journalism instead, and then went on to roll my eyes as I started a company specializing precisely in the topic of media research. But that’s exactly why I love these examples of truly compelling data storytelling – they overcome the “us vs. them” nonsense that many data-oriented academics will espouse to impressionable students who nobly want to learn to communicate.

So, thank you, Slate and FiveThirtyEight. May thoughtful data storytelling and true research narratives become the way of the future.