There’s no such thing as advanced sports statistics

While “advanced statistics” are well ensconced in the baseball world, they are still in fairly nascent stages in the faster-paced worlds of hockey and basketball. For two reasons, baseball is particularly well suited to this so-called “advanced” analysis: 1) play essentially consists of discrete, one-on-one interactions and 2) a season is long enough to permit the accumulation of a statistically meaningful number of these interactions, from which trends can be derived. Hockey lacks both of these characteristics. It’s a fluid sport that rarely features isolated, one-on-one interactions, and numbers people say that the number of compilable events during an NHL season, which is roughly half as long as an MLB season, is too small to allow for statistical normalization. In other words, the sample size is too small.

Lee Panas’ book on advanced baseball statistics, Beyond Batting Average, which I began reading earlier this year, begins with the deceptively helpful reminder that “[w]ins and losses are indeed what matter.” Statistical data help us understand why teams won or lost and whether and how they might win or lose in the future.

In the hockey world, advanced statistics, in general, aren’t too advanced just yet, at least when compared with the baseball sabermetric world. At present, the central concept is that, because goals (an obvious leading indicator of success, i.e., wins) are too rare to be statistically useful, advanced hockey statistics orient themselves around possession. Because it is difficult in practice to measure time of possession with useful precision, however, the leading metrics, known as Corsi and Fenwick, simply track the one thing a player and his team can do only when they possess the puck: shoot it.
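To see why rarity matters, here is a minimal sketch. It assumes every event can be treated as an independent coin flip between the two teams, which is a simplification and not a claim about how hockey actually works. The function name `share_standard_error` and both event totals are hypothetical, chosen only to contrast a small count (like goals) with a large one (like shot attempts).

```python
import math


def share_standard_error(share: float, n_events: int) -> float:
    """Binomial standard error of an observed share built from n_events."""
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return math.sqrt(share * (1.0 - share) / n_events)


# Hypothetical season totals for both teams combined (illustrative only,
# not real NHL data).
HYPOTHETICAL_GOALS = 400
HYPOTHETICAL_SHOT_ATTEMPTS = 8000

# Uncertainty around an observed 50% share of each kind of event.
goal_se = share_standard_error(0.5, HYPOTHETICAL_GOALS)
attempt_se = share_standard_error(0.5, HYPOTHETICAL_SHOT_ATTEMPTS)

print(f"goal share uncertainty:         +/- {goal_se:.4f}")
print(f"shot-attempt share uncertainty: +/- {attempt_se:.4f}")
```

Under these assumptions, the share built on the larger count is pinned down several times more tightly, which is the intuition behind counting shot attempts instead of goals.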

If you prefer an expert with a more conversational style, here’s Grantland’s Sean McIndoe:

Corsi and Fenwick are two of the terms you’ll hear most often as you’re faking your way through the world of advanced stats. They measure essentially the same thing: how many shots are directed at each net at even strength.

Note that that’s “shots directed at the net,” not just plain old shots. That’s because we’re including traditional shots (i.e., goals and saves) as well as shots that miss the net. Corsi also includes blocked shots, while Fenwick doesn’t. (Yes, there’s a reason for that. No, you will never need to actually know what it is.)

You can express Corsi and Fenwick as a plus/minus or a percentage, and apply them to either a team or an individual player. If a team is taking more shots than it’s allowing, that means it’s (very probably) possessing the puck more than its opponents, which means that (all else being equal) you’d expect it to score more goals than the other team.

This is where you’ve probably started mumbling about shot quality, since Corsi and Fenwick and any variations seem to assume that all shots are created equal. That’s obviously not true, since a breakaway or a goalmouth tap-in are more likely to result in a goal than an unscreened shot from the point. But as it turns out, there’s little evidence that teams or players can get consistently outshot while still getting better chances. And while there have been efforts to track scoring chances as opposed to just shot attempts, the results usually just wind up mapping pretty well to each other. So over the long term, we can largely ignore shot quality — not because it doesn’t exist, but because it tends to even out over time.

Corsi and Fenwick turn out to be great predictors of future success. Learn to love them.

Practical tip: As a side note, Corsi and Fenwick are both named after people. They’re not acronyms. This doesn’t sound important, but it is. If you ever write “CORSI,” someone will immediately point at you and yell “Fraud!” At this point, they’ll quite possibly stab you.

(emphasis added).
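The definitions in McIndoe’s passage can be sketched in a few lines of Python. This is only an illustration: the `ShotAttempts` container, the helper names, and the single-game numbers are all hypothetical, and the sketch assumes the counts have already been limited to even-strength play.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShotAttempts:
    """Even-strength shot attempts taken by one side (hypothetical container)."""
    goals: int
    saves: int    # attempts the opposing goalie stopped
    misses: int   # attempts that missed the net
    blocked: int  # attempts blocked before reaching the net

    def corsi(self) -> int:
        # Corsi counts every attempt: goals, saves, misses, and blocked shots.
        return self.goals + self.saves + self.misses + self.blocked

    def fenwick(self) -> int:
        # Fenwick is the same count with blocked shots left out.
        return self.goals + self.saves + self.misses


def plus_minus(attempts_for: int, attempts_against: int) -> int:
    """Express a shot-attempt count as a differential."""
    return attempts_for - attempts_against


def percentage(attempts_for: int, attempts_against: int) -> float:
    """Express a shot-attempt count as a share of all attempts in the game."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * attempts_for / total


# Made-up numbers for one game, for illustration only.
team = ShotAttempts(goals=3, saves=27, misses=12, blocked=18)
opponent = ShotAttempts(goals=2, saves=22, misses=10, blocked=6)

corsi_diff = plus_minus(team.corsi(), opponent.corsi())
corsi_pct = percentage(team.corsi(), opponent.corsi())
fenwick_diff = plus_minus(team.fenwick(), opponent.fenwick())
fenwick_pct = percentage(team.fenwick(), opponent.fenwick())

print(f"Corsi:   {corsi_diff:+d} ({corsi_pct:.1f}%)")
print(f"Fenwick: {fenwick_diff:+d} ({fenwick_pct:.1f}%)")
```

The same functions apply to an individual player if the counts are restricted to the attempts that occur while that player is on the ice.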

At an accelerated pace befitting these internet times, hockey fans and commentators have followed the path laid out by their baseball counterparts over the past decade: while some endeavor to develop new and improved ways to measure performance, others resort to throwing rotten fruit at the innovators in an attempt to draw attention to themselves.

As an example, here’s Daniel Wagner, who definitely is not a fan of the wave, discussing the latter group:

[T]his is also one of the biggest problems that people seem to have with so-called advanced statistics: they’re almost entirely reliant on counting shots. Corsi and Fenwick are both shot-based statistics that are pretty much the opposite of “advanced.” All they are is adding and subtracting shots. The more shots for your team and the fewer shots against, the better. Outshoot your opponent enough, particularly at the right time of the game (such as when the score is tied or within one goal), and you’ll win a lot more games than you lose.

If this seems like an incredibly simplistic view of hockey, that’s because it is. It’s also a completely inaccurate view of hockey. That isn’t to say that Corsi and Fenwick aren’t useful, because they certainly are. As Cam points out, shot-based analytics have impressive predictive power. But they also are coming at hockey from the completely wrong end.

I believe this is part of the reason why so many people are resistant to shot-based statistics. What matters is winning, winning requires goals, and a high volume of shots does not, strictly speaking, create goals. Shots are a by-product and not a cause.

But there’s still something about saying that more shots is better that rankles. When you watch a game, low percentage shots with no chance of going in certainly don’t look like they’re helping a team win. A defenceman that constantly misses the net is infuriating. I’ve even had discussions with people who suggest that players could start gaming the system if they knew their coach or GM used Corsi by intentionally taking more low-quality shots instead of looking for a better scoring chance. When you watch a hockey game, you can see that not all shots are equal, but Corsi and Fenwick treat them as if they are.

You don’t go to a hockey game to see your team take a lot of shots; you go to see your team score goals.

(emphasis added).

First, shots are a cause of goals, perhaps the leading one. Second, and more important, the reference to “gaming the system” illustrates that the speaker is missing the mark.

As mentioned above, Corsi and Fenwick are doubly derivative: a shot, even a bad one (or, in the case of Corsi, a blocked one), is an indicator of possession, and possession, in turn, is an indicator of winning. The shared thesis of these “advanced” statistics is not, as Wagner’s article suggests, that shots = wins, but, importantly, that shots = possession = wins.

The rejection of “advanced” statistics on grounds such as the one put forth in the article quoted above reduces to an opinion that the measured event (shots, broadly defined) is too far afield from the ultimate goal (i.e., winning).

The problem for the traditional crowd is that the traditional metrics usually are at least as many steps removed from winning as the ones bearing the “advanced” label. Take baseball: who could accept batting average but reject on-base percentage? For hockey, why accept plus/minus but reject Corsi or Fenwick?

The “advanced” label is misleading (and unnecessarily divisive) because it suggests that the newer metrics are something meaningfully different, when, instead, they are merely new. All of the sports statistics we have at our disposal, both the old and the new, have the same fundamental nature and purpose: they all are intermediate proxies for ultimate success. The only question about a given metric should be the degree to which it is a proxy for success, and there’s nothing “advanced,” “fancy,” or especially “traditional” about that. It’s merely a question, and these are merely different ways of trying to measure the same thing.
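One way to put a number on “the degree to which it is a proxy for success” is to correlate a metric with winning percentage across teams. The sketch below is only one possible approach, not a method taken from any of the writers quoted here. The `pearson_r` helper, the metric names, and every data point are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented data for five imaginary teams (not real results).
win_pct = [0.40, 0.45, 0.50, 0.55, 0.60]
metric_a = [46.5, 47.5, 50.5, 51.5, 54.0]  # hypothetical metric that tracks winning
metric_b = [3.0, -2.0, 1.0, -1.0, 2.0]     # hypothetical metric that barely does

r_a = pearson_r(win_pct, metric_a)
r_b = pearson_r(win_pct, metric_b)

print(f"metric A vs. winning: r = {r_a:.2f}")
print(f"metric B vs. winning: r = {r_b:.2f}")
```

On this view, whether a metric is old or new is beside the point. What matters is how strongly it tracks winning, and both kinds of metric can be put through the same test.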

______________________________________________________________

6 thoughts on “There’s no such thing as advanced sports statistics”

  2. It seems to me that if a coach is deciding a player’s worth based on Corsi/Fenwick, then that coach probably isn’t worth playing for. It seems to me that a useful hockey statistic would be a +/- of the time the puck spends in the offensive zone versus the defensive zone. But since those numbers probably aren’t readily available, pucks shot toward the goal probably correlate strongly with it. That is, only when the puck is in the offensive zone can a high-quality scoring opportunity (whatever the heck that is) arise (not accounting for big breakaways, which are pretty exceptional cases).
