
Updating Player Usage Charts

posted Apr 3, 2016, 3:48 PM by Robert Vollman
Why are the Hockey Abstract player usage charts out of date, and when will they be updated?

As explained right below the title on those charts, the data comes from the Behind the Net website. Robb Tufts routinely gets an updated copy of the data from Behind the Net, and updates and formats the Tableau visualization in question, which is a product of his own design and programming (though I like to think that I've had some influence). However, there have been no updates in almost two months.

So why isn't the data on Behind the Net being updated?

All of the data for Behind the Net, and virtually every other hockey stats website out there, comes directly from parsing the NHL's game files. You may remember that the NHL redesigned its website on February 1, right? Well, that affected the location and format of those data files, which "broke" some of the automated feeds.

Since then, some developers have had the opportunity to update the programming behind their own hockey stats websites to reflect these changes, but that's not always feasible. It takes a lot of time to design these websites, and everyone is doing it as a spare-time hobby, on a volunteer basis.

You can usually count on people to find the spare time at some point in the summer, but mid-season format changes are a great way to submarine most of the websites that are being managed by people with families and full-time jobs.

Can I get the data somewhere else in the meantime?

Not at the moment, no, not all of it. 

Player usage charts use three primary pieces of information: Relative Corsi (aka shot attempts, SAT), zone start percentages, and quality of competition. The first two are available in a number of different places in one form or another, but quality of competition is not.

What is quality of competition? 

Essentially, we know roughly how many minutes and seconds every player has spent on the ice against every single opponent, based on information contained in the NHL game files. You can estimate the average level of competition a player faces by using that information to calculate a weighted average of his opponents for a particular statistic, with each opponent weighted by the ice time spent against him. For example, you could use scoring rate, plus/minus, ice time, or whatever you want.
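To make the weighted average concrete, here is a minimal Python sketch. It is only an illustration, not the code behind any of the websites mentioned here: the function name, the dictionary layout, and the opponents and numbers are all made up, and I'm assuming each opponent is weighted by the head-to-head seconds spent against him.

```python
def quality_of_competition(head_to_head_seconds, opponent_stat):
    """Ice-time-weighted average of a chosen statistic across a player's opponents.

    head_to_head_seconds: {opponent name: seconds on the ice against him}
    opponent_stat:        {opponent name: value of the chosen statistic}
    Both layouts are assumptions made for this sketch.
    """
    total_seconds = sum(head_to_head_seconds.values())
    if total_seconds == 0:
        raise ValueError("player has no ice time against any opponent")
    weighted_sum = sum(
        seconds * opponent_stat[opponent]
        for opponent, seconds in head_to_head_seconds.items()
    )
    return weighted_sum / total_seconds


# Hypothetical numbers: seconds faced, and each opponent's scoring rate.
seconds_against = {"Opponent A": 600, "Opponent B": 300, "Opponent C": 100}
points_per_60 = {"Opponent A": 2.0, "Opponent B": 1.0, "Opponent C": 3.0}

qoc = quality_of_competition(seconds_against, points_per_60)
print(qoc)  # 1.8
```

Swapping in a different dictionary for the statistic (plus/minus, ice time, and so on) gives a different variation of quality of competition without changing the calculation itself.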

For these player usage charts, a player's quality of competition is based on the weighted average of his opponents' Relative Corsi (which is simply the team's shot attempt differential per 60 minutes when that player is on the ice, minus the same thing when he's not).
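As a rough sketch of that definition of Relative Corsi, assuming we already have shot attempt counts and ice time for when a player is on and off the ice (the function names, argument names, and numbers below are mine, for illustration only):

```python
def per_60(differential, seconds):
    """Scale a shot attempt differential to a per-60-minutes rate."""
    return differential * 3600 / seconds


def relative_corsi(on_for, on_against, on_seconds,
                   off_for, off_against, off_seconds):
    """Team's shot attempt differential per 60 minutes with the player
    on the ice, minus the same thing with him off the ice.
    Both ice-time arguments must be greater than zero."""
    on_rate = per_60(on_for - on_against, on_seconds)
    off_rate = per_60(off_for - off_against, off_seconds)
    return on_rate - off_rate


# Hypothetical player: +10 in 60 minutes on the ice,
# while his team is -10 in 120 minutes without him.
rel = relative_corsi(50, 40, 3600, 90, 100, 7200)
print(rel)  # 15.0
```

Here the on-ice rate is +10 per 60 minutes and the off-ice rate is -5 per 60 minutes, so the player's Relative Corsi is +15.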

This particular variation of quality of competition is only available at Behind the Net, nowhere else. In fact, this statistic was devised by Gabriel Desjardins himself, who runs Behind the Net. In double fact, unless Hockey Analysis or some other now-defunct website beat him by a narrow margin, I believe that Behind the Net was also the first website of any kind to provide information like Corsi, zone start percentages, and quality of competition.

Can you parse the data and calculate it yourself?

Wow, no. I know how to do it, but that's a lot of work, and I have my hands full as it is.

OK, how about someone else?

Robb Tufts has chatted with a few other developers to see if anyone else has plans to add this statistic to their website, and has contemplated doing it himself, but everybody is stretched on time. 

The fact that this statistic has existed for years, and that so many websites have come and gone in that time without including quality of competition, means that I wouldn't count on it happening any time soon.

How about using one of the other variations of quality of competition?

There is one possibility. War on Ice has a variation of quality of competition (and teammates) that's based on the weighted average ice time of a player's opponents, instead of on Relative Corsi. It was an idea first advanced a few years ago by Eric Tulsky, now of the Carolina Hurricanes.

Both versions produce very similar results. The only real difference is that sometimes the best players don't get the most ice time. In such cases, the time-based variation will slightly favour those players who are taking on the big-name, big-minute players, while the original Corsi-based version will slightly favour those taking on the most effective players, whether they're on the top lines or not. 
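Here is a toy Python comparison of the two variations. Everything in it is hypothetical, including the opponents, the numbers, and the helper function, and I've deliberately exaggerated the gap between the two versions so the effect is visible. Real results are much closer together.

```python
def weighted_average(seconds_against, opponent_values):
    """Average of opponent_values, weighted by head-to-head seconds."""
    total = sum(seconds_against.values())
    return sum(
        secs * opponent_values[opp] for opp, secs in seconds_against.items()
    ) / total


# Hypothetical opponents: "Workhorse" plays the most minutes,
# but "Sheltered" is the most effective by Relative Corsi.
rel_corsi = {"Star": 5.0, "Workhorse": -2.0, "Sheltered": 8.0}
minutes_per_game = {"Star": 20.0, "Workhorse": 24.0, "Sheltered": 12.0}

# Head-to-head seconds for two hypothetical players.
player_x = {"Star": 300, "Workhorse": 600, "Sheltered": 100}  # faces big minutes
player_y = {"Star": 300, "Workhorse": 100, "Sheltered": 600}  # faces effective players

time_x = weighted_average(player_x, minutes_per_game)
time_y = weighted_average(player_y, minutes_per_game)
corsi_x = weighted_average(player_x, rel_corsi)
corsi_y = weighted_average(player_y, rel_corsi)

print(time_x, time_y)    # 21.6 15.6 -> time-based version rates X's competition higher
print(corsi_x, corsi_y)  # 1.1 6.1   -> Corsi-based version rates Y's competition higher
```

The calculation is the same in both cases. The only thing that changes is which statistic gets averaged across a player's opponents.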

However, the differences between them are almost negligible, and the substitute is perfectly satisfactory.

Even if we went this route, there are a few problems. First, it would take time to program the player usage charts to get the data from somewhere else. Furthermore, such a change would be redundant, since War on Ice already has a player usage chart tool of its own, which does indeed use the time-based version of quality of competition.

Finally, the designers of War on Ice have been hired by NHL teams: Andrew Thomas and Alexandra Mandrycky by the Minnesota Wild, and Sam Ventura by the Pittsburgh Penguins. As such, the website is no longer being maintained, and may some day disappear altogether, just like Extra Skater disappeared when Darryl Metcalf was hired by the Toronto Maple Leafs in August 2014.

So what should we do?

Be patient. The player usage charts aren't likely to be updated in a timely manner, but they will be updated. Eventually, Gabriel Desjardins will have time to update Behind the Net, or someone else will add his metric to their website, or some other solution will present itself.

In the meantime, use the War on Ice player usage charts, stay tuned for updates, and remember the importance of supporting those who create these innovations and/or who make them available to others. In Desjardins' case, his favourite charity is Education in Need for El Salvador, and in my case, I appreciate all book sales.