07-28-2012, 12:07 AM
[insert joke here]
cptjeff
Join Date: Sep 2008
Location: Washington, DC.
Country: United States
Posts: 10,751
vCash: 450
Originally Posted by Polansky
I also think it will be very difficult.

Just imagine the following simple situation. Patrick Kane comes down the left side of the rink, the defence plays him perfectly, they keep him to the outside and allow a weak shot from almost the corner. A stick even gets in there causing the puck to slide weakly on the ice. Great defence from all five players on the ice. Oh no! The puck goes in. Game over! Stanley Cup goes to Chicago.

In real life, a team played good defence and really worked hard to keep a dynamic player to the outside, but on the stat sheet, all it says about that shift is one goal against.

I think finding a way to take that situation and fairly put that into a statistic is going to be really difficult. That goal is entirely on the goalie, but there is no way to know unless we have humans rating the difficulty of a goal, and as UZR in baseball shows, the second humans have to rate something, there are problems.
Another issue is the relative paucity of quantifiable events. When you're manipulating stats, it's best to go off hard data where possible. In baseball, you get a pretty solid point of data with every pitch. Yes, it's not free of complicating factors, but there are far, far fewer than there are in hockey. For every shot, there's about a minute and a half of play in an average hockey game, much of which can't be quantified. And the shots themselves are wildly different in situation and quality. It's difficult to factor that in.

When you build advanced stats on a foundation so weak, your uncertainty multiplies, and in some of the stuff I've seen trotted out, the uncertainty is multiplied to the point where the stat is utterly useless.
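
To give a feel for how fast that happens, here's a rough sketch in Python. The 10% error per input is a number I made up for illustration, and `worst_case_relative_error` is a hypothetical helper, not anything from a real stats package. It assumes the inputs get multiplied together and that every error leans the same way, so it's an upper bound, not a measurement.

```python
def worst_case_relative_error(per_input_error, n_inputs):
    """Worst-case relative error of a product of n_inputs factors,
    each off by at most per_input_error (e.g. 0.10 for 10%),
    assuming every error pushes in the same direction."""
    return (1 + per_input_error) ** n_inputs - 1


# Assumed 10% error per input; the input counts are arbitrary examples.
for n in (1, 3, 5, 8):
    error = worst_case_relative_error(0.10, n)
    print(f"{n} inputs -> up to {error:.0%} off")
```

Chain five shaky inputs together and the result can be off by more than half, even though each piece looked tolerable on its own.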

Some of the stats may be valuable, but I'm very dubious of the category as a whole. Going back to baseball again, with those advanced stats the external factors behind each result can be assumed to cancel out over time or to be so small as to be insignificant over a large sample. Think Newton's laws and relativity. Yes, relativity always applies, but until you hit 0.1c, you can just zero out the effect with no consequence unless you need more than two or three significant figures. That's baseball. The external stuff can be worked around or ignored without much consequence. With hockey, the factors you can't quantify are much more significant. When you have 20 significant variables and only 5 values, you can't solve the equation. It's simply impossible, but that's what a lot of these advanced stats try to do, and more often than not they seem to do it by simply deleting 15 of the variables and pretending that they don't exist.
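
To put a number on the variables-versus-values point, here's a rough sketch in Python. The coefficients are made up (a hypothetical 5-by-20 grid that has nothing to do with real hockey data), and `rank` is just a helper written for this illustration. It counts how many unknowns are left with nothing to pin them down.

```python
from fractions import Fraction


def rank(rows):
    """Count independent equations using Gaussian elimination
    with exact fractions (no floating-point rounding)."""
    m = [[Fraction(x) for x in row] for row in rows]
    found = 0
    for col in range(len(m[0])):
        # Look for a row at or below the current one with a nonzero entry.
        pivot = next((r for r in range(found, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[found], m[pivot] = m[pivot], m[found]
        # Clear this column from every row below the pivot row.
        for r in range(found + 1, len(m)):
            factor = m[r][col] / m[found][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[found])]
        found += 1
        if found == len(m):
            break
    return found


N_VARIABLES = 20    # factors that affect the result
N_MEASUREMENTS = 5  # values we can actually record

# Hypothetical coefficients: row i, column j holds (j + 1) ** i.
system = [[(j + 1) ** i for j in range(N_VARIABLES)]
          for i in range(N_MEASUREMENTS)]

independent = rank(system)
free = N_VARIABLES - independent
print(f"{independent} independent equations, {free} unknowns left undetermined")
```

With 5 independent measurements and 20 unknowns, 15 directions stay completely free, so any single answer only comes from assuming those away.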
