By The Numbers: Hockey Analytics... the Final Frontier. Explore strange new worlds, to seek out new algorithms, to boldly go where no one has gone before.

Introducing a new stat: Location Adjusted Expected Goals Percentage

I've always wanted to see basically this exact stat done but I'm not savvy enough to do it myself. Especially the heat map, that's awesome.

One thing that struck me as odd was that even the top players had a LAEGAP of under 50%. As the sample sizes increase, those should go up, right? I think I'd be more interested in seeing the data without having it adjusted down due to degrees of confidence, maybe with a filter of a certain minimum number of events.

One question, though: it seems like you have based the metric on shots. Wouldn't it have been better to base it on Corsi/Fenwick instead?

1. You'd get a larger sample size for each player.

2. Just because a shot from an area is more likely to go in, it doesn't mean that a shot attempt from there is more likely to do the same. There might be a higher frequency of blocks, or it might be harder to hit the net from that area.

Point #2 is likely pretty (or completely) insignificant. But there is a reason Corsi and Fenwick are preferred over shots, so shouldn't you have used one or both of them instead?

Missed/Blocked shots don't have their locations recorded.

Also, to everyone asking for the percentages+intervals, I don't have the data right now as I'm out of town but I'll update the article to include the numbers when I get home.

Well, that settles it!

Are you going to publish the league wide data, and are you going to continue calculating and posting LAEGAP updates during the season?

I'm curious about something: how does the distribution of players to teams work out? Just at a glance, for example, in your top 30 there are a couple of teams with 6 players represented and several others with 3 or 4.

Seems like it might be a stat heavily influenced by the team playstyle?

Quote:

I've always wanted to see basically this exact stat done but I'm not savvy enough to do it myself. Especially the heat map, that's awesome.

One thing that struck me as odd was that even the top players had a LAEGAP of under 50%. As the sample sizes increase, those should go up, right? I think I'd be more interested in seeing the data without having it adjusted down due to degrees of confidence, maybe with a filter of a certain minimum number of events.

Great work though. Keep it up.

The reason the numbers are low is that he used the lowest possible value in the interval.

For example, Dan Boyle's number may have been 53%, plus or minus 3.7%, meaning it could lie anywhere in the range of 49.3% to 56.7%, but Wesleyy is simply reporting the lowest possible value.

Based on the data in the article, the +/- 3.7% example dramatically understates the margin of error actually seen: Boyle's LAEGAP from the numbers appears to be 37.9/(37.9+23.4) = 37.9/61.3 = 61.8%, so for 49.3% to be his lower bound, the margin of error would have to be 12.5%.

Now, I don't have the full data so I may be incorrect, but this is just to explain why all the numbers seem low.

As he said:

Quote:

To allow for easy sorting, we will take the lowest value possible in the interval. By taking the lowest possible value we will undervalue a player, particularly low event ones, much more often than we will overvalue them, which I think is the better of the two.
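For concreteness, here is a minimal Python sketch (with made-up player totals) of the sorting rule the quote describes: compute a confidence-interval lower bound for each player's expected-goals proportion and sort by it. The Wilson interval used here treats EGF/EGA as binomial counts, which, as schuckers points out later in the thread, isn't strictly appropriate for fractional expected-goal totals, so treat it purely as an illustration of the idea.

```python
import math

def wilson_lower(egf, ega, z=1.96):
    """Lower bound of the 95% Wilson score interval for EGF/(EGF+EGA)."""
    n = egf + ega
    if n == 0:
        return 0.0
    p = egf / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)

# Made-up (EGF, EGA) totals. The small-sample player has the higher raw
# percentage, but the lower bound ranks him below the large-sample player,
# which is exactly the "undervalue low-event players" behaviour described.
players = {"big_sample": (37.9, 23.4), "small_sample": (5.0, 2.0)}
ranked = sorted(players, key=lambda p: wilson_lower(*players[p]), reverse=True)
```

Note that plugging in Boyle-like totals of 37.9 EGF and 23.4 EGA gives a lower bound of roughly 0.493, in the same neighbourhood as the article's figure, even though the raw proportion is about 0.618.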

Regarding the author and your publication of this: I first want to say great job. As others have said, it's extraordinary to do something like this at 17. I'm not sure what you want to do as a career, or where you want to go to university, but if you're interested in something with a math background, make sure to tell a recruiter about this.

Regarding the actual metric: I find it very interesting, and it's certainly something worth looking into further, especially when combined with other metrics. Before something like this, I could compare (over larger sample sizes) Corsi to GF% to try to get an idea of whether a player was helped or hindered by variation in shooting percentage. Something like on-ice shooting percentage can help with this as well.

LAEGAP goes a step further and can help me answer a question like "Is player X really suffering from (causing?) poor shooting while on the ice, or does his presence merely increase the likelihood of bad shots being taken?" Scott Gomez is notorious for producing good Corsi numbers but terrible on-ice shooting percentage numbers. This metric would help us see why there's such a discrepancy. Have he and his teammates simply shot poorly while he's on the ice, or do they take more low-percentage shots?

As others have noted, it would be interesting to expand this to Fenwick or Corsi events (looking at Corsi shooting percentage or Fenwick shooting percentage), although current data collection does not allow this. The beauty of developing a framework like this is that it would be fairly simple to plug in the data if/when it becomes available.

I'm glad to see that you reviewed repeatability. That's one of the two key things I look at for a stat with regard to its predictive power: 1) Can players repeat it? 2) Does it correlate with winning? Question 1 basically asks whether it has any predictive power: if you can't repeat a performance of a given statistic, then it doesn't do much to tell me about your future results. Question 2 basically asks whether it's useful (or harmful) to be good at a stat: just because I can repeat a statistical performance doesn't mean it helps my team.

Now, by the very nature of being based on expected goals, one would expect LAEGAP to correlate positively with winning, but I'm curious if you looked at this at all. Did teams with strong LAEGAP, or large numbers of good LAEGAP players, tend to fare better than those with weak LAEGAP or few good players?
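Both checks come down to simple correlations once the data is in hand. A tiny sketch, with entirely hypothetical numbers (the `laegap` and `pts_pct` values below are made up, not from the article):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Check 1 (repeatability): correlate each player's stat in season N
# with the same stat in season N+1.
# Check 2 (usefulness): correlate team LAEGAP with team points percentage.
laegap = [0.52, 0.48, 0.55, 0.45, 0.50]    # hypothetical team LAEGAP
pts_pct = [0.60, 0.47, 0.64, 0.42, 0.55]   # hypothetical points %
r = pearson(laegap, pts_pct)
```

A high year-to-year `r` answers question 1; a high LAEGAP-vs-points `r` answers question 2.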

And a quick technical note: on your chart in the LAEGAP vs. Corsi section, you switched the games played and LAEGAP columns.

Quote:

I'm curious about something: how does the distribution of players to teams work out? Just at a glance, for example, in your top 30 there are a couple of teams with 6 players represented and several others with 3 or 4.

Seems like it might be a stat heavily influenced by the team playstyle?

Given that we're dealing with events involving the player and the other four on his team while he's on the ice, you would expect players who play together a lot to show similar numbers.

For example, if two players played every second together, never spending a second on the ice without the other, they would be identical from a Corsi, QualComp, LAEGAP, etc. perspective.

Seeing combinations like Pacioretty and Gallagher; Toews, Hossa, and Saad; and Seguin and Bergeron isn't that surprising, given that these players spent large amounts of time on the ice together and thus have a large number of data points in common.

Fantastic! It would be great if this stat were portrayed in a team setting like Corsi and Fenwick are. I know the Leafs get shafted by Fenwick and Corsi because they give up a tonne of shots from low-percentage outside areas.

Awesome work. As others have said, something like this has been a long time coming.

I was wondering how significant a difference we are seeing between the players (this has kind of already been brought up). You've got Boyle's lower bound at 0.493526, then Gallagher 0.000644 below him, and at the bottom of that list is Subban, who is 0.0524 below Boyle.

While the highest lower bound could imply the player with the most "reliably" high "true" number, it doesn't change the fact that Subban's "true" number could just as easily be the highest, given how much their intervals overlap. It's essentially randomness in between.

I have very little Bayesian training, but matnor may have been on to something with that? The goal may simply be how confidently you can suggest a player is above average (your prior).

You may have acknowledged this and I missed it (though I could be seriously misstating some things, as I'm tired and haven't brushed up on my econometrics in some time; I almost went on a semi-related confidence-interval rant but realized I was fudging way too much in my response, haha).

It may simply be the case that you'll never have a large enough sample to improve your accuracy. Perhaps look at players' multi-season samples? Baseball defensive stats have similar issues (check out UZR if you haven't already), and a common rule of thumb is that you need 3 seasons of data before you can infer anything.

Another thought is that you could try to identify players in "batches": which group of players can you confidently declare "elite" (say, that whole group you've posted so far), "good", "average", etc.? Perhaps I'm mislabeling your intentions here, as I doubt anyone would now be definitively saying Boyle > Gallagher > Clarkson... It just becomes a question of usefulness for evaluating an individual player when it's all essentially random.

I'm curious what kind of numbers you have for the worst in the league?

Anyway, again, great work.

Lower bounds of confidence intervals are just an easy way to sort the players that lets me give more credit to players who are trusted with more ice time (repeated performance over a larger sample size). I completely agree that you can't say Boyle is definitively > Gallagher, but can you do that with any other statistic? Besides, you could easily go back to simply using GF%, or even GF differential, which is essentially the same as Corsi.

Couple overall comments. First, the rink effects on shot locations are non-linear so that the mean adjustment is generally inadequate for rinks. Might want to look at the Total Hockey Ratings (THoR) paper of Schuckers and Curro to see another possible adjustment. Second, as others have noted there is very likely a team effect (outside of rink) in your ratings that will need to be accounted for. You also should take a look at the Expected Goals model of Brian Macdonald as reference point for another expected goals model. Third, it was very nice of you to do the year to year correlation. More years would be very useful, especially considering that 2012-13 was a shortened year.

The reason that shot quality (which is what you're after here) is not utilized more is that its effects have been fleeting. I'm not one to totally exclude the effects of shot quality (or average shot probability), but skepticism in this area is well warranted.

Some technical details: Wilson's BPCI is not the one you want here (it's not appropriate for the type of data that you have), nor is it the one that is used in political polls. A (Bayesian) alternative would be to take (EGF + k)/(EGF + EGA + 2k), where k is some number such as 5 or 10, which would 'shrink' estimates toward 50%. You might have to select k based upon a full season rather than on '12-'13.
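The suggested estimator is a one-liner. A sketch with hypothetical EGF/EGA totals, showing how adding k pseudo-goals to each side pulls small samples hard toward 50% while barely moving large ones:

```python
def shrunk_pct(egf, ega, k=5):
    """(EGF + k) / (EGF + EGA + 2k): shrink the percentage toward 50%."""
    return (egf + k) / (egf + ega + 2 * k)

# Same raw percentage (5/7 == 50/70), very different shrunken values:
small = shrunk_pct(5, 2)    # pulled strongly toward 0.5
large = shrunk_pct(50, 20)  # moved much less
```

With no data at all the estimate is exactly 0.5, so low-event players can no longer top (or bottom) the rankings on noise alone, which addresses the same problem the lower-bound sorting was meant to solve.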

Quote:

Fantastic! It would be great if this stat were portrayed in a team setting like Corsi and Fenwick are. I know the Leafs get shafted by Fenwick and Corsi because they give up a tonne of shots from low-percentage outside areas.

I think this is rather gamebreaking.

The difference in mean shot distance (for versus against) at even strength for TOR last year was, like, a foot or a foot and a half.

And some of that gap is, of course, a reflection of good fortune.

Quote:

Originally Posted by schuckers

Couple overall comments. First, the rink effects on shot locations are non-linear so that the mean adjustment is generally inadequate for rinks. Might want to look at the Total Hockey Ratings (THoR) paper of Schuckers and Curro to see another possible adjustment. Second, as others have noted there is very likely a team effect (outside of rink) in your ratings that will need to be accounted for. You also should take a look at the Expected Goals model of Brian Macdonald as reference point for another expected goals model. Third, it was very nice of you to do the year to year correlation. More years would be very useful, especially considering that 2012-13 was a shortened year.

I agree that the rink bias is non-linear, but I wasn't aware of the alternatives that you mentioned. I'll look into those.

The team effect could easily be removed by using a relative variation, similar to relative Corsi, so it's not a big deal.

Quote:

Originally Posted by schuckers

Kudos for all this work.
The reason that shot quality (which is what you're after here) is not utilized more is that its effects have been fleeting. I'm not one to total exclude the effects of shot quality (or average shot probability) but skepticism in this area is well warranted.

I actually have looked into many articles/studies that claim shot quality is more luck than skill, but almost every one I've found has a flaw of some kind, and they also fail to explain how players are consistently able to produce the same offensive numbers from the same offensive opportunities season after season if scoring is really a roll of the dice.

Quote:

Originally Posted by schuckers

Kudos for all this work.
Some technical details: Wilson's BPCI is not the one you want here (it's not appropriate for the type of data that you have) nor is it the one that is used in political polls. A (Bayesian) alternative would be to take (EGF + k)/(EGF+EGA+2k) where k is some number such as k=5 or 10 which would 'shrink' estimates to 50%. You might have to select k based upon a full season rather than on '12-'13.

I'm assuming a BPCI is not suitable because my data type is not exactly binomial? I'm still in high school and my school doesn't even have a statistics course, so I'm applying what I know from reading Wikipedia articles, and a BPCI seemed to be the best solution that I knew of. I know BPCIs aren't used in polls, but I just wanted to familiarize the reader with CIs, which is why I chose to include that part.

I'll look into the Bayesian alternative and more years of correlations when I have the chance to revise the article. Right now I've got a long list of study ideas lined up that I want to look into, so it'll probably have to wait until I run out of ideas.

Thanks for the comments, I really appreciate that a leader in the field like yourself took time to help me out.

Reading that blog, I don't see the errors as being as large as the blogger claims.

Plus, the foundation of the blog's claim is that the other measurement is perfectly accurate (which it certainly isn't). The fact that the two measurements differ isn't 100% the fault of the NHL's tally.

Beyond that, let's suppose the measurement is as inaccurate as the blogger claims; it's still pretty good. Which is the better analytical tool: one that's slightly off (but still captures the general gist that shots closer to the net are better scoring chances), or one that treats all shots on goal as equally likely to produce a goal (as Corsi does)?

I want to cry! This is beautiful work. And you're only 17!

I've often said a "perfect" player contribution stat would be Corsi, adjusted for shot quality. You've essentially made the single biggest building block towards this goal - haven't you?

Quote:

Originally Posted by Wesleyy

1) I agree it's not perfect and some points do end up on the other side of the goal/blue line, and obviously shots at a given arena do not all vary by the same distance, but I considered all the other options and decided that this would be closest to the actual locations. I've also attempted to ease this error by regressing the points. Since we can only really measure trends in recorder bias, I think the current method is good enough. A better solution could be using visual anchors like the faceoff circles/dots, goal line, and blue line, instead of positive/negative x/y points, to correct the recording bias, based on the assumption that the recorders plot shot locations using those visual anchors. But I think even then I would have to regress the points to a certain extent, and the difference between that method and my current method would be marginal.

One possible way would be to make a proportional adjustment for each arena. For instance, instead of estimating that, let's say, Florida reports shots as x feet farther from the net than the league average, you could estimate the percentage by which the arena's recorded distances deviate from the league average. That way, a shot taken closer to the net would get a smaller adjustment.
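A quick sketch of the difference between the two corrections. All numbers are hypothetical: an arena whose recorded distances run 10% long against a league mean of 33 ft.

```python
def adjust_additive(dist, arena_mean, league_mean):
    """Subtract the arena's constant offset from every recorded distance."""
    return dist - (arena_mean - league_mean)

def adjust_proportional(dist, arena_mean, league_mean):
    """Scale by league/arena mean, so closer shots get smaller corrections."""
    return dist * (league_mean / arena_mean)

ARENA, LEAGUE = 36.3, 33.0  # hypothetical means: this arena records ~10% long

# Proportional: a 40 ft shot is corrected by ~3.6 ft, a 10 ft shot by only
# ~0.9 ft, whereas the additive version moves both by a flat 3.3 ft.
```

The proportional version captures the intuition in the suggestion above: a recorder misplacing a point-blank shot by the same absolute distance as a point shot is implausible, so the correction should scale with distance.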

Quote:

Originally Posted by Wesleyy

2) The 5-foot radius is partly arbitrary. I decided upon 5 ft for two reasons. One, because it is the approximate distance from a player's stick blade to his skates, so a recorder could technically have a 5-foot margin of error on either side depending on the player's handedness. Two, because it was the largest distance bias an arena had (NYI with -4.3 ft, and 3.3 ft on the positive end). The 75% exponential weighting was definitely arbitrary, though. I'm not familiar with non-parametric regression; my understanding is that it selects weights based on the number of data points available? Since I am not familiar with it, I can't say for sure, but since, like I mentioned before, we can really only measure trends in recorder bias, an improved regression method will most likely have only a minute effect on the data while adding a whole lot of complexity to the stat.

As I said, it's a very minor point and not really worth taking the time to do. And yes, you are right that as the sample size increases, the bandwidth will shrink and there will be less extrapolation, as there is less uncertainty in the data.
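For anyone curious what the non-parametric alternative looks like, here is a generic kernel-smoothing sketch (not the article's actual method): a Gaussian kernel replaces the fixed 5 ft radius and 75% exponential weighting, with the bandwidth playing the role of the radius.

```python
import math

def smoothed_goal_prob(x, y, shots, bandwidth=5.0):
    """Kernel-weighted goal probability estimate at location (x, y).

    shots: iterable of (sx, sy, goal) with goal in {0, 1}. Every shot
    contributes, with weight decaying smoothly with distance; shrinking
    the bandwidth as data accumulates reduces extrapolation.
    """
    num = den = 0.0
    for sx, sy, goal in shots:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        w = math.exp(-d2 / (2 * bandwidth ** 2))
        num += w * goal
        den += w
    return num / den if den else 0.0

# Toy data: goals clustered near the net, misses from far out.
shots = [(0, 0, 1), (1, 0, 1), (30, 0, 0), (31, 0, 0)]
```

With this toy data, the estimated probability is near 1 close to the cluster of goals and near 0 out at the missed shots, without any hard radius cutoff.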

Quote:

Originally Posted by Wesleyy

3) I agree with posting the interval; I think including just the lower bound confused some people. I probably will update the tables to include the probabilities and their intervals when I have the time. As for the Bayesian interval, I think it's parallel to confidence intervals, and using one over the other would be essentially a lateral move. As for setting the prior to 0, I have no idea what you mean by that, as, in my understanding, credible intervals rely completely on the prior to make an accurate prediction, so setting them all to zero would make them useless? Maybe I'm misunderstanding something from your post.

If I understood you correctly, you chose the lower end of the confidence interval to avoid the problem of some players getting an excellent stat based on a small sample size. My suggestion was therefore to instead set a conservative prior that will let the data converge to the "true effect" as the sample size grows. But, as I said, I'm not that familiar with Bayesian methods so I might be completely off base.

Quote:

Originally Posted by Wesleyy

4) What would a Corsi vs. LAEGAP plot prove? How is that a measurement of the importance of shot location? It is 100% certain that a shot from a historically high-percentage location has a higher chance of going in than a shot from a historically low-percentage location. Again, maybe I'm misunderstanding something.

You motivated the use of LAEGAP by saying that it is an upgrade over Corsi. Therefore it would be interesting to see how much the two stats differ from each other. If they differ a lot, it would suggest that taking shot location into account really is important.