HFBoards (http://hfboards.hockeysfuture.com/index.php)
-   By The Numbers (http://hfboards.hockeysfuture.com/forumdisplay.php?f=241)
-   -   Modified Save Percentage - Goals Against and Scoring Chances (http://hfboards.hockeysfuture.com/showthread.php?t=1373319)

 hairylikebear 03-11-2013 03:31 PM

Modified Save Percentage - Goals Against and Scoring Chances

I've done some rudimentary analysis on the scoring chance information that some (about half) of the teams have available through bloggers. I say rudimentary because there are a lot of blogs that only track even strength scoring chances (ESSC), many that don't track them at all, and the ones that do track them differ fairly significantly on their interpretation of a scoring chance. In any case, I believe there is enough information out there to judge the viability of my method, even if we cannot draw any hard and fast conclusions from it. After the tables, I'll explain the methods I used to try to minimize the impact of these data inconsistencies.

Well anyway, the tables!
SCFA/SCAA Scoring Chance For/Against Average
SCSP Scoring Chance Save percentage
Team   SCFA    SCAA     +/-
ANA    15.11   14.91    0.19
BOS    16.14   11.00    5.15
BUF    13.82   16.64   -2.82
CAR    17.71   17.42    0.29
CBJ    14.99   11.59    3.40
CGY    16.57   15.27    1.30
CHI    17.76   12.84    4.92
COL    13.38   20.04   -6.66
DAL    15.78   17.60   -1.83
DET    17.09   11.98    5.11
EDM    16.83   18.96   -2.13
FLA    14.00   17.50   -3.50
LAK    14.96   13.37    1.59
MIN    15.00   10.00    5.00
MTL    14.50   14.00    0.50
NJD    13.60   13.30    0.30
NSH    12.21   11.26    0.95
NYI    16.78   15.11    1.67
NYR    17.50   13.00    4.50
OTT    15.00   15.67   -0.67
PHI    15.00   15.29   -0.29
PHX    14.20   16.82   -2.62
PIT    17.63   13.75    3.88
SJS    14.58   13.69    0.89
STL    13.24   13.77   -0.53
TBL    16.17   13.83    2.33
TOR    13.50   13.46    0.04
VAN    12.48   15.39   -2.91
WPG    10.67   14.00   -3.33
WSH    14.40   13.00    1.40

Player      GAA    SCSP
Anderson    1.49   0.9049
Lehtonen    2.23   0.8733
Fasth       1.92   0.8713
Varlamov    2.76   0.8623
Niemi       1.92   0.8598
Luongo      2.19   0.8577
Crawford    1.91   0.8512
Dubnyk      2.88   0.8481
Ward        2.84   0.8369
Price       2.37   0.8307
Miller      2.83   0.8299
Brodeur     2.27   0.8293
Schneider   2.63   0.8291
Lundqvist   2.24   0.8277
Halak       2.38   0.8272
Smith       2.97   0.8234
Emery       2.28   0.8224
Bryzgalov   2.77   0.8188
Scrivens    2.46   0.8173
Rinne       2.09   0.8143
Theodore    3.29   0.8120
Quick       2.55   0.8092
Bobrovsky   2.27   0.8041
Hiller      2.93   0.8035
Fleury      2.71   0.8029
Riemer      2.66   0.8024
Nabokov     3.02   0.8001
Pavelec     2.82   0.7986
Howard      2.46   0.7947
Garon       2.85   0.7940
Vokoun      3.07   0.7767
Hedberg     2.97   0.7767
Kipper      3.43   0.7753
Lindback    3.11   0.7752
Holtby      2.96   0.7723
Backstrom   2.32   0.7680
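
The SCSP column appears to follow from the two tables: one minus a goalie's GAA divided by his team's SCAA. This reading reproduces the listed values to within rounding, but it is an inference from the numbers, not a formula stated in the post. A minimal Python sketch under that assumption (the function name is made up for illustration; the inputs are copied from the tables):

```python
def scoring_chance_save_pct(gaa: float, team_scaa: float) -> float:
    """Estimate scoring chance save percentage from per-game averages.

    Assumes every goal against came from a scoring chance and that the
    goalie faced his team's average number of chances per game.
    """
    if team_scaa <= 0:
        raise ValueError("team_scaa must be positive")
    return 1.0 - gaa / team_scaa


# Anderson (OTT): GAA 1.49, team SCAA 15.67
anderson = scoring_chance_save_pct(1.49, 15.67)
# Lehtonen (DAL): GAA 2.23, team SCAA 17.60
lehtonen = scoring_chance_save_pct(2.23, 17.60)
print(f"Anderson {anderson:.4f}, Lehtonen {lehtonen:.4f}")
```

Because the denominator is a team-wide average, two goalies on the same team are measured against the same number of chances per game regardless of what each actually faced.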

 hairylikebear 03-11-2013 03:32 PM

First and foremost, here is the spreadsheet.

Below are the problems I noticed with data and how I approached minimizing them.

1. The team blogger only shows even strength scoring chance data (LAK, SJS)

Many teams had both even strength scoring chance data and overall data, so I calculated power play minutes per game (PPM/G) for every team in the league, then averaged the ratio between PPM/G and the difference between ESSC and SC, weighted by the number of data points I had for each team. Basically, the more recorded games I had for a team, the more its value influenced the overall average. The idea is that this gives me a value for roughly how many scoring chances to expect per, for example, 2 minutes of power play time. Then it became a simple process of adding the expected PP and PK scoring chances (the PK process was exactly the same, using PKM/G) to SCF and SCA respectively. This correction value is highlighted in light orange in the spreadsheet. Any chances given up while on the power play or generated while shorthanded are unfortunately ignored entirely, because they can vary so wildly from team to team and I did not want to apply any kind of average to every team.
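
The correction described above can be sketched roughly as follows. This is one possible reading of it, not the spreadsheet's actual formulas: the rate is taken as extra chances per special-teams minute, weighted by games tracked, and every name and number here is hypothetical.

```python
def weighted_chances_per_minute(samples):
    """Weighted average of special-teams chances per special-teams minute.

    Each sample is (games_tracked, all_situation_chances_per_game,
    even_strength_chances_per_game, special_teams_minutes_per_game).
    Teams with more tracked games carry more weight.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for games, sc, essc, minutes in samples:
        rate = (sc - essc) / minutes  # chances per PP (or PK) minute
        weighted_sum += games * rate
        total_weight += games
    if total_weight == 0:
        raise ValueError("need at least one tracked game")
    return weighted_sum / total_weight


def corrected_chances(essc_per_game, minutes_per_game, rate):
    """Add the expected special-teams chances to an even-strength-only average."""
    return essc_per_game + rate * minutes_per_game


# Hypothetical teams that publish both ES and all-situation chances for.
pp_samples = [
    (20, 16.0, 13.0, 6.0),  # 20 games, 3.0 extra chances over 6.0 PP minutes
    (10, 15.0, 13.0, 5.0),  # 10 games, 2.0 extra chances over 5.0 PP minutes
]
pp_rate = weighted_chances_per_minute(pp_samples)
# A hypothetical ES-only team: 12.5 ESSC for and 5.0 PP minutes per game.
estimated_scf = corrected_chances(12.5, 5.0, pp_rate)
print(round(pp_rate, 3), round(estimated_scf, 2))
```

The same two functions would be reused for chances against by feeding in PK minutes and chances-against averages.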

2. I could not find scoring chance data for a particular game

I would record chances for any game against a team for whom I could find data. If two teams that are missing data play, well I just pretend those games don't exist for statistical purposes and place a huge mental asterisk on any averages for those teams.

3. Two teams have conflicting data for the same game

This actually wasn't a huge problem. Most of the time, the difference was only one or two chances, and often the numbers would match up entirely. However, there are some blogs that are conservative with giving out scoring chances (Toronto, San Jose, Los Angeles, for example) and some that are liberal (Carolina). I also noticed a few cases where the blogger in question was conservative with scoring chances against, but very liberal with scoring chances for (the Flames guy was atrocious about this). In any case where I get conflicting data, I just take whichever value is higher and use it. The logic is that while there may be some debate on what constitutes a scoring chance, there's a pretty strong consensus on the stuff that is not a scoring chance. Since I'm mainly focused on eliminating those perimeter shots that are stopped 99% of the time, I choose to include any chance that's arguable because odds are it's probably still a legitimate scoring chance.
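
A small sketch of the rules in points 2 and 3, assuming each game has up to two independent counts of the same team's chances (the function name and the sample counts are hypothetical):

```python
from typing import Optional


def merge_chance_counts(blog_a: Optional[int], blog_b: Optional[int]) -> Optional[int]:
    """Combine two trackers' counts of the same team's chances in one game.

    Returns None when neither blog tracked the game (the game is skipped),
    the single available count when only one did, and the higher count
    when the two disagree.
    """
    counts = [c for c in (blog_a, blog_b) if c is not None]
    return max(counts) if counts else None


# Hypothetical counts of one team's chances in three games.
games = [(14, 16), (None, 12), (None, None)]
merged = [merge_chance_counts(a, b) for a, b in games]
usable = [m for m in merged if m is not None]
average = sum(usable) / len(usable)
print(merged, average)
```

Taking the higher count biases the totals upward whenever a liberal tracker is involved, which is the trade-off the post accepts in exchange for not dropping arguable chances.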

 Trebek 03-11-2013 06:59 PM

Very interesting - thanks for putting it together.

I went down this road a few years back, and where I got tripped up was here: the term "shot on goal" is well-defined in the National Hockey League, and yet we see examples where one scorer consistently over (or under)counts.

The term "scoring chance" is not well-defined, so I'd expect the variance from rink to rink to be much larger. How do we best account for this?

 hairylikebear 03-11-2013 07:24 PM

Quote:
 Originally Posted by Taco MacArthur (Post 61461717) The term "scoring chance" is not well-defined, so I'd expect the variance from rink to rink to be much larger. How do we best account for this?
Almost everyone that tracks this stuff has a pretty consistent definition, but not everyone uses the same one. They all seem to have the same rules with regard to the "home plate" though some use a straight line between the dots, some use a rounded line, some are very lenient about borderline cases and some are very strict. Some include screened point shots on goal, some include shots from outside home plate after X amount of puck movement (where X is completely subjective), some automatically include any shot on goal generated from an odd man rush, etc. All of this is simply a function of the lack of any central authority enforcing the standards for what a scoring chance should be. If the problem is that it is not well defined, we just need to define it.

The variance from rink to rink can be corrected by adding more people to record the data, which is obviously way easier said than done, but even having one person from each team working on it improves the data tremendously. With enough data, we can begin to see outliers that highlight discrepancies in recording styles. If one team's tracker seems to always record x% fewer scoring chances than the trackers of the teams they are playing, we can correct their numbers, though individual game data will still be somewhat dubious. However, it should be emphasized that averaging the numbers recorded for the same game will not improve accuracy.
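
One way such a correction might look, assuming games where two trackers counted the same chances are available for comparison. The names and counts are hypothetical, and the other trackers' level is only a reference point, not ground truth:

```python
def recorder_ratio(paired_counts):
    """Ratio of one tracker's totals to the other trackers' totals.

    paired_counts holds (this_tracker, other_tracker) counts of the same
    chances in the same games. A ratio below 1 means this tracker is the
    more conservative one.
    """
    own = sum(a for a, _ in paired_counts)
    other = sum(b for _, b in paired_counts)
    if other == 0:
        raise ValueError("no chances recorded by the comparison trackers")
    return own / other


def rescale(count, ratio):
    """Scale a tracker's count up or down to the comparison trackers' level."""
    return count / ratio


# Hypothetical: a conservative tracker vs. opposing blogs over four games.
pairs = [(12, 15), (9, 12), (16, 19), (11, 14)]
ratio = recorder_ratio(pairs)
adjusted = rescale(12, ratio)
print(round(ratio, 3), round(adjusted, 1))
```

The ratio is computed over totals instead of game by game, so it only corrects season-level averages; a single game's adjusted count remains as dubious as the post says.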

 SephF 03-11-2013 09:39 PM

Nice work, thanks for putting this together - very interesting.

 Pi 03-11-2013 09:47 PM

Wow, this data does have some serious evidence IMO.

Anderson has truly been unbelievable this year... stopping 90% of legit scoring chances is amazing.

I'd love to know how last year's data compares to this year's.

 Fish on The Sand 03-17-2013 02:55 PM

The single biggest problem with this is that not all scoring chances are shots on goal so it is difficult to assign a save % on chances where the goaltender did not have to make a save.

 hairylikebear 03-18-2013 03:08 AM

Quote:
 Originally Posted by Fish on The Sand (Post 61844841) The single biggest problem with this is that not all scoring chances are shots on goal so it is difficult to assign a save % on chances where the goaltender did not have to make a save.
Not necessarily. A goalie simply being in position to force a shooter to shoot wide is noteworthy. The only times a scoring chance does not also register a shot on goal are situations within the scoring chance criteria in which the shooter (not named Patrik Stefan) would score every time if the goalie was not there, such as wide open shots from the slot.

 Trebek 03-18-2013 10:16 AM

Quote:
 Originally Posted by hairylikebear (Post 61875429) Not necessarily. A goalie simply being in position to force a shooter to shoot wide is noteworthy. The only times a scoring chance does not also register a shot on goal are situations within the scoring chance criteria in which the shooter (not named Patrik Stefan) would score every time if the goalie was not there, such as wide open shots from the slot.
While I agree with your point, his point (also valid) is that in the situation described, there isn't a save to be made, and so how do you calculate a "save percentage" for that situation?

The answer is probably 1 - (goals allowed) / (scoring chances), without regard for whether or not a save was actually made.

 Fish on The Sand 03-18-2013 10:44 AM

Quote:
 Originally Posted by hairylikebear (Post 61875429) Not necessarily. A goalie simply being in position to force a shooter to shoot wide is noteworthy. The only times a scoring chance does not also register a shot on goal are situations within the scoring chance criteria in which the shooter (not named Patrik Stefan) would score every time if the goalie was not there, such as wide open shots from the slot.
Or if a dman makes a save on behalf of the goalie (no shot), the shooter hits the post (no shot) or any other number of instances where there is no shot recorded.

 Curtinho 03-18-2013 11:56 AM

Would like to see what the numbers look like for Lehner and Bishop, too.

Anderson has been out of this world.

 hairylikebear 03-18-2013 05:08 PM

Quote:
 Originally Posted by Taco MacArthur (Post 61881325) While I agree with your point, his point (also valid) is that in the situation described, there isn't a save to be made, and so how do you calculate a "save percentage" for that situation? The answer is probably 1 - (goals allowed) / (scoring chances), without regard for whether or not a save was actually made.
Scoring chance (SC) sv% is just GA/SC. Whether it's a shot or not is irrelevant.

Quote:
 Originally Posted by Fish on The Sand (Post 61882505) Or if a dman makes a save on behalf of the goalie (no shot), the shooter hits the post (no shot) or any other number of instances where there is no shot recorded.
Blocked shots don't count as scoring chances. Hitting the post is the same as missing the net.

 Trebek 03-18-2013 05:23 PM

Quote:
 Originally Posted by hairylikebear (Post 61900975) Scoring chance (SC) sv% is just GA/SC. Whether it's a shot or not is irrelevant.
Correcting for the numerator error (you've got scoring chance goal percentage), that's exactly what I wrote.

 hairylikebear 03-18-2013 08:19 PM

Quote:
 Originally Posted by Taco MacArthur (Post 61901525) Correcting for the numerator error (you've got scoring chance goal percentage), that's exactly what I wrote.
My mistake, we're on the same page then :D

 BoHorvatFan 03-18-2013 08:48 PM

There is a big difference between a Crosby scoring chance and a Colton Orr scoring chance. They are not equal, so as long as there's just a defined ''scoring chance'' no matter the player, the advanced stat, like all hockey advanced stats, will be badly flawed. There are so many variables in hockey, WAY more than in baseball, and you have to consider everything or it's just misleading.

 hairylikebear 03-19-2013 03:14 AM

Quote:
 Originally Posted by NugentHopkinsfan (Post 61916079) There is a big difference between a Crosby scoring chance and a Colton Orr scoring chance. They are not equal, so as long as there's just a defined ''scoring chance'' no matter the player, the advanced stat, like all hockey advanced stats, will be badly flawed. There are so many variables in hockey, WAY more than in baseball, and you have to consider everything or it's just misleading.
Well, that's true of any of the traditional statistics as well. There are tap-in goals and there are solo deke-through-5-people, snipe-top-corner goals, but they still count the same. Then there's an assist that generates a tap-in vs. an outlet pass to a guy who does all the work, and both count the same. Even with the current sv% metric, there are Jeff Woywitka point shots and there are Crosby shots from the slot that both count the same.

Scoring chances are still susceptible to the same flaws, but the stat actually exists as a response to that very criticism of shots and saves. The beauty of scoring chances is that, if we wanted to, we could consider the caliber of the shooter in borderline cases.

 GuineaPig 03-19-2013 07:41 AM

The issue here is sample size. Save percentage itself is heavily dependent on luck over the course of a season, and it becomes even more important if you reduce the sample (both in the type of shots, and number of games). It's interesting to look at, but pretty meaningless as it stands.

 Micklebot 03-19-2013 08:58 AM

Quote:
 Originally Posted by Cujomi (Post 61885731) Would like to see what the numbers look like for Lehner and Bishop, too. Anderson has been out of this world.
By his methodology,

Bishop .8359
Lehner .8787

The problem as I see it is that the SCA is an average for the whole team, not just the starter. If one goalie gets sheltered games, if the team buckles down on chances allowed for its back-up, or if some games skew the averages for any other reason, the goalie-specific numbers won't be entirely accurate.

 hairylikebear 03-19-2013 01:16 PM

Quote:
 Originally Posted by GuineaPig (Post 61945101) The issue here is sample size. Save percentage itself is heavily dependent on luck over the course of a season, and it becomes even more important if you reduce the sample (both in the type of shots, and number of games). It's interesting to look at, but pretty meaningless as it stands.
As far as SV% is concerned, all goals against will be counted, "lucky" or otherwise. Shots from the scoring chance pentagon will be included, regardless of luck. The only situation in which there will be a difference involving luck is when a "lucky" shot from outside of the scoring chance area goes in (i.e. a softy, which counts as a GA but not a scoring chance) or when the goalie makes a "lucky" save on a shot that comes from outside the SC area. The idea behind the pentagon is that any shot from outside the pentagon is relatively harmless, and therefore no save from there can be a "lucky" one. Unless there is a deflection, but those count as scoring chances.

It does reduce the sample size a bit, but at least the dependent variable is kept the same so it minimizes that effect as much as possible.

Quote:
 Originally Posted by Micklebot (Post 61947153) By his methodology, Bishop .8359 Lehner .8787 The problem as I see it is that the SCA is an average for the whole team, not just the starter. If one goalie gets sheltered games, if the team buckles down on chances allowed for its back-up, or if some games skew the averages for any other reason, the goalie-specific numbers won't be entirely accurate.
That's not true. I did it that way because I'm missing data and also because I'm lazy. If I had scoring chance data for every game (or even a timestamp for every scoring chance), it would then be possible to determine who was on the ice for them, including the goalie. After that the math is pretty simple.

 Micklebot 03-19-2013 01:42 PM

Quote:
 Originally Posted by hairylikebear (Post 61958443) That's not true. I did it that way because I'm missing data and also because I'm lazy. If I had scoring chance data for every game (or even a timestamp for every scoring chance), it would then be possible to determine who was on the ice for them, including the goalie. After that the math is pretty simple.
Not trying to criticize, I think the data is pretty cool. But I'm not sure where I was off base. From what I could tell, you used the average scoring chances per game, and adjusted each starter's GAA to yield the SCA Sv%. If that's the case, I think what I said was accurate.

Most of the teams' scoring chance data I've seen is game by game, so you could track down the starter for each game to refine it, but it's probably more effort than it's worth. TBH, I think data for backups would be skewed simply because of the smaller sample size.
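
A rough sketch of that per-starter refinement, assuming a game log with one starter per game (the goalie labels, counts, and function name are hypothetical):

```python
from collections import defaultdict


def goalie_sc_save_pct(game_log):
    """Scoring chance save percentage per goalie from game-by-game data.

    game_log holds (starter, goals_against, scoring_chances_against) per game.
    Assumes the starter played the whole game; a mid-game goalie change
    would need timestamped chances instead.
    """
    totals = defaultdict(lambda: [0, 0])  # goalie -> [goals, chances]
    for starter, goals, chances in game_log:
        totals[starter][0] += goals
        totals[starter][1] += chances
    return {
        goalie: 1.0 - goals / chances
        for goalie, (goals, chances) in totals.items()
        if chances > 0
    }


# Hypothetical game log for one team.
log = [
    ("Starter", 2, 16),
    ("Starter", 1, 14),
    ("Backup", 3, 12),
]
results = goalie_sc_save_pct(log)
for goalie, pct in results.items():
    print(f"{goalie}: {pct:.4f}")
```

With a single game in the log, the backup's figure rests on 12 chances, which illustrates the small-sample concern raised above.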

Anyhow, I enjoyed your work and, completely off topic, found it funny that the table's auto-sort does not handle negative numbers well.

 hairylikebear 03-19-2013 01:53 PM

Quote:
 Originally Posted by Micklebot (Post 61959585) Not trying to criticize, I think the data is pretty cool. But I'm not sure where I was off base. From what I could tell, you used the average scoring chances per game, and adjusted each starter's GAA to yield the SCA Sv%. If that's the case, I think what I said was accurate. Most of the teams' scoring chance data I've seen is game by game, so you could track down the starter for each game to refine it, but it's probably more effort than it's worth. TBH, I think data for backups would be skewed simply because of the smaller sample size. Anyhow, I enjoyed your work and, completely off topic, found it funny that the table's auto-sort does not handle negative numbers well.
You're right to criticize, completely. That's exactly how I did it and I agree it is a pretty major flaw as it stands. However, it's just a demonstration of the concept.
In a perfect world we would have people from all thirty teams measuring scoring chances under very strict and thorough guidelines with timestamps. If we had that quality of data, I believe we could have the first "advanced" goalie stat on our hands.

 Micklebot 03-19-2013 02:23 PM

Quote:
 Originally Posted by hairylikebear (Post 61960257) You're right to criticize, completely. That's exactly how I did it and I agree it is a pretty major flaw as it stands. However, it's just a demonstration of the concept. In a perfect world we would have people from all thirty teams measuring scoring chances under very strict and thorough guidelines with timestamps. If we had that quality of data, I believe we could have the first "advanced" goalie stat on our hands.
Closest I can think of is shot trackers and counting only shots from "home plate", but it lacks shots that go wide, are blocked, or hit the post. Also, they tend to lack coordinates for some shots.

This site tracks shots and what player is on the ice. Not sure where he mines the data from, but it's out there somewhere:
http://somekindofninja.com/nhl/

TSN also has game by game shot/blocks/hits/penalty location trackers:
http://gametracker-nhl.tsn.ca/17132

Quote:
 Originally Posted by hairylikebear (Post 61960257) You're right to criticize, completely. That's exactly how I did it and I agree it is a pretty major flaw as it stands. However, it's just a demonstration of the concept. In a perfect world we would have people from all thirty teams measuring scoring chances under very strict and thorough guidelines with timestamps. If we had that quality of data, I believe we could have the first "advanced" goalie stat on our hands.
GameCenter could be the solution to this. Get a group of fans together to go through every game from this season and record the scoring chances and the goaltender in net for said chance.

It may improve the original stats by allowing you to a) strictly define 'scoring chance,' b) avoid missing data sets, c) devise a way to further eliminate homerism either through participant selection criteria or some other mechanism

 Beef Invictus 03-19-2013 03:07 PM

Quote:
 Originally Posted by CanadianHockey (Post 61962417) GameCenter could be the solution to this. Get a group of fans together to go through every game from this season and record the scoring chances and the goaltender in net for said chance. It may improve the original stats by allowing you to a) strictly define 'scoring chance,' b) avoid missing data sets, c) devise a way to further eliminate homerism either through participant selection criteria or some other mechanism
I saw an article which provided an outline for the "scoring chance" area. It was basically the slot; the area on both sides of the net, between the faceoff dots back to the top of the circles, including the whole top of the circles. I'll see if I can dig it up.

Edit: Here's one article: http://nhlnumbers.com/2012/6/26/shot...nd-shot-totals

Here's the other, with the outline thingy: http://www.broadstreethockey.com/201...player-results

So, that could be used along with Gamecenter to provide a more concrete definition of "scoring chance."
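
For shot data that comes with coordinates, a "home plate" test could be sketched like this. The vertex coordinates are assumptions for illustration only (feet, measured from the center of the goal line, with the dots taken as 22 ft to either side and 20 ft out, and the top of the circles at 35 ft); trackers draw the shape differently, as noted earlier in the thread.

```python
# Assumed outline: goal posts, faceoff dots, top of the circles.
HOME_PLATE = [
    (-3.0, 0.0),    # left goal post
    (-22.0, 20.0),  # left faceoff dot
    (-22.0, 35.0),  # top of left circle
    (22.0, 35.0),   # top of right circle
    (22.0, 20.0),   # right faceoff dot
    (3.0, 0.0),     # right goal post
]


def in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def is_home_plate_shot(x, y):
    """Classify a shot location (feet from the center of the goal line)."""
    return in_polygon(x, y, HOME_PLATE)


print(is_home_plate_shot(0.0, 10.0))   # slot
print(is_home_plate_shot(30.0, 25.0))  # outside the dots
print(is_home_plate_shot(-20.0, 5.0))  # sharp angle near the goal line
```

A purely geometric test like this cannot capture deflections, odd man rushes, or puck movement before the shot, so it only covers the location part of any definition.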

 Micklebot 03-19-2013 03:27 PM

Quote:
 Originally Posted by Beef Invictus (Post 61964331) I saw an article which provided an outline for the "scoring chance" area. It was basically the slot; the area on both sides of the net, between the faceoff dots back to the top of the circles, including the whole top of the circles. I'll see if I can dig it up. Edit: Here's one article: http://nhlnumbers.com/2012/6/26/shot...nd-shot-totals Here's the other, with the outline thingy: http://www.broadstreethockey.com/201...player-results So, that could be used along with Gamecenter to provide a more concrete definition of "scoring chance."
Yeah, that's the "home plate" definition, which is probably the most widely used. The one thing that becomes a point of concern is when a shot from outside the home plate scores. You get a goal without a scoring chance. Or when you have a back door play where the goalie plays the shooter but a pass is made to someone just outside of the home plate with a wide open cage.

Nothing will be perfect, but I think some degree of judgement needs to come into play when determining a scoring chance.

All times are GMT -5.