HFBoards

The tricky thing about finding the 'be all, end all' stat

01-16-2014, 06:27 PM
  #1
Chalupa Batman
Mod Supervisor
 
Join Date: Sep 2005
Posts: 21,934
vCash: 500
The tricky thing about finding the 'be all, end all' stat

http://www.sbnation.com/nhl/2014/1/1...cs-limitations

Blog post about the problems inherent in finding the "one true value" metric in hockey. Reasonably well thought out.

01-17-2014, 04:59 PM
  #2
Hardyvan123
tweet@HardyintheWack
 
Join Date: Jul 2010
Location: Vancouver
Country: Canada
Posts: 11,365
vCash: 500
Will read it, but the largest problem is that in other sports, like baseball, where advanced stats really come from (and I love Bill James a lot), one can isolate the hitter and pitcher and get something more accurate from the stats than in a game like hockey, which transitions so quickly and has so many more variables in play, and thus more chances for variance as well.

01-17-2014, 05:57 PM
  #3
hatterson
Global Moderator
 
Join Date: Apr 2010
Location: North Tonawanda, NY
Country: United States
Posts: 10,923
vCash: 874
Quote:
Originally Posted by Hardyvan123
Will read it, but the largest problem is that in other sports, like baseball, where advanced stats really come from (and I love Bill James a lot), one can isolate the hitter and pitcher and get something more accurate from the stats than in a game like hockey, which transitions so quickly and has so many more variables in play, and thus more chances for variance as well.
That's one of the significant issues raised in the article. Baseball is basically a perfectly built sport for analytics. It's built on one-on-one matchups that are largely independent. Hockey has the issue of defensemen being influenced by forwards and vice versa.

However, even if hockey were much more one-on-one, the fluid nature of the sport would still cause issues. Baseball is broken down into distinct plays. One guy is on offense and one guy/team is on defense. A play in hockey can involve rapid shifts from O to D and back again, sometimes without the puck carrier doing anything different. If I'm rushing through the neutral zone with 3 teammates I'm on offense, but if I'm doing it as they're peeling to the bench for a change, is that still offense? Or is it really just defense, by letting them replace themselves?

01-17-2014, 07:26 PM
  #4
TertiaryAssist
Registered User
 
Join Date: Oct 2011
Location: Pittsburgh
Posts: 777
vCash: 500
Quote:
Originally Posted by hatterson
That's one of the significant issues raised in the article. Baseball is basically a perfectly built sport for analytics. It's built on one-on-one matchups that are largely independent. Hockey has the issue of defensemen being influenced by forwards and vice versa.

However, even if hockey were much more one-on-one, the fluid nature of the sport would still cause issues. Baseball is broken down into distinct plays. One guy is on offense and one guy/team is on defense. A play in hockey can involve rapid shifts from O to D and back again, sometimes without the puck carrier doing anything different. If I'm rushing through the neutral zone with 3 teammates I'm on offense, but if I'm doing it as they're peeling to the bench for a change, is that still offense? Or is it really just defense, by letting them replace themselves?
Plus, virtually every event on a baseball field is recorded, while we're still missing things like zone time, true puck possession, passing data, etc. Even finding out what players were on the ice for a given event can be a chore.

Only football is as difficult, I believe - it's broken into discrete plays like baseball and, to a lesser extent, basketball, but player roles are more highly specialized, and much of the important action takes place away from the ball and isn't recorded. A sample size of 16-20 games per season, roughly 120 plays each, just makes things worse.

01-17-2014, 07:42 PM
  #5
Mathletic
Registered User
 
Join Date: Feb 2002
Location: St-Augustin, Québec
Country: Canada
Posts: 11,336
vCash: 500
There's not a whole lot of interest in building that kind of metric as far as I'm concerned. Even in baseball. Sure, WAR is fine when you want a number to tell you how much of an impact a player has. However, for a team in real life, it's close to useless. In order to optimize lineups, defensive strategies and whatnot, you have to use much more meaningful stats. It doesn't matter whether baseball is easier to isolate or not.

01-19-2014, 01:52 PM
  #6
overpass
Registered User
 
Join Date: Jun 2007
Posts: 3,509
vCash: 500
Another major problem with the "all-in-one" metrics is the desire to apply them over any time scale and have consistent results that will sum.

The nature of hockey statistics is that no statistic perfectly describes the exact contribution of a given player to the result that was measured. The best we can do is look at our population of outcomes, determine which statistics describe the contribution of the player toward winning, and assign a corresponding value to those statistics. The resulting metric will be an estimate, not an exact measurement.

Since those values are derived based on the population, they may need to be regressed when applying them to a smaller sample. But the amount of regression that is needed depends on the size of the sample. No all-in-one metric in hockey changes the amount of regression depending on the size of the sample, because its creators want the metric to be consistent in terms of game totals adding to seasonal totals adding to career totals.

Let's say an all-in-one hockey metric is created that uses the principle that observed shooting percentage is part randomness and part skill. So player shooting percentage is regressed partially towards league average in order to remove the randomness component, and the amount of regression is picked with reference to the amount of randomness in single season shooting percentage. The problem is that this will overrate the skill component and underrate the randomness component in single game shooting percentage, and it will underrate the skill component and overrate the randomness component in career shooting percentage. When a fixed regression figure is involved in a metric, the metric is only accurate for a particular sample size. In this case the metric would be a valid estimate for single season value (assuming we knew nothing else about the player but his statistics for that season), but not for single game value or career value.
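
As a rough illustration of the point above, here is a minimal Python sketch comparing a fixed regression weight (tuned for one season) with a weight that depends on the number of shots in the sample. The league-average shooting percentage (9%), the regression constant (300 shots), and the player totals are made-up numbers for illustration only, not values from any published metric.

```python
LEAGUE_SH_PCT = 0.09  # assumed league-average shooting percentage
PRIOR_SHOTS = 300     # assumed: league-average shots "added" to every sample


def skill_weight(shots, prior_shots=PRIOR_SHOTS):
    """Share of the estimate that comes from the observed sample."""
    return shots / (shots + prior_shots)


def regressed_sh_pct(goals, shots, league_pct=LEAGUE_SH_PCT,
                     prior_shots=PRIOR_SHOTS):
    """Sample-size-aware estimate: small samples are pulled harder
    toward league average than large ones."""
    return (goals + league_pct * prior_shots) / (shots + prior_shots)


def fixed_regression(goals, shots, weight, league_pct=LEAGUE_SH_PCT):
    """Estimate using one fixed weight regardless of sample size."""
    observed = goals / shots
    return weight * observed + (1 - weight) * league_pct


# Fixed weight chosen to be right for a roughly 200-shot season.
season_weight = skill_weight(200)

# Hypothetical samples: (label, goals, shots)
samples = [("game", 1, 3), ("season", 30, 200), ("career", 450, 3000)]
for label, goals, shots in samples:
    aware = regressed_sh_pct(goals, shots)
    fixed = fixed_regression(goals, shots, season_weight)
    print(f"{label:7s} sample-aware={aware:.3f} fixed={fixed:.3f}")
```

With these made-up inputs the two estimates agree for the season, but the fixed weight leaves the 1-for-3 game much further from league average than the sample-aware version does, and pulls the career number closer to league average than the sample-aware version does.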

This is an unsolved problem in WAR for baseball. There is not enough regression involved in their fielding metrics over a partial season, which means too much value is given to defensive statistics over a partial season. Similarly, there is too much regression involved in their fielding metrics over a career, which means there is not enough value given to defensive statistics over a career.

The entire problem is based on a misunderstanding of the all-in-one metrics. They are estimates, not counting stats. One possible solution is regressing single season numbers and/or single game numbers to a player mean based on the player's performance in other seasons instead of a general population mean. But I haven't seen any interest in that among the creators of all-in-one metrics - probably because it makes the calculations much more time-consuming.

02-06-2014, 10:17 AM
  #7
Chalupa Batman
Mod Supervisor
 
Join Date: Sep 2005
Posts: 21,934
vCash: 500
Good St. Louis blog about problems with universal player metrics:

http://www.stlouisgametime.com/2014/...-why-they-fail

Basically a crash course in sample size and the variation that results.

02-10-2014, 12:29 AM
  #8
Gibsons Finest
Beast
 
Join Date: Jul 2003
Location: Saskatoon/Brandon
Country: Canada
Posts: 17,383
vCash: 500
Hell, for how perfect baseball is for analytics, WAR is highly flawed. From overrating defense to seemingly random position values (that have a lot more influence than they should IMO), it's really only good if you want to take a quick glance and get a grasp of where a player's at. At the same time, in any sport, you probably won't do much better when it comes to "universal" stats. Simply put, you're probably not going to get one. Hell, even for quarterbacks in football, QB rating has some very serious flaws.

02-12-2014, 11:32 AM
  #9
wgknestrick
Registered User
 
Join Date: Aug 2012
Posts: 1,470
vCash: 500
The only tricky thing is figuring out how to separate defensive contributions (or shares) and how they relate towards winning. As this develops, we will start to have a much better grasp on a single "rating" to rule them all.

This is why it is much more difficult to draft defensemen and almost impossible to draft goaltenders compared to forwards. There are not enough data points recorded on defensive plays to work with. For every goal, they record the goal scorer and the 2 players to touch the puck last. They surely don't do anything near that for the defensive players that were scored on. If they start recording these things, the systems will develop behind it.

02-18-2014, 08:59 AM
  #10
schuckers
Registered User
 
Join Date: Feb 2013
Posts: 31
vCash: 500
Quote:
Originally Posted by wgknestrick
The only tricky thing is figuring out how to separate defensive contributions (or shares) and how they relate towards winning. As this develops, we will start to have a much better grasp on a single "rating" to rule them all.

This is why it is much more difficult to draft defensemen and almost impossible to draft goaltenders compared to forwards. There are not enough data points recorded on defensive plays to work with. For every goal, they record the goal scorer and the 2 players to touch the puck last. They surely don't do anything near that for the defensive players that were scored on. If they start recording these things, the systems will develop behind it.
These data are available already and are being used. For every shot, blocked shot and miss (as well as hits, etc.), we have who was on the ice for both teams. We can link those events with who was on the ice (for and against) to get a sense of who is 'driving' play. This is what the Total Hockey Ratings (THoR) does and what the Expected Goals model of Brian Macdonald does.

Now these data are only available for the NHL and so they don't have much impact on drafting, yet.
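
A rough Python sketch of the "link events with who was on the ice" step. This only tallies raw on-ice shot attempts for and against each player; it is not THoR or the Expected Goals model. The event fields (`for_on_ice`, `against_on_ice`) and the player labels are hypothetical.

```python
from collections import defaultdict


def on_ice_counts(events):
    """Tally shot attempts for and against each player who was on the ice.

    Each event is a dict with two hypothetical fields:
      for_on_ice     -- skaters on the ice for the team taking the attempt
      against_on_ice -- skaters on the ice for the team defending it
    """
    counts = defaultdict(lambda: {"for": 0, "against": 0})
    for event in events:
        for player in event["for_on_ice"]:
            counts[player]["for"] += 1
        for player in event["against_on_ice"]:
            counts[player]["against"] += 1
    return dict(counts)


def attempt_share(tally):
    """Fraction of on-ice attempts taken by the player's own team."""
    total = tally["for"] + tally["against"]
    return tally["for"] / total if total else None


# Toy data: three attempts with made-up player labels.
events = [
    {"for_on_ice": ["A1", "A2"], "against_on_ice": ["B1", "B2"]},
    {"for_on_ice": ["A1", "A3"], "against_on_ice": ["B1", "B3"]},
    {"for_on_ice": ["B1", "B2"], "against_on_ice": ["A1", "A2"]},
]

counts = on_ice_counts(events)
for player in sorted(counts):
    print(player, counts[player], attempt_share(counts[player]))
```

Raw tallies like this say nothing about teammates, opponents, or zone starts; adjusting for those is where the actual models do their work.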

02-18-2014, 09:43 AM
  #11
wgknestrick
Registered User
 
Join Date: Aug 2012
Posts: 1,470
vCash: 500
Quote:
Originally Posted by schuckers
These data are available already and are being used. For every shot, blocked shot and miss (as well as hits, etc.), we have who was on the ice for both teams. We can link those events with who was on the ice (for and against) to get a sense of who is 'driving' play. This is what the Total Hockey Ratings (THoR) does and what the Expected Goals model of Brian Macdonald does.

Now these data are only available for the NHL and so they don't have much impact on drafting, yet.
The same THoR that has Tyler Kennedy as an elite, top 3 forward in the NHL? Or the THoR that lists Kimmo Timonen as the best defenseman in the league?

02-18-2014, 10:09 AM
  #12
schuckers
Registered User
 
Join Date: Feb 2013
Posts: 31
vCash: 500
Based upon those data and the methodology used, yes.

Isn't it funny that Kennedy has the 2nd highest rate of shots/TOI of anyone in the NHL over that period? Tough for the other team to score when you're putting up so many shots.

Wouldn't be much of a method if we didn't learn something.

02-18-2014, 10:21 AM
  #13
Delicious Dangles
Registered User
 
Join Date: Oct 2013
Posts: 1,022
vCash: 500
Quote:
Originally Posted by schuckers
These data are available already and are being used. For every shot, blocked shot and miss (as well as hits, etc.), we have who was on the ice for both teams. We can link those events with who was on the ice (for and against) to get a sense of who is 'driving' play. This is what the Total Hockey Ratings (THoR) does and what the Expected Goals model of Brian Macdonald does.

Now these data are only available for the NHL and so they don't have much impact on drafting, yet.
Those real-time stats and inputs being used are recorded with no consistency or standards of regulation, and thus any ratings or models that draw from them are essentially completely worthless.

They are far from the only variables to consider either (in fact, some "stats" that people think suggest defensive ability have no correlation with it at all, and are sometimes actually more representative of the opposite), so the model would be essentially completely worthless anyway.

02-18-2014, 10:40 AM
  #14
wgknestrick
Registered User
 
Join Date: Aug 2012
Posts: 1,470
vCash: 500
Quote:
Originally Posted by schuckers
Based upon those data and the methodology used, yes.

Isn't it funny that Kennedy has the 2nd highest rate of shots/TOI of anyone in the NHL over that period? Tough for the other team to score when you're putting up so many shots.

Wouldn't be much of a method if we didn't learn something.
I wonder what his SH% is? Probably high

I have watched more of TK than most people, and his "shot production" needs to be taken with a huge grain of salt. He constantly throws the puck on net with no intent of scoring (usually without a man even being there for deflections or rebounds). He is hoping for rebounds, misplays, etc. We can argue that there is no such thing as a bad puck on net, but he also interrupts many more dangerous chances by blindly throwing pucks on net from bad angles and huge distances. Many times the goaltender just freezes it or play goes the other way.

Why doesn't TK have one of the highest 5v5 GF/60 in the league?
Why doesn't TK have one of the highest 5v5 GF% in the league?

All shots and shot attempts are not equal, but THoR's system assumes they are. This is why it does not correctly identify TK for who he is. We are missing data to weed the TKs out of THoR's ranking. It is even worse for defensive data. Just because THoR uses all the "available data" does not mean it comes out with the proper results. The NHL needs to record more. THoR is too easy to pick apart to use it as a reference for a single stat. It still needs significant development.

Comparison of Crosby vs TK shooting locations (you may have to type in the players, as the link doesn't work all the time):
http://www.sportingcharts.com/nhl/ic...&r2strength=#1

02-18-2014, 10:54 AM
  #15
schuckers
Registered User
 
Join Date: Feb 2013
Posts: 31
vCash: 500
Quote:
Originally Posted by wgknestrick
I wonder what his SH% is? Probably high

I have watched more of TK than most people, and his "shot production" needs to be taken with a huge grain of salt. He constantly throws the puck on net with no intent of scoring (usually without a man even being there for deflections or rebounds). He is hoping for rebounds, misplays, etc. We can argue that there is no such thing as a bad puck on net, but he also interrupts many more dangerous chances by blindly throwing pucks on net from bad angles and huge distances. Many times the goaltender just freezes it or play goes the other way.

Why doesn't TK have one of the highest 5v5 GF/60 in the league?
Why doesn't TK have one of the highest 5v5 GF% in the league?

All shots and shot attempts are not equal, but THoR's system assumes they are. This is why it does not correctly identify TK for who he is. We are missing data to weed the TKs out of THoR's ranking. It is even worse for defensive data. Just because THoR uses all the "available data" does not mean it comes out with the proper results. The NHL needs to record more. THoR is too easy to pick apart to use it as a reference for a single stat. It still needs significant development.

Comparison of Crosby vs TK shooting locations (you may have to type in the players, as the link doesn't work all the time):
http://www.sportingcharts.com/nhl/ic...&r2strength=#1
THoR doesn't assume that all shot attempts are equal (a la Corsi or Fenwick); it gives different values to different shot locations and different shot types. His scoring rate is low, no doubt, and that is not currently in THoR because the work done on SH% says that it is too noisy in the short term. Nobody is using THoR blindly.

Here's my question for you: what are the proper results? And how do you know?

Again, I will readily concede THoR is not perfect, but the perfect should not be the enemy of the good.
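
To make the distinction between counting every attempt equally (Corsi/Fenwick style) and valuing attempts by location and type concrete, here is a minimal Python sketch. The zone names, shot types, and weights are invented for illustration and are not THoR's actual values.

```python
# Hypothetical value per attempt by (zone, shot type). NOT THoR's numbers.
SHOT_VALUE = {
    ("slot", "wrist"): 0.12,
    ("slot", "slap"): 0.10,
    ("point", "wrist"): 0.02,
    ("point", "slap"): 0.03,
    ("bad_angle", "wrist"): 0.01,
}
DEFAULT_VALUE = 0.02  # assumed fallback for combinations not listed above


def raw_attempts(shots):
    """Corsi/Fenwick-style count: every attempt is worth the same."""
    return len(shots)


def weighted_attempts(shots):
    """Sum of per-attempt values that depend on location and shot type."""
    return sum(
        SHOT_VALUE.get((shot["zone"], shot["type"]), DEFAULT_VALUE)
        for shot in shots
    )


# Toy data: a high-volume bad-angle shooter vs. a low-volume slot shooter.
volume_shooter = [{"zone": "bad_angle", "type": "wrist"}] * 10
slot_shooter = [{"zone": "slot", "type": "wrist"}] * 3

for name, shots in [("volume", volume_shooter), ("slot", slot_shooter)]:
    print(name, raw_attempts(shots), round(weighted_attempts(shots), 2))
```

With these made-up weights the volume shooter leads on raw attempts but trails on weighted value, which is the kind of gap a location- and type-aware rating is meant to capture.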


All times are GMT -5. The time now is 08:53 AM.
