HFBoards

What to Look for when Finding Comparable Players?

01-23-2016, 03:21 PM
  #1
eyjee
the basement
Join Date: Jun 2008
Location: Alphabet Soup
Country: United States
Posts: 27,231

Pretty simple ask here, but as with all things I put my hands on, I'm going to make it more complicated than it probably is.

I'm trying to find the best way to compare players to one another by looking at players with very similar games. Obviously, I'm trying to do this in a data-based way rather than a subjective one. For instance, I know I could probably compare JT Miller and Brandon Dubinsky, but does the data back that up? Who is a surprise comparison to JT Miller? Things like that...

The metrics I'm looking at are a bit more "tendency" based than statistically based. For instance, I'd rather look at how often a player shoots the puck than how many goals he scores. Is that ridiculous? Am I being too "nitpicky"?

Right now, here are the metrics I'm looking at:
  • Age
  • Primary Points per 60
  • Individual Shot Attempts per 60
  • Percent of shot attempts that are unblocked
  • Relative scoring chances for

And then a few things that are out of the player's control that may provide an underlying meaning to their metrics:
  • tCF60
  • Relative Zone Starts

Curious what you guys think of this approach, and if you have any other recommended metrics that may be worthwhile?

Also, a bit OT, but is anyone a member of any good Slack channels that are dedicated to talking Hockey Analytics? I'd love to get involved in something like that, even if it's just being a fly on the wall. I think I account for 25% of the posts in the HFNYR Advanced Stats thread, and I think everyone there is tired of my crazy/preposterous projects that I take on (including this one)

01-23-2016, 09:40 PM
  #2
Canadiens1958
Registered User
Join Date: Nov 2007
Posts: 13,870
Considerations

Perhaps consider the following for skaters.

Position - LW/C/RW/LD/RD. Handedness - LHS/RHS.
Provenance. All of the game metrics above have elements of provenance. Not an exhaustive list but one that serves as an example of the approach.

Relative Zone Starts. In either the offensive or defensive zone, which centers get the left circle draws? Which ones get the right circle draws? Is there a difference in performance by a specific center in the left circle or right circle in each zone?

Shooting related. Individual skater shooting skills are evaluated using a grid. A LW will take shots mainly from the left wing in the offensive zone, a RW from the right wing. But where exactly on each wing are they shooting from? How successful are they from the various areas? Are the shots coming off a rush, a set play following a faceoff, skill-related shots, redirects, deflections, wrap-arounds, etc.?

And so forth...

01-24-2016, 10:49 PM
  #3
Yurog
Registered User
Join Date: Jan 2012
Location: Magnitogorsk
Country: Russian Federation
Posts: 135
http://war-on-ice.com/similarity-scores.html

01-25-2016, 09:07 AM
  #4
eyjee
the basement
Join Date: Jun 2008
Location: Alphabet Soup
Country: United States
Posts: 27,231
Quote:
Originally Posted by Yurog View Post
Oh very interesting. Thanks!

02-07-2016, 01:56 PM
  #5
eyjee
the basement
Join Date: Jun 2008
Location: Alphabet Soup
Country: United States
Posts: 27,231
Bumping this back up... what do you guys think about these stats and the weight I am applying to them? I'm not in love with it, considering some of the results I'm getting.

(i.e., right now, Derick Brassard's top comparable for last season is Paul Gaustad's 2010-2011 season. Both of them having 10 5v5 goals is maybe playing too big of a part, but that's important, no?)

iCF60 (weight: 2.5)
PST (Percent of shot attempts for that belong to the player - allows me to figure out the shooting tendencies | weight: 2.5)
PrimaryPoints per 60 (weight: 3.5)
Unblocked Shot Attempt Success (Individual Fenwick divided by Individual Corsi | weight: 2)
SCF.Relative (weight: 1.5)
TOI/Gm (weight: 2)
tCF60 (weight: 1)
cCF60 (weight: 1)
Goals (weight: 5)
Primary assists (weight: 4)

I'm not sure if there is a statistically relevant way to gauge what the correct weights should be. And I'm afraid the more I manipulate the data, the more it just becomes me making it what I want it to be.

Maybe Brassard last season and Gaustad in 2010-2011 are comparable.
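
A minimal sketch of how a weighted comparison like the one above could be computed, assuming each stat is first standardized (z-scored) across the player pool and then combined with a weighted Euclidean distance. The scaling step and the distance formula are assumptions, since the post doesn't say how the stats are combined. Only a subset of the listed stats is used, and the player rows are made-up placeholder numbers, not real data.

```python
import math
from statistics import mean, pstdev

# Weights copied from the post above (subset); everything else is illustrative.
WEIGHTS = {"iCF60": 2.5, "PrimaryPoints60": 3.5, "TOI_Gm": 2.0, "Goals": 5.0, "A1": 4.0}

# Hypothetical player-seasons (made-up numbers, NOT real stats).
PLAYERS = {
    "Player A": {"iCF60": 12.0, "PrimaryPoints60": 1.6, "TOI_Gm": 14.0, "Goals": 10, "A1": 15},
    "Player B": {"iCF60": 12.5, "PrimaryPoints60": 1.5, "TOI_Gm": 14.5, "Goals": 11, "A1": 14},
    "Player C": {"iCF60": 8.0, "PrimaryPoints60": 0.6, "TOI_Gm": 10.0, "Goals": 10, "A1": 3},
    "Player D": {"iCF60": 16.0, "PrimaryPoints60": 2.4, "TOI_Gm": 17.0, "Goals": 22, "A1": 20},
}

def zscores(players, stats):
    """Standardize each stat across the pool so different scales are comparable."""
    out = {name: {} for name in players}
    for s in stats:
        vals = [p[s] for p in players.values()]
        mu, sd = mean(vals), pstdev(vals)
        for name, p in players.items():
            out[name][s] = (p[s] - mu) / sd if sd else 0.0
    return out

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two standardized stat lines."""
    return math.sqrt(sum(w * (a[s] - b[s]) ** 2 for s, w in weights.items()))

def comparables(target, players, weights):
    """Return the other players ordered from most to least similar to target."""
    z = zscores(players, weights)
    dists = [(weighted_distance(z[target], z[n], weights), n)
             for n in players if n != target]
    return [name for _, name in sorted(dists)]

print(comparables("Player A", PLAYERS, WEIGHTS))
```

Without some scaling step, a raw count such as Goals sits on a very different scale from a rate such as Primary Points per 60, so the weights alone would not control how much each stat matters.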

06-17-2016, 12:50 PM
  #6
36kap36
Registered User
Join Date: Jan 2011
Location: Ohio
Country: United States
Posts: 868
Are these just arbitrary weights? I get the point of them all, but what do they actually mean on their own? And better yet, how can you convince someone that, for example, TOI/Gm is twice as important as tCF60 on its own? It seems obvious, yeah, but it's still important to reason the weights.

06-17-2016, 12:59 PM
  #7
eyjee
the basement
Join Date: Jun 2008
Location: Alphabet Soup
Country: United States
Posts: 27,231
Quote:
Originally Posted by 36kap36 View Post
Are these just arbitrary weights? I get the point of them all, but what do they actually mean on their own? And better yet, how can you convince someone that, for example, TOI/Gm is twice as important as tCF60 on its own? It seems obvious, yeah, but it's still important to reason the weights.
Completely arbitrary.

06-18-2016, 03:49 AM
  #8
uTurris
We the True North
Join Date: Apr 2014
Country: Canada
Posts: 3,113
How they move their body when playing hockey.

06-23-2016, 11:21 AM
  #9
36kap36
Registered User
Join Date: Jan 2011
Location: Ohio
Country: United States
Posts: 868
Quote:
Originally Posted by silverfish View Post
Completely arbitrary.
I'd suggest some sort of method to get non-arbitrary weights. Maybe run a regression on a list of players you already think are similar, based on your current stats plus more that you might not think are necessary? That is still somewhat arbitrary to start, yes, and you need to have a large enough set to run the regression, but then I believe you'd be on the right track.

I'd be happy to help out on compiling the data set, if you'd want help. If not, just my two cents!
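
One possible reading of the regression idea, sketched under loud assumptions: take pairs of players already judged similar (1) or not (0), describe each pair by the absolute difference in each stat, and fit a logistic regression. Stats whose differences strongly predict "not similar" get large negative coefficients, which can be flipped into weights. The pairs below are synthetic, and `fit_logistic` is a hand-rolled helper for illustration, not a library function.

```python
import math

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Plain gradient-descent logistic regression; returns (bias, coefficients)."""
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    Xc = [[row[j] - means[j] for j in range(k)] for row in X]  # center features
    b, w = 0.0, [0.0] * k
    for _ in range(steps):
        gb, gw = 0.0, [0.0] * k
        for row, label in zip(Xc, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of "similar"
            err = p - label
            gb += err
            for j in range(k):
                gw[j] += err * row[j]
        b -= lr * gb / n
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
    return b, w

# Synthetic pairs: (difference in stat A, difference in stat B).
# By construction, only stat A decides whether a pair was judged similar.
PAIRS = [(i / 10, j / 2) for i in range(20) for j in range(4)]
LABELS = [1 if d_a < 0.5 else 0 for d_a, _ in PAIRS]

bias, coefs = fit_logistic(PAIRS, LABELS)

# A bigger difference should lower the odds of "similar", so the coefficient is
# negative; flip the sign and clip at zero to get a non-negative weight.
weights = [max(0.0, -c) for c in coefs]
print(weights)
```

The weights are only as good as the labeled pairs, so the starting list of "similar" players still carries the subjectivity mentioned above.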

06-23-2016, 11:25 AM
  #10
eyjee
the basement
Join Date: Jun 2008
Location: Alphabet Soup
Country: United States
Posts: 27,231
Quote:
Originally Posted by 36kap36 View Post
I'd suggest some sort of method to get non-arbitrary weights. Maybe run a regression on a list of players you already think are similar, based on your current stats plus more that you might not think are necessary? That is still somewhat arbitrary to start, yes, and you need to have a large enough set to run the regression, but then I believe you'd be on the right track.

I'd be happy to help out on compiling the data set, if you'd want help. If not, just my two cents!
I've got the data in place, AFAIK. I'd be more interested in how to run the regressions to get the weights.

Cordially,
Someone who wishes they went to college after the increase of statistical influence in hockey became mainstream

07-07-2016, 08:39 PM
  #11
eyjee
the basement
Join Date: Jun 2008
Location: Alphabet Soup
Country: United States
Posts: 27,231
I've made some changes for fear of this being an overfit model. Now using:

iCF60 (2.5)
Percent of Shots Taken (2.5)
Primary Points per 60 (3.5)
TOI per game (2)
tCF60 (1)
Goals (5.5)
A1 (4)

I wonder if it would be smarter to use goals per 60 and primary assists per 60.
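
For reference, the conversion being considered here is simple arithmetic: a per-60 rate is the raw count divided by ice time in minutes, times 60. A small sketch with made-up numbers and a hypothetical helper name, `per_60`:

```python
def per_60(count, toi_minutes):
    """Convert a raw count (goals, primary assists) into a per-60-minute rate."""
    if toi_minutes <= 0:
        raise ValueError("toi_minutes must be positive")
    return count * 60.0 / toi_minutes

# Two hypothetical players with the same 10 goals but different ice time.
print(per_60(10, 1000))  # lower rate: more minutes for the same goals
print(per_60(10, 600))   # higher rate: fewer minutes for the same goals
```

Rates separate players who reach the same raw total in very different ice time, which may matter when identical goal counts dominate a comparison.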

07-13-2016, 08:55 PM
  #12
ya dad
ur a joke
Join Date: Feb 2015
Location: Ridley College
Country: Finland
Posts: 876
that model looks good however

07-15-2016, 12:03 AM
  #13
eperry
Registered User
Join Date: Jun 2016
Posts: 45
Given a reasonably reliable categorized subset of players, you could build a simple classification model (K-Nearest Neighbours, multinomial logistic) to define weights. I've tried this in the past with crowd-sourced data using questionnaires.
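
A rough sketch of the K-Nearest Neighbours idea, with assumptions flagged: the player categories ("shooter", "passer"), the feature columns, and all numbers are invented for illustration. KNN does not produce weights by itself, so this sketch compares two candidate weight sets by leave-one-out accuracy on the labeled subset, which is just one way weights might be chosen. Features are left unscaled to keep it short; in practice they would likely be standardized first.

```python
import math
from collections import Counter

# Hypothetical labeled player-seasons:
# (shot attempts per 60, primary assists per 60, uninformative noise column), label
LABELED = [
    ((16.0, 0.4, 1.0), "shooter"),
    ((15.0, 0.5, 3.0), "shooter"),
    ((17.0, 0.3, 5.0), "shooter"),
    ((8.0, 1.2, 2.0), "passer"),
    ((9.0, 1.1, 4.0), "passer"),
    ((7.0, 1.3, 6.0), "passer"),
]

def distance(a, b, weights):
    """Weighted Euclidean distance between two feature tuples."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def knn_predict(query, labeled, weights, k=3):
    """Majority label among the k labeled players closest to the query."""
    nearest = sorted(labeled, key=lambda item: distance(query, item[0], weights))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def loo_accuracy(labeled, weights, k=3):
    """Leave-one-out accuracy: classify each player using all the others."""
    hits = 0
    for i, (features, label) in enumerate(labeled):
        rest = labeled[:i] + labeled[i + 1:]
        hits += knn_predict(features, rest, weights, k) == label
    return hits / len(labeled)

# Compare a weight set that uses the two real stats against one that
# relies only on the noise column.
print(loo_accuracy(LABELED, (1.0, 1.0, 0.0)))
print(loo_accuracy(LABELED, (0.0, 0.0, 1.0)))
print(knn_predict((14.0, 0.6, 2.0), LABELED, (1.0, 1.0, 0.0)))
```

A weight set that scores better on held-out labels is a more defensible choice than one picked by feel, provided the labeled subset is reasonably reliable.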

07-15-2016, 09:37 AM
  #14
eyjee
the basement
Join Date: Jun 2008
Location: Alphabet Soup
Country: United States
Posts: 27,231
Quote:
Originally Posted by eperry View Post
Given a reasonably reliable categorized subset of players, you could build a simple classification model (K-Nearest Neighbours, multinomial logistic) to define weights. I've tried this in the past with crowd-sourced data using questionnaires.
Think I need to sit down and teach myself https://en.wikipedia.org/wiki/K-near...bors_algorithm as it does seem to be what I'm looking for.

Thanks!
