HFBoards

By The Numbers Hockey Analytics... the Final Frontier. Explore strange new worlds, to seek out new algorithms, to boldly go where no one has gone before.

Adjusted Even-Strength Plus-minus 1968-2008

08-21-2011, 09:34 AM
  #176
overpass
Registered User
 
Join Date: Jun 2007
Posts: 3,618
vCash: 500
I've edited the OP to add the most recent adjusted plus-minus numbers I have, using my current method of calculating, for anyone who is interested.

Generally, the updated numbers are slightly more positive towards players on good teams, since the team adjustment is a little weaker.

I've also changed the SHGF estimator to include actual SH points, so players who scored a lot of SH points have had those moved out of the ESGF column and into the SHGF column. This change is small to non-existent for most forwards, and probably only affects Paul Coffey and Mark Howe among defencemen. It has a major effect on Wayne Gretzky's numbers.

matnor, I see what you mean about the Langway numbers being off. I don't have the 2008 numbers anymore to replace them with, so I just deleted him from the first table and included him in the second table.

plusminus, I'm still planning to get to what you posted. Not ignoring you, just busy

08-21-2011, 07:38 PM
  #177
Czech Your Math
Registered User
 
Join Date: Jan 2006
Location: bohemia
Country: Czech Republic
Posts: 3,717
vCash: 500
Quote:
Originally Posted by plusandminus View Post
I've studied correlations between ES icetime and ESGF+ESGA and they don't necessarily correspond very well, perhaps especially regarding forwards. For example, during 2002-03, a guy like Kovalchuk (who had a reputation as an offensive-minded player) had about 1.5 times more ESGF+ESGA than the average player.
So here we have our first bias: ESGF=60 and ESGA=50 will give a higher "ice time" than ESGF=50 and ESGA=40, even if real ice time is the same?
I'm not sure ice time is so important. If a player is on ice for a much higher % of ES GF+GA than his % of ice time, that's okay. If he's able to perform at or above the GF/GA ratio of the team as a whole, then the more volume the better for the team. If he's at a worse GF/GA level, then the high volume will negatively affect the player portion of the metric even more.

Quote:
Originally Posted by plusandminus View Post
I'm a bit against dividing ESGF by ESGA. As I wrote in another reply some days ago, I think there are better ways to do it. To use the above example, I would say that 60-50 and 50-40 are equally good, despite the latter one getting a slightly higher ratio (if I understand you right).
I don't know why you are so against calculating GF/GA ratios. They shouldn't be used randomly, but in this case they are the primary basis of the pythagorean win% calculation, so of great importance.

Whether 60/50 or 50/40 is better may depend most on context (all other things being equal). On a bad team, the extra 10/10 might be helpful, while on a good team, it may be hurtful.

Quote:
Originally Posted by plusandminus View Post
I googled, and from Wikipedia I got the impression that 1.8 rather than 2.0 was the "right exponent", at least in baseball?

Wouldn't shootout goals be excluded from the stats?
I saw more than one study for hockey. If 1.8 or whatever number is deemed a solid number, I have no attachment to 2.0 as the exponent. It does vary by sport though, I think mainly due to differing scoring levels.
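
To get a feel for how much the exponent matters, here is a minimal sketch of the pythagorean win% with the exponent left as a parameter. The function name is my own and the two exponents are just the ones mentioned in this exchange, not values anyone here has settled on.

```python
def pythagorean_win_pct(gf: float, ga: float, exponent: float = 2.0) -> float:
    """Estimated win% from goals for and against (pythagorean expectation)."""
    if gf <= 0 and ga <= 0:
        raise ValueError("need at least one goal for or against")
    return gf ** exponent / (gf ** exponent + ga ** exponent)

# The same 60-50 goal record under the two exponents discussed above.
for exp in (2.0, 1.8):
    print(exp, round(pythagorean_win_pct(60, 50, exp), 3))
```

A lower exponent pulls every estimate toward .500, so it shrinks the gaps between players but should not reorder them.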

Quote:
Originally Posted by plusandminus View Post
That's 94-38 = +56 with Forsberg. And 84-87 = -3 without.
Not only did he have by far the league's best ES+/- and score the most ES points, he did it on a team that had a negative +/- without him (and the guys who were on the ice with him).

Adding ESGF+ESGA, we get 94+38=132 for him. 178+125=303 for the team.
That's an "ice time" of 43.56 % according to ESGF+ESGA.

I got 43.56 %, so at least one of us (perhaps I) may be wrong.

In reality, the correct answer seems to be 28.84 %. (if my data is correct)
So, the estimated percentage is about 1.5 times higher.
You are correct. I have realized yet another error in this hastily put together study. I had calculated player's % of team's GF and GA separately and then summed them, instead of summing them before calculating the player's % of team. I came up with this idea a couple days ago and used some existing player data for the basis of most of the calculations, but that doesn't excuse my sloppiness.

Nice job of checking my math!
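
For anyone who wants to redo the check, here is a small sketch of the "sum first, then divide" version using the Forsberg figures quoted above. The function and variable names are made up for illustration.

```python
def goal_share(player_gf, player_ga, team_gf, team_ga):
    """Player's share of team ES goal events, summing GF and GA before dividing."""
    return (player_gf + player_ga) / (team_gf + team_ga)

# Forsberg 2002-03 ES figures quoted above: 94-38 on ice, team 178-125.
estimated = goal_share(94, 38, 178, 125)
actual_toi_share = 0.2884  # real ES ice-time share reported in the quoted post

print(f"estimated share: {estimated:.2%}")
print(f"ratio to real ice time: {estimated / actual_toi_share:.2f}")
```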

Quote:
Originally Posted by plusandminus View Post
And I think three players with identical ES time, having ESGF-ESGA of 40-40, 30-30 and 20-20, contribute equally.


Quote:
Originally Posted by plusandminus View Post
It would be interesting to see this stat listed for all the players on a team.
(I can do it myself, but not right now.)
At the risk of being called an idiot, does the sum of all players equal 100?
I don't have calculations for an entire team. I think the totals would not exactly balance, but should not be way off either.

The total for all players on the team should be somewhere around:

5 * 82 * (team's ES pythagorean win%)

note: 5 is number of skaters per goal

For a team with equal ES GF and GA (win% of .500), that should be ~205 "ES wins" (5 * 82 * 0.5).
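
As a rough sketch of that team total: the exponent of 2.0 and the function names are assumptions on my part, and the 150-150 goal record is just a made-up team with equal GF and GA.

```python
def pythagorean_win_pct(gf, ga, exponent=2.0):
    """Estimated win% from goals for and against."""
    return gf ** exponent / (gf ** exponent + ga ** exponent)

def team_es_wins(team_gf, team_ga, games=82, skaters=5, exponent=2.0):
    """Approximate sum of "ES wins" over all skaters on a team."""
    return skaters * games * pythagorean_win_pct(team_gf, team_ga, exponent)

# Hypothetical team with equal ES goals for and against.
print(team_es_wins(150, 150))
```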

Quote:
Originally Posted by plusandminus View Post
Pittsburgh were a bit special that year, starting the season with some very good players, just to see them drop off one by one. So Mario's stats sank deeper and deeper during the season. (If I remember right.)
Yeah, Pittsburgh was real "special" for a few years there post-Jagr.

In '94 and '96, Lemieux's R-On was slightly less than R-Off, although a big reason for that is Jagr being such a large part of the R-Off. His R-On/R-Off is great in '97 (1.97) and very good in 2001 (1.39), but a lot of that was due to those being the 1.5 seasons he played with Jagr at even strength. After that, he was mostly weak.

Quote:
Originally Posted by plusandminus View Post
A thought I have is that one might want to separate forwards and defencemen, since they may not be easily comparable.

An additional way of improving (or not) the method could be to include ESpts in the calculations, to estimate how much different players contributed to their ESGF. Now I'm mainly thinking of doing it for forwards, to help separate the offensive contributions of linemates, although it might be useful to apply (perhaps in a different form) to defencemen as well. To do something similar for ESGA would of course be basically impossible (unless one applies an assumption like "defencemen being more responsible for ESGA, while forwards being more responsible for ESGF").
Yes, separating forwards and defensemen is one possibility. Would rather not do that... and what about players like Coffey that could almost be classified as either? I like using a player's ES points as a % of ESGF, but this requires even more data and doesn't address ESGA. I think the latter is probably the better way to go, or it may be better to just live with a "pure" but flawed metric.

Quote:
Originally Posted by plusandminus View Post
Finally, which you likely are aware of, this stat only tells us about players' contributions during ES. So the "rankings" here are ES only.
Creating similar stats for PP and SH would rank players differently.
For example. While Forsberg had "much better" ES stats than Naslund in 2002-03, Naslund had better PP stats.
Of course this, like adjusted plus-minus, isn't an all-encompassing metric. It's meant to shed light on even strength value. Still, about 75% of goals occur at even strength and it's even strength play that leads to penalties, so ES play is crucial to overall value.

It's not meant to measure all aspects of the game.

08-22-2011, 10:36 AM
  #178
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
Quote:
I don't know why you are so against calculating GF/GA ratios. They shouldn't be used randomly, but in this case they are the primary basis of the pythagorean win% calculation, so of great importance.
I'll try to explain.

If I were asked to rank the following seasonal ES stats for players, without paying any attention at all to context, I would rank them as follows (with a tie for 2nd best):

GF-GA GD GF/GA GF+GA (GF/GA)*(GF+GA)
72-50 +22 1.440 112 161.28
60-40 +20 1.500 100 150
40-20 +20 2.000 60 120
45-30 +15 1.500 75 112.5
7- 4 + 3 1.750 11 19.25
3- 1 + 2 3.000 4 12
GD=GF-GA (goal difference). GS=GF+GA (goal sum).

Comments:
1. The guy with a GF/GA of 3.000 looks far too good compared to the others.
2. The lower numbers, the more extreme GF/GA. (It's a bit like pts per game. The fewer games played, the more extreme points per game.)

That's why I generally think one should be careful with using GF/GA.

Also:
3. Players' share of ES ice time varies a lot. So does the amount of time a player was not on the ice. No matter whether one uses real ice times or GF+GA, the differences are big.
4. Thus, when comparing "with" and "without", we would be comparing, for example, a GF/GA based on very low numbers with a GF/GA based on very high numbers.

I'm not convinced yet regarding how good the win formula, and other formulas are at handling the things I mentioned above. Maybe they are great.

Quote:
Originally Posted by Czech Your Math View Post
I'm not sure ice time is so important. If a player is on ice for a much higher % of ES GF+GA than his % of ice time, that's okay. If he's able to perform at or above the GF/GA ratio of the team as a whole, then the more volume the better for the team. If he's at a worse GF/GA level, then the high volume will negatively affect the player portion of the metric even more.
The above seems based a lot on GF/GA, and I think GF/GA can "lie".

Let's say we have a 2-3 result without the player on ice (GF/GA=0.667).
A player on ice going 5-3 will make his team win 7-6, even though his GF/GA is only 1.667.
A player on ice going 2-1 will only make his team draw 4-4, despite a GF/GA of 2.00.

As I said, maybe the win formula and other formulas have methods to guard for such contradictions.
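
The single-game example above can be written out as a quick sketch. The helper names are invented; the scores are the ones from the example.

```python
def combine(on_ice, off_ice):
    """Add a player's on-ice (GF, GA) to the team's off-ice (GF, GA)."""
    return on_ice[0] + off_ice[0], on_ice[1] + off_ice[1]

def outcome(gf, ga):
    """Classify a final score as a win, draw or loss."""
    return "win" if gf > ga else "draw" if gf == ga else "loss"

off_ice = (2, 3)  # team result without the player on the ice

for on_ice in ((5, 3), (2, 1)):
    total = combine(on_ice, off_ice)
    ratio = on_ice[0] / on_ice[1]
    print(on_ice, f"GF/GA={ratio:.3f}", total, outcome(*total))
```

The better ratio (2-1) ends in a draw while the worse ratio (5-3) ends in a win, because the goal difference, not the ratio, decides a single game.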

Quote:
Whether 60/50 or 50/40 is better may depend most on context (all other things being equal). On a bad team, the extra 10/10 might be helpful, while on a good team, it may be hurtful.
By themselves, I think both are equal. Context may make one look better, but I'm not sure GF/GA is the best way to determine that.
I may be wrong.

08-22-2011, 02:05 PM
  #179
Czech Your Math
Registered User
 
Join Date: Jan 2006
Location: bohemia
Country: Czech Republic
Posts: 3,717
vCash: 500
Quote:
Originally Posted by plusandminus View Post
If I were asked to rank the following seasonal ES stats for players, without paying any attention at all to context, I would rank them as follows (with a tie for 2nd best):

GF-GA GD GF/GA GF+GA (GF/GA)*(GF+GA)
72-50 +22 1.440 112 161.28
60-40 +20 1.500 100 150
40-20 +20 2.000 60 120
45-30 +15 1.500 75 112.5
7- 4 + 3 1.750 11 19.25
3- 1 + 2 3.000 4 12
GD=GF-GA (goal difference). GS=GF+GA (goal sum).

Comments:
1. The guy with a GF/GA of 3.000 looks far too good compared to the others.
2. The lower numbers, the more extreme GF/GA. (It's a bit like pts per game. The fewer games played, the more extreme points per game.)

That's why I generally think one should be careful with using GF/GA.
The metric I've been tinkering with does not use any formula akin to (GF/GA)*(GF+GA). Neither does Overpass' adjusted plus-minus. Also, I agree that small sample sizes tend to lead to skewed results. That is why taking the best X seasons or career numbers is going to be more reliable for almost any metric.

In your 60/40 vs. 40/20 example, context is very important. First, it tells you in what environment the data was created. Second, it tells you what impact the player's performance is going to have. Since one player has 20/20 more than the other, if the R-Off of his team was > 1.0, then his performance did not help his team, while if it was < 1.0 it did help his team.
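
A quick sketch of that context effect, using made-up off-ice records for a weak and a strong team. Only the 60-40 and 40-20 player records come from the thread, and where exactly the crossover falls depends on the totals you plug in.

```python
def team_ratio(player, off_ice):
    """Team ES GF/GA once a player's on-ice record is added to the off-ice record."""
    gf = player[0] + off_ice[0]
    ga = player[1] + off_ice[1]
    return gf / ga

player_a = (60, 40)  # +20 on more goal events
player_b = (40, 20)  # +20 on fewer goal events

# Hypothetical off-ice records for a weak and a strong team.
weak_off_ice = (80, 120)
strong_off_ice = (120, 80)

for label, off_ice in (("weak team", weak_off_ice), ("strong team", strong_off_ice)):
    print(label, round(team_ratio(player_a, off_ice), 3), round(team_ratio(player_b, off_ice), 3))
```

With these numbers the extra 20/20 lifts the weak team's overall ratio and drags down the strong team's.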

Quote:
Originally Posted by plusandminus View Post
Also:
3. Players' share of ES ice time varies a lot. So does the amount of time a player was not on the ice. No matter whether one uses real ice times or GF+GA, the differences are big.
4. Thus, when comparing "with" and "without", we would be comparing, for example, a GF/GA based on very low numbers with a GF/GA based on very high numbers.

I'm not convinced yet regarding how good the win formula, and other formulas are at handling the things I mentioned above. Maybe they are great.
Again, that's why we need multiple seasons to have any really solid data.

Maybe I should focus more, or solely, on the player's portion of the formula. I came up with the distribution of team's ES wins while thinking of some way to address Overpass' concern that players on great teams are hampered by the team's strong R-OFF. Honestly, getting credit for "just showing up" is not that great, although it's actually "playing a lot for great teams", and usually it's very good players who get lots of ice time over many years on great teams.

The player's portion is calculated by deducting his ES goals for/against from the team totals. The better the ratio and the more goals he was on ice for, the more impact it will have on the estimated win% differential, but it's more complex than multiplying the GF/GA ratio by the sum of ES GF + GA.
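
A minimal sketch of one way to read that description: take the team's estimated win% with the player's goals included, subtract the estimate with them deducted. The exponent of 2.0 and the function names are assumptions, not necessarily the exact calculation used; the Forsberg figures (94-38 on ice, team 178-125) are the ones quoted earlier in the thread.

```python
def pythagorean_win_pct(gf, ga, exponent=2.0):
    """Estimated win% from goals for and against."""
    return gf ** exponent / (gf ** exponent + ga ** exponent)

def player_win_pct_diff(team_gf, team_ga, player_gf, player_ga, exponent=2.0):
    """Estimated team win% with the player minus estimated win% without him."""
    off_gf = team_gf - player_gf  # team goals for with the player off the ice
    off_ga = team_ga - player_ga  # team goals against with the player off the ice
    with_player = pythagorean_win_pct(team_gf, team_ga, exponent)
    without_player = pythagorean_win_pct(off_gf, off_ga, exponent)
    return with_player - without_player

diff = player_win_pct_diff(178, 125, 94, 38)
print(f"estimated win% differential: {diff:+.3f}")
```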

Quote:
Originally Posted by plusandminus View Post
The above seems based a lot on GF/GA, and I think GF/GA can "lie".

Let's say we have a 2-3 result without the player on ice (GF/GA=0.667).
A player on ice going 5-3 will make his team win 7-6, even though his GF/GA is only 1.667.
A player on ice going 2-1 will only make his team draw 4-4, despite a GF/GA of 2.00.

As I said, maybe the win formula and other formulas have methods to guard for such contradictions.
All stats can "lie", 76% of statisticians can attest to that.

I'm not using GF/GA ratio as an absolute metric. In referring to Overpass' adjusted plus-minus, I do think R-ON/R-OFF is a valuable metric. It tells you in % terms how much more effective the team was with that player on the ice than without him on the ice, and that's a valuable piece of information.

Quote:
Originally Posted by plusandminus View Post
By themselves, I think both are equal. Context may make one look better, but I'm not sure GF/GA is the best way to determine that.
I may be wrong.
They are similar, but not equal in most cases. An extra 10 GF and 10 GA is outstanding on the '75 Capitals and rather weak on a dynasty team.

08-22-2011, 02:38 PM
  #180
Czech Your Math
Registered User
 
Join Date: Jan 2006
Location: bohemia
Country: Czech Republic
Posts: 3,717
vCash: 500
Quote:
Originally Posted by overpass View Post
Notice that the first list has 9 of the top 10 with an R-OFF below 1. On the second list, 9 of the top 10 have an R-OFF above 1.
Thanks for posting those, I like seeing the "pure" list.

On the second list, six players in the top 50 have R-OFF < 1:

Bourque, Jagr, M. Howe, Lindros, Thornton, Selanne

Rather underrated group for the most part.

While Dionne, Forsberg and Lindros all helped linemates make the list, it's especially impressive to see Jagr help two separate centers (Francis, Nylander) make the list.

Perhaps I should stay with a "pure" list and just use estimated win% (player portion of ES value)? I think that would look similar to your second (100% adjusted) list.


Last edited by Czech Your Math: 08-22-2011 at 02:52 PM.
08-22-2011, 04:46 PM
  #181
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
Quote:
Originally Posted by Czech Your Math View Post
...
It seems I got a bit misunderstood when I included a (GF/GA)*(GF+GA) column. That column was just there as an example of what results it would show. It was the other columns that were the important ones. Below I have deleted that column.

GF-GA GD GF/GA GF+GA
72-50 +22 1.440 112
60-40 +20 1.500 100
40-20 +20 2.000 60
45-30 +15 1.500 75
7- 4 + 3 1.750 11
3- 1 + 2 3.000 4
GD=GF-GA (goal difference). GS=GF+GA (goal sum).

Quote:
Originally Posted by Czech Your Math View Post
Also, I agree that small sample sizes tend to lead to skewed results. That is why taking the best X seasons or career numbers is going to be more reliable for almost any metric.
Yes. But has it been tested out how much more reliable?
I may try to examine that a bit more.

I also intend to look at game by game to see just what ES result the player had in the game ("with"), and what ES result the team had with him off ice ("without").
I know this can only be done for recent seasons, unless one wants to rely on estimated ES stats, but I still think it would be interesting to see what results it will produce. I'll get GP W D L GF-GA Pts for the players ("with") and for "without" them, and can then compare the two.

Quote:
In your 60/40 vs. 40/20 example, context is very important. First, it tells you in what environment the data was created. Second, it tells you what impact the player's performance is going to have. Since one player has 20/20 more than the other, if the R-Off of his team was > 1.0, then his performance did not help his team, while if it was < 1.0 it did help his team.
I understand your example here. But so far I don't like GF/GA being used, for the reasons I have tried to put forward.

Quote:
The player's portion is calculated by deducting his ES goals for/against from the team totals. The better the ratio and the more goals he was on ice for, the more impact it will have on the estimated win% differential
I understand that. That's what I call "with" and "without". And "with" may be unproportional compared to "without".

I will experiment a bit on my own to see how much it may affect the results.

Quote:
All stats can "lie", 76% of statisticians can attest to that.
That was a funny statement.

Quote:
They are similar, but not equal in most cases. An extra 10 GF and 10 GA is outstanding on the '75 Capitals and rather weak on a dynasty team.
Yes, I understand that.

08-22-2011, 07:59 PM
  #182
Czech Your Math
Registered User
 
Join Date: Jan 2006
Location: bohemia
Country: Czech Republic
Posts: 3,717
vCash: 500
Quote:
Originally Posted by plusandminus View Post
Yes. But has it been tested out how much more reliable?
I may try to examine that a bit more.
The "pure" part is the player portion. Taking the differential of the estimated pythagorean win% based on performance. What part of that do you disagree with? I understand the exponent is a bit difficult to pinpoint, but I don't think it makes much difference when comparing players, since all would be affected similarly.

The assignment of some portion of the team's success is somewhat arbitrary and perhaps not necessary at all.

Quote:
Originally Posted by plusandminus View Post
I also intend to look at game by game to see just what ES result the player had in the game ("with"), and what ES result the team had with him off ice ("without").
I know this can only be done for recent seasons, unless one wants to rely on estimated ES stats, but I still think it would be interesting to see what results it will produce. I'll get GP W D L GF-GA Pts for the players ("with") and for "without" them, and can then compare the two.
Are you talking about ES data in games the player did not play? If you have that data, it would be interesting. However, if that's what you mean, why not just look at total results (record, GF, GA)? I've calculated the actual vs. expected win% for a few players, and could also calculate expected win% using pythagorean based on GF/GA.

If you're bringing ice time into the picture, I don't consider that very important in comparison.

Quote:
Originally Posted by plusandminus View Post
I understand that. That's what I call "with" and "without". And "with" may be unproportional compared to "without".

I will experiment a bit on my own to see how much it may affect the results.
With and without, yes it's a fairly simple concept.
I don't understand what you mean by unproportional.

08-22-2011, 08:31 PM
  #183
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
Below is an example to illustrate the principle of looking at every single game to determine ES contributions. Everything in the table is ES only. 2002-03 season.

x=without player. tot (or t)=with player+without player
W D L = won draw loss
Pts=Pts, win=2, draw=1, loss=0
I need to keep table names short. Ask if unclear.

I think only games where the player was on the ice for a goal (no matter what type of goal) are counted. The missing games are all 0-0 for the player.
(It would be better to include all the games the player participated in, but unfortunately that information seems to have been lost in a disk crash several years ago.)

Just an example.

Team Pos Name                Game  +/- x+/-  Pts xPts totPts Pts-x tot-x   W   D   L  xW  xD  xL  tW  tD  tL
COL  C   PETER FORSBERG        68   55    1   97   61     91    36    30  40  17  11  21  19  28  39  13  16
COL  R   MILAN HEJDUK          73   49   -1   98   72     94    26    22  37  24  12  27  18  28  41  12  20
LA   R   ZIGMUND PALFFY        71   24  -26   86   53     71    33    18  30  26  15  17  19  35  25  21  25
DAL  R   JERE LEHTINEN         64   33   21   89   72     86    17    14  34  21   9  27  18  19  34  18  12
DAL  D   DERIAN HATCHER        75   27   27   90   85     99     5    14  36  18  21  32  21  22  40  19  16
TB   R   MARTIN ST. LOUIS      70    7  -19   78   54     67    24    13  29  20  21  16  22  32  25  17  28
COL  L   ALEX TANGUAY          69   38   10   88   76     89    12    13  35  18  16  31  14  24  38  13  18
DET  D   NICKLAS LIDSTROM      80   36    2   95   84     97    11    13  37  21  22  31  22  27  40  17  23
NJ   R   JAMIE LANGENBRUNNER   59   16   14   66   62     75     4    13  25  16  18  22  18  19  32  11  16
How to read the table:
Forsberg was +55 at ES; without him on the ice, Colorado was +1 at ES. Forsberg's ES record was 40-11-17 (W-L-D), which would have been worth 97 pts in 68 games. It seems he (and his units) helped Colorado get 30 more "ES points" (91 instead of 61). Again, everything is ES only, and only games where the player was on the ice for a goal are counted.
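
Here is a sketch of how the per-game tallying could work, assuming each game is stored as the ES goals for/against with the player on the ice and with him off it. The `Game` class and the three sample games are hypothetical; only the 2/1/0 points rule and Forsberg's W-D-L records come from the post.

```python
from dataclasses import dataclass

@dataclass
class Game:
    on_gf: int   # ES goals for with the player on the ice
    on_ga: int   # ES goals against with the player on the ice
    off_gf: int  # ES goals for with the player off the ice
    off_ga: int  # ES goals against with the player off the ice

def points(gf, ga):
    """Win = 2, draw = 1, loss = 0."""
    return 2 if gf > ga else 1 if gf == ga else 0

def points_from_record(wins, draws, losses):
    """Points implied by a W-D-L record."""
    return 2 * wins + draws

def tally(games):
    """Sum "with" (Pts), "without" (xPts) and combined (totPts) points."""
    totals = {"Pts": 0, "xPts": 0, "totPts": 0}
    for g in games:
        totals["Pts"] += points(g.on_gf, g.on_ga)
        totals["xPts"] += points(g.off_gf, g.off_ga)
        totals["totPts"] += points(g.on_gf + g.off_gf, g.on_ga + g.off_ga)
    totals["Pts-x"] = totals["Pts"] - totals["xPts"]
    totals["tot-x"] = totals["totPts"] - totals["xPts"]
    return totals

# Three made-up games, just to show the mechanics.
sample = [Game(2, 0, 1, 2), Game(1, 1, 0, 1), Game(0, 1, 2, 0)]
print(tally(sample))

# Forsberg's records from the table: W-D-L of 40-17-11, 21-19-28 and 39-13-16.
print(points_from_record(40, 17, 11), points_from_record(21, 19, 28), points_from_record(39, 13, 16))
```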

Sorted another way, it would look like:

Team Pos Name                Game  +/- x+/-  Pts xPts totPts Pts-x tot-x   W   D   L  xW  xD  xL totW totD totL
CBJ  R   DAVID VYBORNY         59   15  -63   71   32     37    39     5  24  23  12   9  14  36   14    9   36
COL  C   PETER FORSBERG        68   55    1   97   61     91    36    30  40  17  11  21  19  28   39   13   16
LA   R   ZIGMUND PALFFY        71   24  -26   86   53     71    33    18  30  26  15  17  19  35   25   21   25
CAR  D   SEAN HILL             66    5  -44   69   42     52    27    10  22  25  19  11  20  35   20   12   34
COL  R   MILAN HEJDUK          73   49   -1   98   72     94    26    22  37  24  12  27  18  28   41   12   20
MTL  D   ANDREI MARKOV         71   18  -23   88   62     66    26     4  32  24  15  20  22  29   25   16   30
TB   R   MARTIN ST. LOUIS      70    7  -19   78   54     67    24    13  29  20  21  16  22  32   25   17   28
CBJ  L   GEOFF SANDERSON       71   -1  -56   70   47     42    23    -5  20  30  21  13  21  37   14   14   43
LA   L   ALEXANDER FROLOV      58   16  -16   69   48     59    21    11  28  13  17  19  10  29   22   15   21
TB   C   VACLAV PROSPAL        70    7  -21   81   60     67    21     7  27  27  16  16  28  26   25   17   28

The difference between Vyborny on the ice and Vyborny off the ice is huge: +15 with, -63 without. With him, 71 points in 59 games; without him, only 32 points in 59 games. His contributions, however, only helped Columbus get 5 more points than if he had been +/- 0 in every game. Same data and limitations as above.

Just an example. Just intended as something to add to the debate.
I know important stats are missing for older data. Again, just an example.

Pts, xPts and totPts can be used to tell us about pts per game:
Team Pos Name                Game ES+/- xES+/-   Pts  xPts totPts Pts-xPts totPts-xPts
ATL  C   KAMIL PIROS            2     4     -2 2.000 0.000  1.500    2.000       1.500
WAS  D   MICHAEL FARRELL        1     1     -1 2.000 0.000  1.000    2.000       1.000
NAS  D   TOMAS KLOUCEK          1     1     -1 2.000 0.000  1.000    2.000       1.000
VAN  R   PAT KAVANAGH           2     2     -2 2.000 0.000  1.000    2.000       1.000

I personally find totals more useful.

Results do seem more useful if, for example, only taking those with a minimum of 41 games:
Team Pos Name                Game ES+/- xES+/-   Pts  xPts totPts Pts-xPts totPts-xPts
COL  C   PETER FORSBERG        68    55      1 1.426 0.897  1.338    0.529       0.441
COL  R   MILAN HEJDUK          73    49     -1 1.342 0.986  1.288    0.356       0.301
LA   R   ZIGMUND PALFFY        71    24    -26 1.211 0.746  1.000    0.465       0.254
NJ   R   JAMIE LANGENBRUNNER   59    16     14 1.119 1.051  1.271    0.068       0.220
DAL  R   JERE LEHTINEN         64    33     21 1.391 1.125  1.344    0.266       0.219
PHO  L   LADISLAV NAGY         62    24    -10 1.210 0.952  1.145    0.258       0.194
LA   L   ALEXANDER FROLOV      58    16    -16 1.190 0.828  1.017    0.362       0.190
COL  L   ALEX TANGUAY          69    38     10 1.275 1.101  1.290    0.174       0.188
DAL  D   DERIAN HATCHER        75    27     27 1.200 1.133  1.320    0.067       0.187
STL  R   ERIC BOGUNIECKI       43    11    -10 1.186 0.721  0.907    0.465       0.186

I prefer the totals (first two tables in this post) more than the averages.

Perhaps, though, "Pts as a total" might be wisely combined with "xPts per game".

Dividing, or dividing with square root, would give:
Team Pos Name                Game ES+/- xES+/-  Pts  PtsG  xPts totPts   totPts2   totPts3
COL  C   PETER FORSBERG        68    55      1   97 1.426 0.897  1.338 108.13114 102.41445
LA   R   ZIGMUND PALFFY        71    24    -26   86 1.211 0.746  1.000 115.20754 99.538178
COL  R   MILAN HEJDUK          73    49     -1   98 1.342 0.986  1.288 99.361111 98.678208
CBJ  R   DAVID VYBORNY         59    15    -63   71 1.203 0.542  0.627 130.90625 96.407176
MTL  D   ANDREI MARKOV         71    18    -23   88 1.239 0.873  0.930  100.7741 94.170744
STL  D   AL MACINNIS           78    19     -3   92 1.179 0.962  1.038 95.680000 93.821959
DET  D   NICKLAS LIDSTROM      80    36      2   95 1.188 1.050  1.212 90.476190 92.710506
TB   C   BRAD RICHARDS         75     8    -17   82 1.093 0.840  0.973 97.619047 89.469334
STL  D   BARRET JACKMAN        74    17     -6   87 1.176 0.959  1.014 90.676056 88.819012
TB   R   MARTIN ST. LOUIS      70     7    -19   78 1.114 0.771  0.957 101.11111 88.806906
NYR  C   ERIC LINDROS          75    15     -3   87 1.160 0.960  1.053    90.625 88.794003
WAS  D   SERGEI GONCHAR        75    17     -5   86 1.147 0.960  1.040 89.583333 87.773382
TotPts2=Pts/(xPts/Games). TotPts3=Pts/sqrt(xPts/Games)
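
The two formulas can be checked with a few lines. The function names are invented; the inputs are Forsberg's Pts (97), xPts (61) and games (68) from the first table in this post.

```python
import math

def tot_pts2(pts, x_pts, games):
    """Pts divided by "without" points per game."""
    return pts / (x_pts / games)

def tot_pts3(pts, x_pts, games):
    """Pts divided by the square root of "without" points per game."""
    return pts / math.sqrt(x_pts / games)

# Forsberg 2002-03: Pts=97, xPts=61, 68 games.
print(round(tot_pts2(97, 61, 68), 5), round(tot_pts3(97, 61, 68), 5))
```

The square root softens the penalty or bonus from the "without" rate, which is why Vyborny drops from first on TotPts2 to fourth on TotPts3.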

I'm not saying this last table above gives the best results, just that it uses another way of combining "with" and "without".

Maybe the results of the "win formula" and/or overpass' method would be fairly similar (or not).

I think this is an interesting way to look at things. I know one needs some data that are not available for "old" seasons, but still. And the results here may be compared to the other methods in this thread, to see how they correspond to each other.


Edit:
The above should basically work for all eras, no matter how high or low scoring. The only thing to adjust for would be GP per season (for teams, i.e. 82 nowadays, 80 during Gretzky's prime).
It should also take care of injuries. If a player misses a game, he simply gets 0 pts for that game. It should be very easy to aggregate different seasons to get career totals (if it wasn't for necessary data missing).


Last edited by plusandminus: 08-23-2011 at 08:47 AM. Reason: spelling
08-23-2011, 12:40 PM
  #184
seventieslord
Moderator
 
Join Date: Mar 2006
Location: Regina, SK
Country: Canada
Posts: 24,955
vCash: 500
Quote:
OK.
I've studied correlations between ES icetime and ESGF+ESGA and they don't necessarily correspond very well, perhaps especially regarding forwards. For example, during 2002-03, a guy like Kovalchuk (who had a reputation as an offensive-minded player) had about 1.5 times more ESGF+ESGA than the average player.
So here we have our first bias: ESGF=60 and ESGA=50 will give a higher "ice time" than ESGF=50 and ESGA=40, even if real ice time is the same?
I'm surprised CYM didn't mention/ask this, but... did you mean he had about 1.5 times the "average" player? or the average of his own team? there are only so many minutes to go around, and I find it unlikely that he would be that far off of his own team's average. If you compare him to another team, sure, his icetime figures would look wonky. but that's not what a model that uses GF & GA to calculate icetime does.

Also, Kovalchuk is about the most extreme example of a "high risk, high reward" type player. Watching the guy play it is clear he is going to cause both more goals for and more goals against while on the ice. If there was ever a player who would be an outlier whose GF/GA figures might cause an estimation model to lie, it would be him. With that said, that's no reason to throw out such a model.

Quote:
If I were asked to rank the following seasonal ES stats for players, without paying any attention at all to context, I would rank them as follows (with a tie for 2nd best):

GF-GA GD GF/GA GF+GA (GF/GA)*(GF+GA)
72-50 +22 1.440 112 161.28
60-40 +20 1.500 100 150
40-20 +20 2.000 60 120
45-30 +15 1.500 75 112.5
7- 4 + 3 1.750 11 19.25
3- 1 + 2 3.000 4 12
GD=GF-GA (goal difference). GS=GF+GA (goal sum).

Comments:
1. The guy with a GF/GA of 3.000 looks far too good compared to the others.
2. The lower numbers, the more extreme GF/GA. (It's a bit like pts per game. The fewer games played, the more extreme points per game.)

That's why I generally think one should be careful with using GF/GA.
I have an even more compelling reason to throw out the bottom player - any calculation done on him would be based on an obscenely low number of game situations. It's a poor sample size and is practically meaningless. Of course, if it got further in the season and he was 40-13, then we'd have something to talk about, and whether he was outperforming the 72-50 player would be a worthy question to ask.

08-23-2011, 02:46 PM
  #185
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
Quote:
Originally Posted by seventieslord View Post
I'm surprised CYM didn't mention/ask this, but... did you mean he had about 1.5 times the "average" player? or the average of his own team? there are only so many minutes to go around, and I find it unlikely that he would be that far off of his own team's average. If you compare him to another team, sure, his icetime figures would look wonky. but that's not what a model that uses GF & GA to calculate icetime does.
Thanks for asking. I compare only with his team's totals.

Below is what it looks like. I have only included players with GF+GA >= 20. (Expect even larger differences for the other players.)
I think the raw data I assembled and corrected should be near 100 % correct.
The rightmost column shows GF+GA per 60 minutes. I have divided all those values by league average, to make it easier to compare.
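
As a sketch of how the columns in the tables that follow relate to each other, here are the two calculations for Kovalchuk's row (1153 ES minutes, 130 GF+GA, shares of 0.305 and 0.379). The function names are invented, and the league-average rate is a placeholder of my own since the post does not state it.

```python
def per60(goal_events, toi_minutes):
    """ES goal events (GF+GA) per 60 minutes of ES ice time."""
    return goal_events / toi_minutes * 60

def normalized_rate(goal_events, toi_minutes, league_avg_per60):
    """Per-60 rate divided by the league-average per-60 rate."""
    return per60(goal_events, toi_minutes) / league_avg_per60

def share_diff(toi_share, goal_share):
    """Share of team GF+GA minus share of team ES ice time."""
    return goal_share - toi_share

LEAGUE_AVG_PER60 = 4.48  # hypothetical placeholder, not a figure from the post

print(round(per60(130, 1153), 3))
print(round(normalized_rate(130, 1153, LEAGUE_AVG_PER60), 3))
print(round(share_diff(0.305, 0.379), 3))
```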

Highest:
Team Name                pos ESTOI EStoi% ESGFGA%   diff ESGFGA normESGFGAper60min
CAR  BRUNO ST. JACQUES   D     297  0.081   0.138  0.057     35 1.577
CHI  BURKE HENRY         D     250  0.065   0.094  0.029     29 1.552
ATL  ILYA KOVALCHUK      F    1153  0.305   0.379  0.074    130 1.509
NYR  REM MURRAY          F     398  0.105   0.155  0.050     44 1.479
EDM  MIKE COMRIE         F     910  0.241   0.328  0.086    100 1.470
CHI  ERIC DAZE           F     722  0.188   0.257  0.070     79 1.464
LA   ADAM DEADMARSH      F     265  0.071   0.106  0.035     29 1.464
COL  PETER FORSBERG      F    1091  0.288   0.419  0.131    119 1.459
BOS  JOE THORNTON        F    1285  0.340   0.433  0.093    140 1.458
ATL  KIRILL SAFRONOV     D     414  0.110   0.131  0.022     45 1.454
TOR  ALEXANDER MOGILNY   F     943  0.258   0.348  0.090    102 1.447
TOR  KAREL PILAR         D     245  0.067   0.089  0.022     26 1.420
BOS  GLEN MURRAY         F    1400  0.371   0.458  0.087    148 1.414
ATL  DANY HEATLEY        F    1164  0.308   0.359  0.051    123 1.414
NYI  ROMAN HAMRLIK       D    1281  0.355   0.460  0.106    134 1.400
PIT  MARIO LEMIEUX       F    1082  0.286   0.387  0.101    113 1.397
MTL  NIKLAS SUNDSTROM    F     384  0.098   0.129  0.031     40 1.394
PHO  LANDON WILSON       F     328  0.089   0.125  0.036     34 1.387
SJ   NICHOLAS DIMITRAKOS F     244  0.065   0.089  0.023     25 1.371
BOS  MIKE KNUBLE         F    1061  0.281   0.334  0.053    108 1.362

Lowest:
Team Name                pos ESTOI EStoi% ESGFGA%   diff ESGFGA normESGFGAper60min
WAS  BRIAN SUTHERBY      F     554  0.146   0.096  -0.05     27 0.652
PIT  IAN MORAN           F    1053  0.278   0.175  -0.10     51 0.648
CGY  STEVE BEGIN         F     436  0.118   0.078  -0.04     21 0.644
STL  TYSON NASH          F     532  0.142   0.087  -0.05     25 0.629
TOR  RICHARD JACKMAN     D     535  0.146   0.085  -0.06     25 0.625
FLA  IGOR ULANOV         D     823  0.215   0.146  -0.07     38 0.618
MIN  JASON MARSHALL      F     462  0.119   0.084  -0.04     21 0.608
TB   ALEXANDER SVITOV    F     511  0.132   0.083  -0.05     23 0.602
CAR  JAROSLAV SVOBODA    F     549  0.150   0.095  -0.05     24 0.585
TOR  PAUL HEALEY         F     458  0.125   0.068  -0.06     20 0.584
BUF  VACLAV VARADA       F     525  0.139   0.082  -0.06     22 0.561
CGY  STEVE MONTADOR      D     627  0.170   0.096  -0.07     26 0.555
STL  SHJON PODEIN        F     540  0.144   0.073  -0.07     21 0.520
COL  SERGE AUBIN         F     700  0.185   0.095  -0.09     27 0.516
PIT  IAN MORAN           D    1053  0.278   0.134  -0.14     39 0.496

Biggest differences between real ES%, and ES% estimated by GF+GA:
Team Name                pos ESTOI EStoi% ESGFGA%   diff ESGFGA normESGFGAper60min
PIT  IAN MORAN           D    1053  0.278   0.134  -0.14     39 0.496
COL  PETER FORSBERG      F    1091  0.288   0.419  0.131    119 1.459
NYI  MATTIAS TIMANDER    D    1159  0.321   0.213  -0.11     62 0.716
NYI  ROMAN HAMRLIK       D    1281  0.355   0.460  0.106    134 1.400
PIT  MARIO LEMIEUX       F    1082  0.286   0.387  0.101    113 1.397
PHO  OSSI VAANANEN       D    1056  0.287   0.192  -0.10     52 0.659
PIT  IAN MORAN           F    1053  0.278   0.175  -0.10     51 0.648
STL  PAVOL DEMITRA       F    1151  0.308   0.406  0.098    116 1.348
TB   VINCENT LECAVALIER  F    1134  0.293   0.390  0.097    108 1.274
BOS  JOE THORNTON        F    1285  0.340   0.433  0.093    140 1.458
WAS  SERGEI GONCHAR      D    1584  0.417   0.509  0.092    143 1.208
TOR  ALEXANDER MOGILNY   F     943  0.258   0.348  0.090    102 1.447
CGY  STEPHANE YELLE      F    1054  0.285   0.200  -0.09     54 0.686
COL  SERGE AUBIN         F     700  0.185   0.095  -0.09     27 0.516
BOS  GLEN MURRAY         F    1400  0.371   0.458  0.087    148 1.414
EDM  MIKE COMRIE         F     910  0.241   0.328  0.086    100 1.470
COL  MILAN HEJDUK        F    1219  0.322   0.405  0.083    115 1.262
TB   MARTIN ST. LOUIS    F    1123  0.290   0.372  0.082    103 1.227
FLA  OLLI JOKINEN        F    1168  0.305   0.385  0.080    100 1.146
BOS  SEAN O'DONNELL      D    1150  0.305   0.229  -0.08     74 0.861
VAN  TODD BERTUZZI       F    1223  0.334   0.413  0.079    117 1.280
DET  MATHIEU DANDENAULT  D    1203  0.314   0.392  0.078    122 1.357
PHO  PAUL MARA           D    1129  0.307   0.384  0.077    104 1.233
FLA  IVAN NOVOSELTSEV    F     960  0.250   0.327  0.077     85 1.185
MIN  MARIAN GABORIK      F    1078  0.278   0.355  0.077     89 1.105
ATL  ILYA KOVALCHUK      F    1153  0.305   0.379  0.074    130 1.509


This makes, in my opinion, GF+GA a bit unreliable when using it to determine ice time. Maybe there are steps involved in the calculations that take care of that, so that it really doesn't matter that much. But I felt a need to point it out.
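
For anyone who wants to reproduce the diff column, here is a minimal Python sketch. It assumes EStoi% is the player's share of his team's ES ice time, ESGFGA% is his share of the team's ES GF+GA, and diff is simply the second minus the first. The function name is made up for illustration; the inputs are the rounded shares copied from the tables above.

```python
# Hypothetical helper: the name and the chosen rows are illustrative only.
def share_gap(toi_share, goal_share):
    """Return ESGFGA% minus EStoi% (the 'diff' column)."""
    return goal_share - toi_share

# (EStoi%, ESGFGA%) copied from the tables above, already rounded to 3 decimals.
rows = {
    "ILYA KOVALCHUK": (0.305, 0.379),
    "PETER FORSBERG": (0.288, 0.419),
    "IAN MORAN (D)": (0.278, 0.134),
}

for name, (toi_share, goal_share) in rows.items():
    gap = share_gap(toi_share, goal_share)
    # A ratio above 1 means GF+GA would overestimate the player's ice time.
    ratio = goal_share / toi_share
    print(f"{name:<16} diff={gap:+.3f} ratio={ratio:.2f}")
```

A positive diff means an ice-time estimate based on GF+GA would come out too high for that player, and a negative diff means it would come out too low.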

Quote:
Also, Kovalchuk is about the most extreme example of a "high risk, high reward" type player. Watching the guy play it is clear he is going to cause both more goals for and more goals against while on the ice. If there was ever a player who would be an outlier whose GF/GA figures might cause an estimation model to lie, it would be him. With that said, that's no reason to throw out such a model.
Agreed about Kovalchuk. But there were lots of other players like him. And plenty of the opposite kind too, i.e. players with "low risk of getting scored on, but also low risk of their opponents getting scored on".


Quote:
I have an even more compelling reason to throw out the bottom player - any calculation done on him would be based on an obscenely low number of game situations. It's a poor sample size and is practically meaningless. Of course, if it got further in the season and he was 40-13, then we'd have something to talk about, and whether he was outperforming the 72-50 player would be a worthy question to ask.
I agree. Unfortunately, the lower the numbers, the more "extreme" the results, and the higher the numbers, the less "extreme" the results. But the problem is that this rule continues to hold even for high numbers. The 40-13 case will still be slightly more vulnerable than the 72-50 case (I think, though in that case it might be negligible).

Anyway, I don't like dividing GF by GA, because to me it only makes sense when GF+GA is equally big for every player.
(But maybe there is something built into overpass' method that takes care of that.)

I look at it like this:
A: Player being 40-20 actually helped his team improve by +20.
B: Player being 30-15 actually helped his team improve by +15.
Both have the same GF/GA (2.0). But I think what matters is GF-GA.
Let's say B was 30-12 instead, raising his GF/GA to 2.5. That may make him look like a better player when looking at GF/GA. But he would have helped his team improve by +18.
During the whole season, I think a player's goal is to help their team improve by as many goals as possible, by getting as good GF-GA as possible.
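
To make the ratio-versus-difference point concrete, here is a small Python sketch using only the made-up lines from this post (40-20, 30-15 and 30-12). The function name is hypothetical.

```python
def ratio_and_diff(gf, ga):
    """Return (GF/GA, GF-GA) for one player's on-ice goals."""
    return gf / ga, gf - ga

examples = {"A": (40, 20), "B": (30, 15), "B with 30-12": (30, 12)}

for label, (gf, ga) in examples.items():
    ratio, diff = ratio_and_diff(gf, ga)
    print(f"{label:<13} GF/GA={ratio:.2f} GF-GA={diff:+d}")
```

Player A and the original player B tie on ratio but not on difference, and the 30-12 version of B passes A on ratio while still trailing him on difference.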


Last edited by plusandminus: 08-23-2011 at 03:21 PM.
08-23-2011, 03:26 PM
  #186
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
I had to edit the part about (and with) the tables in my last post.
(Before posting, I had changed the tables to show whether a player played L, C or R, but that distorted the results, since some forwards played more than one position, so I had to label all forwards F. If some guy played both D and F, which is rare, his stats may be unreliable.)


Last edited by plusandminus: 08-23-2011 at 03:34 PM.
08-23-2011, 04:01 PM
  #187
overpass
Registered User
 
Join Date: Jun 2007
Posts: 3,618
vCash: 500
plusminus, I'll try to address your concerns about the use of GF/GA ratios.

First, my adjusted plus-minus metric does not use the player's on-ice GF/GA ratio (R-ON) in the calculations. Only the difference is used, as in regular plus-minus.

The R-OFF ratio is used to calculate the team adjustment component, in conjunction with the sum of GA and GF. So the adjustment scales with the number of GF and GA, just as with regular plus-minus.

I really don't think your concerns about the use of ratios apply to this metric. The ratios are useful for a quick overview of the on/off ice relationship of the player and team, but are incomplete without a measure of quantity. Adjusted plus-minus includes a measure of quantity.

08-23-2011, 05:02 PM
  #188
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
Quote:
Originally Posted by overpass View Post
plusminus, I'll try to address your concerns about the use of GF/GA ratios.

First, my adjusted plus-minus metric does not use the player's on-ice GF/GA ratio (R-ON) in the calculations. Only the difference is used, as in regular plus-minus.
Thanks for clarifying.

Quote:
The R-OFF ratio is used to calculate the team adjustment component, in conjunction with the sum of GA and GF. So the adjustment scales with the number of GF and GA, just as with regular plus-minus.
I think I may understand. I think I perhaps need to try it out myself to be sure.

Quote:
I really don't think your concerns about the use of ratios apply to this metric. The ratios are useful for a quick overview of the on/off ice relationship of the player and team, but are incomplete without a measure of quantity. Adjusted plus-minus includes a measure of quantity.
OK.

Sorry if I have gotten things wrong. I think I need to reproduce what you are doing myself, to be sure I've understood things right.

08-23-2011, 05:08 PM
  #189
seventieslord
Moderator
 
seventieslord's Avatar
 
Join Date: Mar 2006
Location: Regina, SK
Country: Canada
Posts: 24,955
vCash: 500
Quote:
Originally Posted by plusandminus View Post
Thanks for asking. I compare only with his team's totals.

Below are what it looks like. I have only included players with GF+GA >= 20. (Expect even larger differences for the other players.)
I think the raw data I assembled and corrected should be near 100 % correct.
The rightmost column shows GF+GA per 60 minutes. I have divided all those values by league average, to make it easier to compare.

This makes, in my opinion, GF+GA a bit unreliable when using it to determine ice time. Maybe there are steps involved in the calculations that take care of that, so that it really doesn't matter that much. But I felt a need to point it out.
To be honest, I think this reveals how effective a model that uses GF/GA can be.

They actually do include an adjustment based on the line or pairing the player was on, because as you can see, goals are more frequent on a per-minute basis when the higher-minute players are out there. So this will partially mitigate the effect that you're seeing, an effect that is pretty small to begin with.

I mean, if you want to try to approximate icetime across the board from 1967-onwards, be my guest. But can you improve on what exists enough that it is worth the time it would take?

As for Kovalchuk, considering two other Thrashers are in your top-20, he doesn't appear to really be that much of an outlier on his team. He just barely makes the bottom list that you provided, and only because it is based on a "raw" difference and not a percentage difference (if it were based on that, he would not be anywhere near the top).

Quote:
I agree. Unfortunately, the lower the numbers, the more "extreme" the results, and the higher the numbers, the less "extreme" the results. But the problem is that this rule continues to hold even for high numbers. The 40-13 case will still be slightly more vulnerable than the 72-50 case (I think, though in that case it might be negligible).

Anyway, I don't like dividing GF by GA, because to me it only makes sense when GF+GA is equally big for every player.
(But maybe there is something built into overpass' method that takes care of that.)

I look at it like this:
A: Player being 40-20 actually helped his team improve by +20.
B: Player being 30-15 actually helped his team improve by +15.
Both have the same GF/GA (2.0). But I think what matters is GF-GA.
Let's say B was 30-12 instead, raising his GF/GA to 2.5. That may make him look like a better player when looking at GF/GA. But he would have helped his team improve by +18.
During the whole season, I think a player's goal is to help their team improve by as many goals as possible, by getting as good GF-GA as possible.
Keep in mind that generally when we talk about R-ON and R-OFF we are talking about very large sample sizes: seasons at the very least, and generally blocks of seasons, or careers. Like you, I don't see any value in the micro aspect, these 10-20-game segments and single-game scenarios you are providing.

overpass, as usual, did a much better job than I could have done, fuddling my way through descriptions of statistics in "normal people" words and terms.

08-23-2011, 07:02 PM
  #190
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
Quote:
Originally Posted by overpass View Post
Glossary of Terms:

SFrac: Season Fraction. 1.00 is a full season. I prefer it to games played because it gives a 48 game season, a 74 game season, an 80 game season or an 82 game season the same weight.
$ESGF: Even-strength goals for, normalized to a 200 ESG scoring environment and with estimated SH goals removed.
$ESGA: Even-strength goals against, normalized to a 200 ESG scoring environment and with estimated SH goals removed.
R-ON: Even strength GF/GA ratio when the player is on the ice.
R-OFF: Even-strength GF/GA ratio when the player is off the ice.
XEV+/-: Expected even-strength plus-minus, which is an estimate of the plus-minus that an average player would post with the same teammates. The calculation is described above.
EV+/-: Even-strength plus-minus, which is simply plus-minus with estimated shorthanded goals removed and normalized to a 200 ESG environment.
AdjEV+/-: Adjusted even-strength plus-minus, which is even-strength plus-minus minus expected even-strength plus-minus. This is the final number.
The following three stats evaluate special teams play and are not related to adjusted plus-minus. I’m including them in the table for a quick reference to the player’s contributions outside of even-strength play.
PP% : The % of the team’s power play goals for that the player was on the ice for.
SH%: The % of the team’s power play goals against that the player was on the ice for.
$PPP/G: Power play points per game, normalized to a 70 PPG environment and with pre-1988 PP assists estimated.

Results
Here are the top 60 in career adjusted even-strength plus-minus, as well as the players in the HOH Top 100 and several others who were strongly considered for voting.

Rk Player SFrac $ESGF/G $ESGA/G R-ON R-OFF XEV+/- EV+/- AdjEV+/- /Season PP% $PPP/G SH%
1 Ray Bourque 20.30 1.17 0.85 1.37 0.96 -62 524 586 29 88% 0.45 58%
Sorry, but I'm still a bit confused.
The text says $ESGF and $ESGA are adjusted to 200 ESG scoring environment.
The $ prefix makes me think "adjusted to 200 ESG scoring environment".
The text says EV+/- is also adjusted to 200 ESG scoring environment. ?

If it had been named "$EV+/-", I would have been pretty sure it's $ESGF-$ESGA, because $ tells me that it's "adjusted to 200 ESG scoring environment".
My question is... Is EV+/- = $ESGF-$ESGA ?

The text also says:
Quote:
To calculate the adjusted plus-minus, I take the player’s on-ice total goals for and against as given. I calculate an expected plus-minus for the player, based on his team’s off-ice performance.
"As given", does that mean they are not "adjusted to 200 ESG scoring environment"?


Example, using made up seasonal stats for one player and his team.

Everything is ES only.
"without" or "w" means when player was off the ice.
GD (goal difference) = GF-GA.

Lge aver ESGF per team teamGD teamGF teamGA playerGD playerGF playerGA withoutGD withoutGF withoutGA pGF/pGA wGF/wGA  
100 +20 60 40 +10 24 14 +10 36 26 1.714 1.385 

Lge aver ESGF per team $teamGD $teamGF $teamGA $playerGD $playerGF $playerGA $withoutGD $withoutGF $withoutGA pGF/pGA wGF/wGA  
200 +40 120 80 +20 48 28 +20 72 52 1.714 1.385 

Are the $ values above correct? Is that how the "adjusted to 200 ESG scoring environment" values are calculated?

If the above is correct, then how do we use the different variables to calculate the missing columns?
R-On? R-Off?


Can someone write down the formulas for calculating the values, using the variables of the tables above?

And can we write "rOn" and "rOff" instead of "R-On" and "R-Off", to make it easier to understand the formulas? ("-" can otherwise be interpreted as a minus sign. But we don't take R minus On, and R minus Off, right? If we do, I'm even more confused.)





Quote:
The expected plus-minus is calculated using the off-ice performance regressed partially to even, as a player should be expected to play somewhat better than a set of bad teammates or worse than a set of good teammates.
I understand the regression thing. But I'm confused about other things, including what exactly to regress (I guess it's "wGF/wGA"). Is rOff = "wGF/wGA" regressed to even?

Quote:
I then calculate an actual plus-minus, which differs from official NHL plus-minus in that it is normalized to a scoring environment of 200 even-strength goals per season and does not include shorthanded goals. I subtract the “expected plus-minus” from the “actual plus-minus” to generate an adjusted plus-minus number.
I think I need an example to understand.

08-23-2011, 07:17 PM
  #191
overpass
Registered User
 
Join Date: Jun 2007
Posts: 3,618
vCash: 500
Quote:
Originally Posted by plusandminus View Post
Sorry, but I'm still a bit confused.
The text says $ESGF and $ESGA are adjusted to 200 ESG scoring environment.
The $ prefix makes me think "adjusted to 200 ESG scoring environment".
The text says EV+/- is also adjusted to 200 ESG scoring environment. ?

If it had been named "$EV+/-", I would have been pretty sure it's $ESGF-$ESGA, because $ tells me that it's "adjusted to 200 ESG scoring environment".
My question is... Is EV+/- = $ESGF-$ESGA ?


The text also says:


"As given", does that mean they are not "adjusted to 200 ESG scoring environment"?
All numbers are adjusted to the 200 ESG per team-season scoring environment. Yes, EV+/- is simply $ESGF-$ESGA. "As given" still means scoring-adjusted numbers.

Quote:
Originally Posted by plusandminus View Post
Example, using made up seasonal stats for one player and his team.

Everything is ES only.
"without" or "w" means when player was off the ice.
GD (goal difference) = GF-GA.

Lge aver ESGF per team teamGD teamGF teamGA playerGD playerGF playerGA withoutGD withoutGF withoutGA pGF/pGA wGF/wGA  
100 +20 60 40 +10 24 14 +10 36 26 1.714 1.385 

Lge aver ESGF per team $teamGD $teamGF $teamGA $playerGD $playerGF $playerGA $withoutGD $withoutGF $withoutGA pGF/pGA wGF/wGA  
200 +40 120 80 +20 48 28 +20 72 52 1.714 1.385 

Are the $ values above correct? Is that how the "adjusted to 200 ESG scoring environment" values are calculated?
All correct. I don't scoring-level adjust the off-ice numbers, since I only use them to calculate a ratio, but if you did want to adjust for scoring level that would be correct.
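
The scaling confirmed here can be sketched in a few lines of Python. This is only an illustration using the made-up numbers from the quoted example (league average 100, player 24 GF and 14 GA); the function and variable names are hypothetical.

```python
def normalize(value, league_avg_esgf, target=200.0):
    """Scale a raw goal count to the target scoring environment."""
    return value * target / league_avg_esgf

league_avg = 100              # made-up league average ESGF per team
player_gf, player_ga = 24, 14  # made-up on-ice totals from the example

adj_gf = normalize(player_gf, league_avg)
adj_ga = normalize(player_ga, league_avg)

# The difference scales with the environment; the ratio does not change.
print(adj_gf, adj_ga, adj_gf - adj_ga)
print(round(player_gf / player_ga, 3), round(adj_gf / adj_ga, 3))
```

Because GF and GA are multiplied by the same factor, a GF/GA ratio comes out the same whether or not the inputs are scoring-adjusted, which fits with the off-ice numbers being left unadjusted when they are only used for a ratio.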

Quote:
Originally Posted by plusandminus View Post
If the above is correct, then how do we use the different variables to calculate the missing columns?
R-On? R-Off?

Can someone write down the formulas for calculating the values, using the variables of the tables above?

And can we write "rOn" and "rOff" instead of "R-On" and "R-Off", to make it easier to understand the formulas? ("-" can otherwise be interpreted as a minus sign. But we don't take R minus On, and R minus Off, right? If we do, I'm even more confused.)
R-ON = $ESGF/$ESGA. For single seasons, multiple seasons, or careers.

R-OFF = (TeamESGF-PlayerESGF)/(TeamESGA-PlayerESGA) for a single season. For multiple seasons, take the sums of XEV+/-, $ESGF, and $ESGA, and solve the XEV+/- formula for R-OFF.

XEV+/- = ($ESGF+$ESGA)/(1+R-OFF^0.65)*R-OFF^0.65 - ($ESGF+$ESGA)/(1+R-OFF^0.65)

EV+/- = $ESGF - $ESGA

AEV+/- = (EV+/-) - (XEV+/-)

Quote:
Originally Posted by plusandminus View Post
I understand the regression thing. But I'm confused about other things, including what exactly to regress (I guess it's "wGF/wGA"). Is rOff = "wGF/wGA" regressed to even?

I think I need an example to understand.
Adjusted plus-minus applies the regression to the ratio rOff in the XEV calculation. It's simply rOff^0.65, which regresses rOff toward 1. If I understand your terms correctly, wGF/wGA = rOff as I present it. I have never presented the regressed rOff value alone.

Full example:

Player X has 60 ESGF, 45 ESGA. His team has 140 ESGF, 155 ESGA. League scoring level is 150 ESGA/team.

$ESGF = 60*200/150 = 80
$ESGA = 45*200/150 = 60

rOn = 60/45 = 1.33
rOff = (140-60)/(155-45) = 80/110 = 0.73

XEV+/- = (80+60)*(0.73^0.65)/(1+0.73^0.65) - (80+60)/(1+0.73^0.65)
= 140*0.81/1.81 - 140/1.81
= 63 - 77
= -14

EV+/- = 80 - 60 = 20

AEV+/- = 20 - (-14) = 34
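
The worked example can also be checked with a short Python sketch. It follows the formulas written out in this post (scoring adjustment to 200, rOff from team minus player, exponent 0.65); the function names are made up for illustration, and the code carries full precision rather than the rounded intermediate steps shown above.

```python
def expected_ev(gf, ga, r_off, exponent=0.65):
    """XEV+/-: expected plus-minus given the off-ice ratio, regressed toward 1."""
    rr = r_off ** exponent
    total = gf + ga
    return total * rr / (1 + rr) - total / (1 + rr)

def adjusted_ev(player_gf, player_ga, team_gf, team_ga,
                league_avg, target=200.0, exponent=0.65):
    """Return (XEV+/-, EV+/-, AEV+/-) for one player-season."""
    scale = target / league_avg
    gf, ga = player_gf * scale, player_ga * scale       # $ESGF, $ESGA
    # Off-ice ratio; this would fail if the player was on for every team GA.
    r_off = (team_gf - player_gf) / (team_ga - player_ga)
    xev = expected_ev(gf, ga, r_off, exponent)
    ev = gf - ga
    return xev, ev, ev - xev

xev, ev, aev = adjusted_ev(60, 45, 140, 155, league_avg=150)
print(round(xev, 2), round(ev, 2), round(aev, 2))
```

At full precision the expected plus-minus for Player X is about -14.4 rather than exactly -14, so the adjusted figure lands a little above 34; the difference comes only from rounding in the hand calculation.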


Last edited by overpass: 08-24-2011 at 12:10 AM.
08-24-2011, 02:42 PM
  #192
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
Quote:
Originally Posted by overpass View Post
(Formulas.)
OK, I post my results for 2002-03 season using overpass' formulas. I have not adjusted to era, since I only look at one season. I used .65 as the "regress to even" number. Let me know if numbers are wrong. (Please don't overfocus on the number of decimals shown. I want them there for verifying purposes.)

Lowest expected +/-:
Team Pos Name               TOIshare +/-  +/- without player rOn    rOff   exp+/-    adj+/-
PIT  F   MARIO LEMIEUX      0.2856   -15  -49                0.7656 0.5702 -20.4062  5.4062
CBJ  D   LUKE RICHARDSON    0.4065   -26  -41                0.6709 0.6204 -20.3193  -5.6806
CBJ  D   JAROSLAV SPACEK    0.3544   -19  -48                0.7206 0.5966 -19.4554  0.4554
ATL  F   DANY HEATLEY       0.3079   -1   -50                0.9839 0.6296 -18.3552  17.3552
PIT  F   ALEXEI KOVALEV     0.2415   -4   -60                0.9149 0.5420 -17.6831  13.6831
PIT  D   DICK TARNSTROM     0.2631   -7   -57                0.8600 0.5547 -17.5984  10.5984
CBJ  F   DAVID VYBORNY      0.2859   15   -82                1.5172 0.4810 -17.0432  32.0432
CBJ  F   GEOFF SANDERSON    0.2760   -1   -66                0.9767 0.5417 -16.7163  15.7163
PIT  F   MARTIN STRAKA      0.2428   -6   -58                0.8723 0.5573 -16.5250  10.5250
CAR  D   SEAN HILL          0.3440   5    -54                1.1471 0.5385 -14.4917  19.4917
(Comment: Dominated by Pittsburgh and Columbus players.)

Highest expected +/-:
Team Pos Name               TOIshare +/-  +/- without player rOn    rOff   exp+/-    adj+/-
COL  F   JOE SAKIC          0.2269   9    45                 1.2500 1.5696 11.7839   -2.7839
PHI  D   ERIC WEINRICH      0.3580   15   30                 1.3750 1.4688 11.8073   3.1926
PHI  D   KIM JOHNSSON       0.3605   16   29                 1.3902 1.4603 11.9996   4.0004
DAL  D   RICHARD MATVICHUK  0.2767   -5   57                 0.8611 1.8028 12.6784   -17.6784
COL  D   ROB BLAKE          0.3652   19   35                 1.4524 1.4795 13.0408   5.9591
DAL  D   DERIAN HATCHER     0.4141   27   25                 1.5870 1.4098 13.2289   13.7710
VAN  D   BRENT SOPEL        0.3782   -10  35                 0.8246 1.4861 13.3167   -23.3168
COL  D   ADAM FOOTE         0.3842   27   27                 1.5510 1.4091 13.8747   13.1252
DAL  D   SERGEI ZUBOV       0.3819   17   35                 1.4048 1.5385 14.0487   2.9512
COL  F   STEVEN REINPRECHT  0.2496   -9   63                 0.8125 1.9403 18.4572   -27.4572
COL  D   GREG DE VRIES      0.4051   16   38                 1.2759 1.6667 21.7152   -5.7152
Comment: Some Colorado, some Philadelphia, some Dallas.


Best adjusted +/-:
Team Pos Name               TOIshare +/-  +/- without player rOn    rOff   exp+/-    adj+/-
COL  F   PETER FORSBERG     0.2885   55   -1                 2.7188 0.9880 -0.4688   55.4688
COL  F   MILAN HEJDUK       0.3223   49   5                  2.4848 1.0610 2.2119    46.7881
DET  D   NICKLAS LIDSTROM   0.3878   36   1                  1.7500 1.0112 0.4793    35.5207
LA   F   ZIGMUND PALFFY     0.3235   24   -28                1.6316 0.7228 -10.5125  34.5125
PHO  F   LADISLAV NAGY      0.3002   25   -26                1.8929 0.7593 -7.2309   32.2309
COL  F   ALEX TANGUAY       0.3104   38   16                 2.1875 1.1928 5.8372    32.1627
CBJ  F   DAVID VYBORNY      0.2859   15   -82                1.5172 0.4810 -17.0432  32.0432
DAL  F   JERE LEHTINEN      0.2803   33   19                 2.4348 1.2262 5.2278    27.7722
STL  F   ERIC BOGUNIECKI    0.2453   24   -10                1.8889 0.9083 -2.4386   26.4385
BOS  F   MIKE KNUBLE        0.2811   22   -11                1.5116 0.9027 -3.5934   25.5934
PHO  F   DAYMOND LANGKOW    0.3225   19   -20                1.5588 0.8039 -6.1608   25.1607
MTL  D   ANDREI MARKOV      0.3408   18   -26                1.4865 0.7869 -7.1518   25.1517

Comment: Similar to the method I used and posted yesterday (post #184). Forsberg atop here too. Hejduk higher, as is Tanguay, which I think is not so good. Lidstrom higher. Vyborny lower, I think he should be higher, as I think +15 with vs -82 "without" is a huge difference. Palffy, Lehtinen, Boguniecki on my list too.
Martin St. Louis, who was high on my list(s), is missing here. He's 25th here. Perhaps it has to do with how he distributed his ESGF and ESGA game by game.


Best adjusted (if not "regressing R-Off to even"):
Team Pos Name               TOIshare +/-  +/- without player rOn    rOff   exp+/-    adj+/-
COL  F   PETER FORSBERG     0.2885   55   -1                 2.7188 0.9880 -0.7212   55.7212
COL  F   MILAN HEJDUK       0.3223   49   5                  2.4848 1.0610 3.4023    45.5976
CBJ  F   DAVID VYBORNY      0.2859   15   -82                1.5172 0.4810 -25.58120 40.5812
LA   F   ZIGMUND PALFFY     0.3235   24   -28                1.6316 0.7228 -16.0919  40.0919
PHO  F   LADISLAV NAGY      0.3002   25   -26                1.8929 0.7593 -11.0842  36.0842
DET  D   NICKLAS LIDSTROM   0.3878   36   1                  1.7500 1.0112 0.7374    35.26257
COL  F   ALEX TANGUAY       0.3104   38   16                 2.1875 1.1928 8.9670    29.0330
MTL  D   ANDREI MARKOV      0.3408   18   -26                1.4865 0.7869 -10.9724  28.9724
NYI  D   ROMAN HAMRLIK      0.3548   16   -15                1.2712 0.8256 -12.8025  28.8025
PHO  F   DAYMOND LANGKOW    0.3225   19   -20                1.5588 0.8039 -9.4565   28.4565
STL  F   ERIC BOGUNIECKI    0.2453   24   -10                1.8889 0.9083 -3.7500   27.7500
BOS  F   MIKE KNUBLE        0.2811   22   -11                1.5116 0.9027 -5.5256   27.5255
ATL  F   DANY HEATLEY       0.3079   -1   -50                0.9839 0.6296 -27.9545  26.9545
CAR  D   SEAN HILL          0.3440   5    -54                1.1471 0.5385 -21.9000  26.9000
TB   D   DAN BOYLE          0.3600   13   -22                1.2889 0.7755 -13.0229  26.0229

Comment: At first glance it looks "better", in that linemates appear a bit more separated. But experience has shown me that regressing "when player off the ice" to even usually gives better results. (By the way, 6 Europeans atop.)
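
To see what the "regress to even" exponent does, here is a rough Python sketch comparing exponents for one player. It assumes the unregressed variant is the same expected +/- formula with the exponent set to 1 instead of 0.65, and it uses Vyborny's on-ice 44 GF / 29 GA and off-ice 76 GF / 158 GA (2002-03 figures that appear elsewhere in this thread). The function name is hypothetical.

```python
def expected_pm(gf, ga, r_off, exponent):
    """Expected +/- from the off-ice ratio raised to a regression exponent."""
    rr = r_off ** exponent
    return (gf + ga) * (rr - 1) / (rr + 1)

gf, ga = 44, 29        # Vyborny on the ice
r_off = 76 / 158       # his team without him

for exponent in (1.0, 0.65, 0.0):
    exp_pm = expected_pm(gf, ga, r_off, exponent)
    adj_pm = (gf - ga) - exp_pm
    # exponent 1.0 = no regression, 0.0 = ignore teammates entirely
    print(f"exponent={exponent:.2f} exp+/-={exp_pm:8.4f} adj+/-={adj_pm:8.4f}")
```

The smaller the exponent, the closer the expected +/- is pulled toward zero, so players on very weak or very strong teams get a smaller team adjustment.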



Although I like to compare the results to "my" mentioned method (see post #184), my method is not perfect. I need to find a way to include games where the player played but wasn't on the ice on any ES goals either way. Plus think more about it. Good things with it are that it doesn't need to pay attention to ice time, and doesn't have to adjust to "different GPG in different eras".

Now that I may know how overpass has done the calculations, and have been able to (hopefully) reproduce the results, my impression is that overpass' technique gives good results considering how relatively simple it is.
By simple, I mean that it only depends on ESGF and ESGA (and an adjustment for ESGF per season), and that basically the only "tricky" part was the formula to calculate the expected +/-. That formula seems to produce interestingly good results.
(I did however not understand how it was done until it was written down in detail.)

The things I "don't like" about it may be mostly present when looking at single seasons. When aggregating seasons, it might become alright (as players will play on other lines, on teams of different strength, etc). That is also what overpass has said.

Thanks overpass for explaining more in detail. It currently seems as if most of my "suspicions" regarding your method may have been a bit overstated.

Hopefully I will be able to reproduce czechyourmath's method too... eventually.


Last edited by plusandminus: 08-26-2011 at 07:55 PM.
08-24-2011, 05:28 PM
  #193
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
Quote:
Originally Posted by Czech Your Math View Post
I have an alternative that might be fairer to players on great teams without using somewhat arbitrary regressions to the mean. It's not exactly comparable to adjusted plus-minus, but it uses much of the same methodology. For lack of a better term, I might call it "even strength value". It has two primary components:

1. Player's share of team success at even strength
2. Player's marginal (additional) success at even strength

Once you calculate each component, simply add them together.

Player's share of team success at ES is calculated as:

82 * (Team Exp. ES Win %) * (Player's ESGF + ESGA) / (Team's ESGF + ESGA)

where Team Expected ES Win % = (ESGF)^N / (ESGF^N + ESGA^N)

this is the pythagorean win formula; N = 2 (or another number if supported by data)

Player's marginal contribution to team success is calculated as follows:

Subtract the player's ESGF and ESGA from the team's totals.
Recalculate the ES Win % from the new numbers (this is ES Win % without player).
Subtract Team Exp. ES Win % from ES Win % without player.
Multiply the difference in Win % by 82 to yield player's marginal contribution.

Then add player's share of team success ES and player's marginal contribution to team success at ES to get "ES Value" (whatever is proper term). The results are in the same ballpark as plus-minus.
Is the above still true? Or did you come up with changes to improve further?

Anyway, I tried to calculate according to how I interpreted the instructions, and below are my results for the 2002-03 season.
Columns starting with "x" are "without" the player, i.e. teamStat - playerStat. GS = goal sum (GF+GA).
I haven't multiplied anything by 82. Should be OK anyway, right?

You are welcome to check my math for errors and/or misunderstandings.
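
For comparison, here is one possible reading of the quoted instructions as a Python sketch. It is an interpretation, not a confirmed implementation: in particular, it assumes the marginal part is meant as team win % minus win % without the player (so that a helpful player gets a positive number), although the quoted wording could be read with the opposite sign. Function names are made up, and the Forsberg inputs (87 GF, 32 GA on the ice; team totals 169 GF, 115 GA) are derived from the on-ice and "without" figures in this post.

```python
def pythag_win(gf, ga, n=2.0):
    """Pythagorean expected win %; undefined when gf and ga are both 0."""
    return gf ** n / (gf ** n + ga ** n)

def es_value(p_gf, p_ga, t_gf, t_ga, games=82, n=2.0):
    """Return (share, marginal, total) under the assumptions stated above."""
    team_win = pythag_win(t_gf, t_ga, n)
    # Component 1: player's share of team success, by share of GF+GA.
    share = games * team_win * (p_gf + p_ga) / (t_gf + t_ga)
    # Component 2: change in expected win % when the player's goals are removed.
    without_win = pythag_win(t_gf - p_gf, t_ga - p_ga, n)
    marginal = games * (team_win - without_win)   # sign is an assumption
    return share, marginal, share + marginal

share, marginal, total = es_value(87, 32, 169, 115)
print(round(share, 2), round(marginal, 2), round(total, 2))
```

Dropping the games multiplier, as in the tables here, only rescales both components by the same constant and does not change the ordering of players.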

Team Pos Name               GF  GA  GS  xGS xGF xGA +/- x+/- playerWin xWin      teamWin   pmargCont Win
COL  F   PETER FORSBERG     87  32  119 109 82  83  55  -1   0.262331  0.286398  0.683506  0.419013  0.681344
COL  F   MILAN HEJDUK       82  33  115 103 87  82  49  5    0.247891  0.276771  0.683506  0.404928  0.652819
DAL  D   DERIAN HATCHER     73  46  119 79  86  61  27  25   0.204417  0.307920  0.688292  0.447368  0.651785
COL  D   ADAM FOOTE         76  49  125 81  93  66  27  27   0.194943  0.300838  0.683506  0.440139  0.635082
COL  D   GREG DE VRIES      74  58  132 70  95  57  16  38   0.168469  0.317685  0.683506  0.464787  0.633256
COL  F   ALEX TANGUAY       70  32  102 92  99  83  38  16   0.221417  0.245484  0.683506  0.359154  0.580571
WAS  D   SERGEI GONCHAR     80  63  143 32  68  70  17  -2   0.063001  0.281536  0.553229  0.508895  0.571896
DET  D   NICKLAS LIDSTROM   84  48  132 73  90  89  36  1    0.144899  0.262009  0.617310  0.424436  0.569335
DAL  D   PHILIPPE BOUCHER   60  38  98  74  99  69  22  30   0.191479  0.253581  0.688292  0.368420  0.559899
DAL  D   SERGEI ZUBOV       59  42  101 69  100 65  17  35   0.178541  0.261343  0.688292  0.379697  0.558238
PHI  D   KIM JOHNSSON       57  41  98  61  92  63  16  29   0.162122  0.260459  0.672411  0.387350  0.549472
COL  D   ROB BLAKE          61  42  103 73  108 73  19  35   0.175689  0.247891  0.683506  0.362675  0.538364
PHI  D   ERIC WEINRICH      55  40  95  60  94  64  15  30   0.159465  0.252486  0.672411  0.375493  0.534958
DAL  F   MIKE MODANO        55  30  85  77  104 77  25  27   0.199242  0.219942  0.688292  0.319547  0.518789
DAL  F   JERE LEHTINEN      56  23  79  85  103 84  33  19   0.219942  0.204417  0.688292  0.296991  0.516933
PHI  D   ERIC DESJARDINS    54  29  83  70  95  75  25  20   0.186042  0.220593  0.672411  0.328062  0.514104
OTT  D   WADE REDDEN        62  44  106 60  101 77  18  24   0.136208  0.240635  0.644722  0.373238  0.509446
BOS  F   GLEN MURRAY        83  65  148 29  84  91  18  -7   0.047945  0.244688  0.534016  0.458203  0.506148
DET  D   MATHIEU DANDENAULT 70  52  122 55  104 85  18  19   0.109170  0.242160  0.617310  0.392282  0.501452
COL  D   DEREK MORRIS       55  34  89  75  114 81  21  33   0.180503  0.214197  0.683506  0.313379  0.493882

Forsberg atop here too. But I think there is far too much Colorado dominance at the top. Basically, it seems to list the players with the highest ESGF+ESGA on their teams.
We see some familiar names from the other two methods, like Hejduk, Lehtinen, Tanguay, but also many new.

Have I missed something in my calculations?

While I think "my" method and overpass' method ended up with quite similar results, I think this method gives the most "different" results. That does not necessarily have to be bad, but looking at the table, it does not seem to care much about how well the player played. Guys like Foote and De Vries don't look special +/- wise compared with how Colorado did when they were off the ice.
The list is very much dominated by defencemen.
The only forwards on the list are: Forsberg-Hejduk-Tanguay, Modano-Lehtinen and G. Murray. Further down, the next forwards include Bertuzzi-Naslund-Morrison (with a few other forwards in between), all close to each other.

Shouldn't there be some consideration paid to GF-GA, or GF/(GF+GA), or even GF/GA?
Maybe I've missed something?

Edit: By the way, some guys ended up with slightly negative numbers. Is that OK?
Worst:
Team Pos Name               GF  GA  GS  xGS xGF xGA +/- x+/- playerWin xWin      teamWin   pmargCont Win
CBJ  F   KENT MCDONELL      0   1   1   -68 120 186 -1  -66  -0.064606 0.000950  0.291681  0.003256  -0.061350
CBJ  F   MATHIEU DARCHE     0   1   1   -68 120 186 -1  -66  -0.064606 0.000950  0.291681  0.003256  -0.061350


Edit:

I experimented a bit more.
teamWin = the win formula applied to the team's stats.
xWin = the win formula applied to the "without" stats instead of the team stats. ("Without" = team - player.)
Diff = teamWin - xWin, and Diff% = teamWin / xWin.
playerWin = the win formula applied to the player's own stats instead of the team's. It gives strange results for players with low numbers.
The results below look far "better" than the ones above.
One thing I suspect is still missing is something more to add to it. I think we now know how much "difference" the player made, but I think something more might need to be added. (Perhaps something to do with (playerGF+playerGA) / (teamGF+teamGA)?) I'm very tired now, but will probably continue tomorrow.
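
A minimal Python sketch of these with/without columns, assuming the win formula is the Pythagorean one with N = 2 and that Diff and Diff% are the difference and the ratio of teamWin and xWin. The function names are made up; the check uses Forsberg's row (87 GF, 32 GA on the ice; 82 GF, 83 GA without him).

```python
def win_pct(gf, ga, n=2):
    """Pythagorean win %; raises ZeroDivisionError if gf and ga are both 0."""
    return gf ** n / (gf ** n + ga ** n)

def with_without(p_gf, p_ga, x_gf, x_ga):
    """Compute the table columns from on-ice (p) and without-player (x) goals."""
    team_win = win_pct(p_gf + x_gf, p_ga + x_ga)
    x_win = win_pct(x_gf, x_ga)
    return {
        "teamWin": team_win,
        "xWin": x_win,
        "Diff": team_win - x_win,
        "Diff%": team_win / x_win,
        "playerWin": win_pct(p_gf, p_ga),
    }

forsberg = with_without(87, 32, 82, 83)
for column, value in forsberg.items():
    print(f"{column:<10}{value:.6f}")
```

The small-sample problem with playerWin is easy to see: a player on the ice for one goal for and none against gets `win_pct(1, 0) == 1.0`, the same as a perfect season.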

Team  Pos  Name               GF  GA   GS  xGS  xGF  xGA  +/-  x+/-   teamWin      xWin      Diff     Diff%  playerWin
COL   F    PETER FORSBERG     87  32  119  109   82   83   55    -1  0.683506  0.493939  0.189567  1.383786   0.880833
COL   F    MILAN HEJDUK       82  33  115  103   87   82   49     5  0.683506  0.529559  0.153947  1.290707   0.860616
LA    F    ZIGMUND PALFFY     62  38  100   20   73  101   24   -28  0.485404  0.343142  0.142262  1.414586   0.726928
PHO   F    LADISLAV NAGY      53  28   81   24   82  108   25   -26  0.496310  0.365673  0.130637  1.357250   0.781797
DET   D    NICKLAS LIDSTROM   84  48  132   73   90   89   36     1  0.617310  0.505586  0.111724  1.220979   0.753846
CBJ   F    DAVID VYBORNY      44  29   73  -52   76  158   15   -82  0.291681  0.187898  0.103783  1.552336   0.697155
PHO   F    DAYMOND LANGKOW    53  34   87   18   82  102   19   -20  0.496310  0.392573  0.103737  1.264248   0.708448
NAS   D    JASON YORK         49  35   84    4   72   96   14   -24  0.460379  0.360000  0.100379  1.278830   0.662162
NYI   D    ROMAN HAMRLIK      75  59  134   17   71   86   16   -15  0.503436  0.405322  0.098114  1.242064   0.617724
STL   F    ERIC BOGUNIECKI    51  27   78   38   99  109   24   -10  0.548834  0.452033  0.096801  1.214145   0.781081
COL   F    ALEX TANGUAY       70  32  102   92   99   83   38    16  0.683506  0.587237  0.096269  1.163935   0.827143
TB    D    DAN BOYLE          58  45  103    4   76   98   13   -22  0.467543  0.375552  0.091991  1.244948   0.624234
MTL   D    ANDREI MARKOV      55  37   92   10   96  122   18   -26  0.474210  0.382406  0.091804  1.240069   0.688438
MIN   F    PASCAL DUPUIS      46  30   76   17   80   95   16   -15  0.503984  0.414910  0.089074  1.214682   0.701591
CAR   D    SEAN HILL          39  34   73  -44   63  117    5   -54  0.313326  0.224770  0.088556  1.393984   0.568173
LA    F    ALEXANDER FROLOV   49  33   82   12   86  106   16   -20  0.485404  0.396951  0.088453  1.222831   0.687965
DAL   F    JERE LEHTINEN      56  23   79   85  103   84   33    19  0.688292  0.600566  0.087726  1.146072   0.855661
BOS   F    MIKE KNUBLE        65  43  108   33  102  113   22   -11  0.534016  0.448970  0.085046  1.189424   0.695587
TB    F    MARTIN ST. LOUIS   57  46  103    2   77   97   11   -20  0.467543  0.386556  0.080987  1.209509   0.605591
STL   D    AL MACINNIS        69  50  119   33   81   86   19    -5  0.548834  0.470086  0.078748  1.167518   0.655694

The results look much more similar to the other methods (those "by overpass" and "by me").

Dividing instead gives different results, see below. But those above are "better", right?

Team  Pos  Name               GF  GA   GS  xGS  xGF  xGA  +/-  x+/-   teamWin      xWin      Diff     Diff%  playerWin
CBJ   F    DAVID VYBORNY      44  29   73  -52   76  158   15   -82  0.291681  0.187898  0.103783  1.552336   0.697155
LA    F    ZIGMUND PALFFY     62  38  100   20   73  101   24   -28  0.485404  0.343142  0.142262  1.414586   0.726928
CAR   D    SEAN HILL          39  34   73  -44   63  117    5   -54  0.313326  0.224770  0.088556  1.393984   0.568173
COL   F    PETER FORSBERG     87  32  119  109   82   83   55    -1  0.683506  0.493939  0.189567  1.383786   0.880833
PHO   F    LADISLAV NAGY      53  28   81   24   82  108   25   -26  0.496310  0.365673  0.130637  1.357250   0.781797
COL   F    MILAN HEJDUK       82  33  115  103   87   82   49     5  0.683506  0.529559  0.153947  1.290707   0.860616
CBJ   F    GEOFF SANDERSON    42  43   85  -68   78  144   -1   -66  0.291681  0.226845  0.064836  1.285816   0.488236
PIT   F    ALEXEI KOVALEV     43  47   90  -68   71  131   -4   -60  0.290868  0.227051  0.063817  1.281069   0.455643
NAS   D    JASON YORK         49  35   84    4   72   96   14   -24  0.460379  0.360000  0.100379  1.278830   0.662162
FLA   D    ANDREAS LILJA      36  29   65  -31   75  120    7   -45  0.356902  0.280898  0.076004  1.270575   0.606457
PHO   F    DAYMOND LANGKOW    53  34   87   18   82  102   19   -20  0.496310  0.392573  0.103737  1.264248   0.708448
ATL   F    DANY HEATLEY       61  62  123  -52   85  135   -1   -50  0.354528  0.283889  0.070639  1.248826   0.491870
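
To see why subtracting and dividing rank players differently, here is a small Python sketch that sorts five of the rows above both ways. The teamWin and xWin values are copied from the tables in this post; the variable names are made up for illustration.

```python
# (name, teamWin, xWin) copied from the tables in this post
rows = [
    ("FORSBERG", 0.683506, 0.493939),
    ("HEJDUK", 0.683506, 0.529559),
    ("PALFFY", 0.485404, 0.343142),
    ("VYBORNY", 0.291681, 0.187898),
    ("HILL", 0.313326, 0.224770),
]

# Rank by teamWin - xWin (the Diff column), largest first
by_diff = sorted(rows, key=lambda r: r[1] - r[2], reverse=True)
# Rank by teamWin / xWin (the Diff% column), largest first
by_ratio = sorted(rows, key=lambda r: r[1] / r[2], reverse=True)

print([r[0] for r in by_diff])   # ['FORSBERG', 'HEJDUK', 'PALFFY', 'VYBORNY', 'HILL']
print([r[0] for r in by_ratio])  # ['VYBORNY', 'PALFFY', 'HILL', 'FORSBERG', 'HEJDUK']
```

The ratio divides by xWin, so the same absolute gain counts for more on a weak team: Vyborny's 0.104 on a base of 0.188 outranks Forsberg's 0.190 on a base of 0.494. That is probably why the divided list fills up with players from poor teams.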


Last edited by plusandminus: 08-24-2011 at 06:47 PM. Reason: adding more text
Old
08-24-2011, 06:50 PM
  #194
plusandminus
Registered User
 
Join Date: Mar 2011
Posts: 980
vCash: 500
I added text to the previous, perhaps confused, post. Not sure if it became less or more confused. Very tired now.

Old
08-21-2012, 02:07 AM
  #195
OrrNumber4
Registered User
 
Join Date: Jul 2002
Country: Switzerland
Posts: 7,465
vCash: 500
This is just a fantastic thread. Thought I would bump it up so others could see it. Should be pinned.

Old
10-01-2012, 10:07 AM
  #196
Sixbladeknife
Registered User
 
Join Date: Oct 2011
Country: Finland
Posts: 36
vCash: 500
This is amazing stuff, very insightful! Do you have / are you willing to share the year-to-year spreadsheets? A friend of mine and I are trying to rank the best players since 1990, and this information would be very useful.


Last edited by Sixbladeknife: 10-01-2012 at 10:37 AM. Reason: Typo
Old
10-11-2012, 08:16 PM
  #197
overpass
Registered User
 
Join Date: Jun 2007
Posts: 3,618
vCash: 500
Quote:
Originally Posted by Sixbladeknife View Post
This is amazing stuff, very insightful! Do you have / are you willing to share the year-to-year spreadsheets? A friend of mine and I are trying to rank the best players since 1990, and this information would be very useful.
Just posted it here.

http://hfboards.hockeysfuture.com/sh....php?t=1270041

I don't fully endorse the single-season plus-minus ratings as significant. There's still a lot of random variation at the season level. But I do find it useful for looking at groups of seasons like, say, Gretzky's Edmonton years compared to his LA years. Or for looking at peak seasons (over several years) for any player.
