By The Numbers
Hockey Analytics... the Final Frontier. Explore strange new worlds, to seek out new algorithms, to boldly go where no one has gone before.
I've edited the OP to add the most recent adjusted plus-minus numbers I have, using my current method of calculating, for anyone who is interested.
Generally, the updated numbers are slightly more positive towards players on good teams, since the team adjustment is a little weaker.
I've also changed the SHGF estimator to include actual SH points, so players who scored a lot of SH points have had those moved out of the ESGF column and into the SHGF column. This change is small to non-existent for most forwards, and probably only affects Paul Coffey and Mark Howe among defencemen. It has a major effect on Wayne Gretzky's numbers.
matnor, I see what you mean about the Langway numbers being off. I don't have the 2008 numbers anymore to replace them with, so I just deleted him from the first table and included him in the second table.
plusminus, I'm still planning to get to what you posted. Not ignoring you, just busy.
I've studied correlations between ES icetime and ESGF+ESGA and they don't necessarily correspond very well, perhaps especially regarding forwards. For example, during 2002-03, a guy like Kovalchuk (who had a reputation as an offensive-minded player) had about 1.5 times more ESGF+ESGA than the average player.
So here we have our first bias. ESGF=60 and ESGA=50 will give a higher "ice time" than ESGF=50 and ESGA=40, even if the real ice time is the same.
I'm not sure ice time is so important. If a player is on ice for a much higher % of ES GF+GA than his % of ice time, that's okay. If he's able to perform at or above the GF/GA ratio of the team as a whole, then the more volume the better for the team. If he's at a worse GF/GA level, then the high volume will negatively affect the player portion of the metric even more.
Quote:
Originally Posted by plusandminus
I'm a bit against dividing ESGF by ESGA. As I wrote in another reply some days ago, I think there are better ways to do it. To use the above example, I would say that 60-50 and 50-40 are equally good, despite the latter one getting a slightly higher ratio (if I understand you right).
I don't know why you are so against calculating GF/GA ratios. They shouldn't be used randomly, but in this case they are the primary basis of the pythagorean win% calculation, so of great importance.
Whether 60/50 or 50/40 is better may depend most on context (all other things being equal). On a bad team, the extra 10/10 might be helpful, while on a good team, it may be hurtful.
Quote:
Originally Posted by plusandminus
I googled, and according to wikipedia I got the impression that rather 1.8 was the "right exponent", at least in baseball?
Wouldn't shootout goals be excluded from the stats?
I saw more than one study for hockey. If 1.8 or whatever number is deemed a solid number, I have no attachment to 2.0 as the exponent. It does vary by sport though, I think mainly due to differing scoring levels.
Quote:
Originally Posted by plusandminus
That's 94-38 = +56 with Forsberg. And 84-87 = -3 without.
He not only had by far the league's best ES +/- and scored the most ES points, he did it on a team that, without him (and the guys who were on the ice with him), had a negative +/-.
Adding ESGF+ESGA, we get 94+38=132 for him. 178+125=303 for the team.
That's an "ice time" of 43.56 % according to ESGF+ESGA.
I got 43.56 %, so at least one of us (perhaps I) may be wrong.
In reality, the correct answer seems to be 28.84 %. (if my data is correct)
So, the estimated percentage is about 1.5 times higher.
You are correct. I have realized yet another error in this hastily put together study. I had calculated player's % of team's GF and GA separately and then summed them, instead of summing them before calculating the player's % of team. I came up with this idea a couple days ago and used some existing player data for the basis of most of the calculations, but that doesn't excuse my sloppiness.
Nice job of checking my math!
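The arithmetic in this exchange is easy to check. Here is a minimal Python sketch using the Forsberg figures quoted above (94-38 on ice, 178-125 for the team). The function names are made up for illustration, and the "separate shares" variant is only one reading of the mistake described above, not a reconstruction of the actual spreadsheet.

```python
def pooled_share(player_gf, player_ga, team_gf, team_ga):
    """Player's share of team ES goal events: sum GF+GA first, then divide."""
    return (player_gf + player_ga) / (team_gf + team_ga)


def separate_shares_summed(player_gf, player_ga, team_gf, team_ga):
    """The mistaken order: take the GF share and the GA share separately, then sum them."""
    return player_gf / team_gf + player_ga / team_ga


pooled = pooled_share(94, 38, 178, 125)
mistaken = separate_shares_summed(94, 38, 178, 125)

print(f"pooled share:     {pooled:.2%}")  # 43.56%
print(f"separate, summed: {mistaken:.2%}")
```

The pooled version reproduces the 43.56 % figure; the other ordering gives a much larger number, which is why the order of operations matters here.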
Quote:
Originally Posted by plusandminus
And I think three players with identical EStime, having ESGF-ESGA of 40-40 and 30-30 and 20-20, contribute equally much.
Quote:
Originally Posted by plusandminus
It would be interesting to see this stat listed for all the players on a team.
(I can do it myself, but not right now.)
With the risk of being called an idiot, does the sum of all players equal 100??
I don't have calculations for an entire team. I think the totals would not exactly balance, but should not be way off either.
The total for all players on the team should be somewhere around:
5 * 82 * (team's ES pythagorean win%)
note: 5 is number of skaters per goal
For a team with equal ES GF and GA, should be ~205 "ES wins"
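As a rough illustration of the estimate above, here is a short Python sketch. It assumes the standard Pythagorean form GF^x / (GF^x + GA^x) with the exponent of 2.0 mentioned earlier in the thread (1.8 or another value could be swapped in); the function names and the 150-150 team are hypothetical.

```python
def pythagorean_win_pct(gf, ga, exponent=2.0):
    """Estimated win% from goals for and against (Pythagorean expectation)."""
    return gf ** exponent / (gf ** exponent + ga ** exponent)


def expected_team_es_wins(gf, ga, games=82, skaters=5, exponent=2.0):
    """Approximate total of 'ES wins' summed over all skaters on a team."""
    return skaters * games * pythagorean_win_pct(gf, ga, exponent)


# A team with equal ES GF and GA sits at a 0.500 win%, so 5 * 82 * 0.5 = 205.
print(expected_team_es_wins(150, 150))  # 205.0
```

Any other GF/GA pair can be plugged in the same way; the 205 figure only holds for a team that breaks even at ES.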
Quote:
Originally Posted by plusandminus
Pittsburgh were a bit special that year, starting the season with some very good players, just to see them drop off one by one. So Mario's stats sank deeper and deeper during the season. (If I remember right.)
Yeah, Pittsburgh was real "special" for a few years there post-Jagr.
In '94 and '96, Lemieux's R-On was slightly less than R-Off, although a big reason for that is Jagr being such a large part of the R-Off. His R-On/R-Off is great in '97 (1.97) and very good in 2001 (1.39), but a lot of that was due to those being the 1.5 seasons he played with Jagr at even strength. After that, he was mostly weak.
Quote:
Originally Posted by plusandminus
A thought I have is that one might want to separate forwards and defencemen, since they may not be easily comparable.
An additional way of improving (or not) the method could be to include ESpts in the calculations, to estimate how much different players contributed to their ESGF. Now I'm mainly thinking of doing it for forwards, to help separate the offensive contributions of linemates, although it might be useful to apply (perhaps in a different form) to defencemen as well. To do something similar for ESGA would of course be basically impossible (unless one applies an assumption like "defencemen being more responsible for ESGA, while forwards being more responsible for ESGF").
Yes, separating forwards and defensemen is one possibility. Would rather not do that... and what about players like Coffey that could almost be classified as either? I like using a player's ES points as a % of ESGF, but this requires even more data and doesn't address ESGA. I think the latter is probably the better way to go, or it may be better to just live with a "pure" but flawed metric.
Quote:
Originally Posted by plusandminus
Finally, which you likely are aware of, this stat only tells us about players' contributions during ES. So the "rankings" here are ES only.
Creating similar stats for PP and SH would rank players differently.
For example. While Forsberg had "much better" ES stats than Naslund in 2002-03, Naslund had better PP stats.
Of course this, like adjusted plus-minus, isn't an all-encompassing metric. It's meant to shed light on even strength value. Still, about 75% of goals occur at even strength and it's even strength play that leads to penalties, so ES play is crucial to overall value.
It's not meant to measure all aspects of the game.
Quote:
Originally Posted by Czech Your Math
I don't know why you are so against calculating GF/GA ratios. They shouldn't be used randomly, but in this case they are the primary basis of the pythagorean win% calculation, so of great importance.
I'll try to explain.
If I was asked to rank the following seasonal ES stats for players, without paying any attention at all to context, I would rank them as follows (with a tie for 2nd best):
GF-GA   GD    GF/GA   GF+GA   (GF/GA)*(GF+GA)
72-50   +22   1.440     112   161.28
60-40   +20   1.500     100   150
40-20   +20   2.000      60   120
45-30   +15   1.500      75   112.5
 7- 4   + 3   1.750      11   19.25
 3- 1   + 2   3.000       4   12
GD=GF-GA (goal difference). GS=GF+GA (goal sum).
Comments:
1. The guy with a GF/GA of 3.000 looks far too good compared to the others.
2. The lower the numbers, the more extreme the GF/GA. (It's a bit like pts per game. The fewer games played, the more extreme the points per game.)
That's why I generally think one should be careful with using GF/GA.
Also:
3. Player ice time share during ES varies a lot between players. So does the amount of time the player was not on the ice. No matter if one uses real ice times or takes GF+GA, the differences are big.
4. Thus, when comparing "with" and "without", we would be comparing for example a GF/GA based on very low numbers, with a GF/GA based on very high numbers.
I'm not convinced yet regarding how good the win formula, and other formulas are at handling the things I mentioned above. Maybe they are great.
Quote:
Originally Posted by Czech Your Math
I'm not sure ice time is so important. If a player is on ice for a much higher % of ES GF+GA than his % of ice time, that's okay. If he's able to perform at or above the GF/GA ratio of the team as a whole, then the more volume the better for the team. If he's at a worse GF/GA level, then the high volume will negatively affect the player portion of the metric even more.
The above seems based a lot on GF/GA, and I think GF/GA can "lie".
Let's say we have a 2-3 result without player on ice (GF/GA=0.667).
Player on ice doing 5-3 will make his team win 7-6, despite GF/GA=1.667.
Player on ice doing 2-1 will only make his team draw 4-4, despite GF/GA of 2.00.
As I said, maybe the win formula and other formulas have methods to guard for such contradictions.
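The single-game example above can be spelled out as follows. This is only a sketch of the arithmetic in the post; the function names are hypothetical.

```python
def combined(without, with_player):
    """Add the 'without player' score to the 'with player' score."""
    return without[0] + with_player[0], without[1] + with_player[1]


def outcome(gf, ga):
    """Classify a final score as a win, draw or loss."""
    if gf > ga:
        return "win"
    return "draw" if gf == ga else "loss"


without = (2, 3)  # team result with the player off the ice
for on_ice in [(5, 3), (2, 1)]:
    gf, ga = combined(without, on_ice)
    ratio = on_ice[0] / on_ice[1]
    print(f"on-ice {on_ice[0]}-{on_ice[1]} (GF/GA={ratio:.3f}) -> team {gf}-{ga}: {outcome(gf, ga)}")
```

The on-ice line with the lower ratio (5-3) produces the win because its goal difference (+2) is larger than that of the 2-1 line (+1).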
Quote:
Whether 60/50 or 50/40 is better may depend most on context (all other things being equal). On a bad team, the extra 10/10 might be helpful, while on a good team, it may be hurtful.
By themselves, I think both are equal. Context may make one look better, but I'm not sure GF/GA is the best way to determine that.
I may be wrong.
Quote:
Originally Posted by plusandminus
If I was asked to rank the following seasonal ES stats for players, without paying any attention at all to context, I would rank them as follows (with a tie for 2nd best):
GF-GA   GD    GF/GA   GF+GA   (GF/GA)*(GF+GA)
72-50   +22   1.440     112   161.28
60-40   +20   1.500     100   150
40-20   +20   2.000      60   120
45-30   +15   1.500      75   112.5
 7- 4   + 3   1.750      11   19.25
 3- 1   + 2   3.000       4   12
GD=GF-GA (goal difference). GS=GF+GA (goal sum).
Comments:
1. The guy with a GF/GA of 3.000 looks far too good compared to the others.
2. The lower the numbers, the more extreme the GF/GA. (It's a bit like pts per game. The fewer games played, the more extreme the points per game.)
That's why I generally think one should be careful with using GF/GA.
The metric I've been tinkering with does not use any formula akin to (GF/GA)*(GF+GA). Neither does Overpass' adjusted plus-minus. Also, I agree that small sample sizes tend to lead to skewed results. That is why taking the best X seasons or career numbers is going to be more reliable for almost any metric.
In your 60/40 vs. 40/20 example, context is very important. First, it tells you in what environment the data was created. Second, it tells you what impact the player's performance is going to have. Since one player has 20/20 more than the other, if the R-Off of his team was > 1.0, then his performance did not help his team, while if it was < 1.0 it did help his team.
Quote:
Originally Posted by plusandminus
Also:
3. Player ice time share during ES varies a lot between players. So does the amount of time the player was not on the ice. No matter if one uses real ice times or takes GF+GA, the differences are big.
4. Thus, when comparing "with" and "without", we would be comparing for example a GF/GA based on very low numbers, with a GF/GA based on very high numbers.
I'm not convinced yet regarding how good the win formula, and other formulas are at handling the things I mentioned above. Maybe they are great.
Again, that's why we need multiple seasons to have any really solid data.
Maybe I should focus more or solely on the player's portion of the formula. I came up with the distribution of team's ES wins while thinking of some way to address Overpass' concern that players on great teams are hampered by the team's strong R-OFF. Honestly, getting credit for "just showing up" is not that great, although it's actually "playing a lot for great teams", and usually it's very good players who get lots of ice time over many years on great teams.
The player's portion is calculated by deducting his ES goals for/against from the team totals. The better the ratio and the more goals he was on ice for, the more impact it will have on the estimated win% differential, but it's more complex than multiplying the GF/GA ratio by the sum of ES GF + GA.
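One simple reading of that description, sketched in Python: estimate the team's ES win% with the player's goals included, estimate it again after deducting them, and take the difference. This is an assumption-laden illustration, not the actual calculation (which the post says is more complex). The exponent of 2.0 and the function names are assumptions; the inputs are the Forsberg figures quoted earlier in the thread (team 178-125, player 94-38).

```python
def pythagorean_win_pct(gf, ga, exponent=2.0):
    """Estimated win% from goals for and against (Pythagorean expectation)."""
    return gf ** exponent / (gf ** exponent + ga ** exponent)


def win_pct_differential(team_gf, team_ga, player_gf, player_ga, exponent=2.0):
    """Team ES win% with the player minus team ES win% with his goals deducted."""
    with_player = pythagorean_win_pct(team_gf, team_ga, exponent)
    without_player = pythagorean_win_pct(team_gf - player_gf, team_ga - player_ga, exponent)
    return with_player - without_player


diff = win_pct_differential(178, 125, 94, 38)
print(f"estimated ES win% differential: {diff:+.3f}")
```

Both the ratio and the volume matter here: a player with the same GF/GA but fewer goal events moves the team totals less, so the differential shrinks.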
Quote:
Originally Posted by plusandminus
The above seems based a lot on GF/GA, and I think GF/GA can "lie".
Let's say we have a 2-3 result without player on ice (GF/GA=0.667).
Player on ice doing 5-3 will make his team win 7-6, despite GF/GA=1.667.
Player on ice doing 2-1 will only make his team draw 4-4, despite GF/GA of 2.00.
As I said, maybe the win formula and other formulas have methods to guard for such contradictions.
All stats can "lie", 76% of statisticians can attest to that.
I'm not using GF/GA ratio as an absolute metric. In referring to Overpass' adjusted plus-minus, I do think R-ON/R-OFF is a valuable metric. It tells you in % terms how much more effective the team was with that player on the ice than without him on the ice, and that's a valuable piece of information.
Quote:
Originally Posted by plusandminus
By themselves, I think both are equal. Context may make one look better, but I'm not sure GF/GA is the best way to determine that.
I may be wrong.
They are similar, but not equal in most cases. An extra 10 GF and 10 GA is outstanding on the '75 Capitals and rather weak on a dynasty team.
Notice that the first list has 9 of the top 10 with an R-OFF below 1. On the second list, 9 of the top 10 have an R-OFF above 1.
Thanks for posting those, I like seeing the "pure" list.
On the second list, six players in the top 50 with R-OFF < 1:
Bourque, Jagr, M. Howe, Lindros, Thornton, Selanne
Rather underrated group for the most part.
While Dionne, Forsberg and Lindros all helped linemates make the list, it's especially impressive to see Jagr help two separate centers (Francis, Nylander) make the list.
Perhaps I should stay with a "pure" list and just use estimated win% (player portion of ES value)? I think that would look similar to your second (100% adjusted) list.
Last edited by Czech Your Math: 08-22-2011 at 01:52 PM.
It seems I got a bit misunderstood when I included a (GF/GA)*(GF+GA) column. That column was just there as an example of what results it would show. It was the other columns that were the important ones. Below I have deleted that column.
GF-GA   GD    GF/GA   GF+GA
72-50   +22   1.440     112
60-40   +20   1.500     100
40-20   +20   2.000      60
45-30   +15   1.500      75
 7- 4   + 3   1.750      11
 3- 1   + 2   3.000       4
GD=GF-GA (goal difference). GS=GF+GA (goal sum).
Quote:
Originally Posted by Czech Your Math
Also, I agree that small sample sizes tend to lead to skewed results. That is why taking the best X seasons or career numbers is going to be more reliable for almost any metric.
Yes. But has it been tested out how much more reliable?
I may try to examine that a bit more.
I also intend to look at game by game to see just what ES result the player had in the game ("with"), and what ES result the team had with him off ice ("without").
I know this can only be done for recent seasons, unless one wants to rely on estimated ES stats, but I still think it would be interesting to see what results it will produce. I'll get GP W D L GF-GA Pts for the players ("with") and for "without" them, and can then compare the two.
Quote:
In your 60/40 vs. 40/20 example, context is very important. First, it tells you in what environment the data was created. Second, it tells you what impact the player's performance is going to have. Since one player has 20/20 more than the other, if the R-Off of his team was > 1.0, then his performance did not help his team, while if it was < 1.0 it did help his team.
I understand your example here. But I so far don't like GF/GA to be used, for reasons I have tried to put forward.
Quote:
The player's portion is calculated by deducting his ES goals for/against from the team totals. The better the ratio and the more goals he was on ice for, the more impact it will have on the estimated win% differential
I understand that. That's what I call "with" and "without". And "with" may be unproportional compared to "without".
I will experiment a bit on my own to see how much it may affect the results.
Quote:
All stats can "lie", 76% of statisticians can attest to that.
That was a funny statement.
Quote:
They are similar, but not equal in most cases. An extra 10 GF and 10 GA is outstanding on the '75 Capitals and rather weak on a dynasty team.
Yes. But has it been tested out how much more reliable?
I may try to examine that a bit more.
The "pure" part is the player portion. Taking the differential of the estimated pythagorean win% based on performance. What part of that do you disagree with? I understand the exponent is a bit difficult to pinpoint, but I don't think it makes much difference when comparing players, since all would be affected similarly.
The assignment of some portion of the team's success is somewhat arbitrary and perhaps not necessary at all.
Quote:
Originally Posted by plusandminus
I also intend to look at game by game to see just what ES result the player had in the game ("with"), and what ES result the team had with him off ice ("without").
I know this can only be done for recent seasons, unless one wants to rely on estimated ES stats, but I still think it would be interesting to see what results it will produce. I'll get GP W D L GF-GA Pts for the players ("with") and for "without" them, and can then compare the two.
Are you talking about ES data in games the player did not play? If you have that data, it would be interesting. However, if that's what you mean, why not just look at total results (record, GF, GA)? I've calculated the actual vs. expected win% for a few players, and could also calculate expected win% using pythagorean based on GF/GA.
If you're bringing ice time into the picture, I don't consider that very important in comparison.
Quote:
Originally Posted by plusandminus
I understand that. That's what I call "with" and "without". And "with" may be unproportional compared to "without".
I will experiment a bit on my own to see how much it may affect the results.
With and without, yes it's a fairly simple concept.
I don't understand what you mean by unproportional.
Below is an example to illustrate the principle of looking at every single game to determine ES contributions. Everything in the table is ES only. 2002-03 season.
x=without player. tot (or t)=with player+without player
W D L = won draw loss
Pts=Pts, win=2, draw=1, loss=0
I need to keep table names short. Ask if unclear.
I think only games where the player was on the ice for a goal (no matter what type of goal) are counted. The missing games are all 0-0 for the player.
(It would be better to include all the games the player participated in, but unfortunately that information seems to have been lost in a disk crash several years ago.)
Just an example.
Team  Pos  Name                 Game  +/-  x+/-  Pts  xPts  totPts  Pts-x  tot-x   W   D   L  xW  xD  xL  tW  tD  tL
COL   C    PETER FORSBERG         68   55     1   97    61      91     36     30  40  17  11  21  19  28  39  13  16
COL   R    MILAN HEJDUK           73   49    -1   98    72      94     26     22  37  24  12  27  18  28  41  12  20
LA    R    ZIGMUND PALFFY         71   24   -26   86    53      71     33     18  30  26  15  17  19  35  25  21  25
DAL   R    JERE LEHTINEN          64   33    21   89    72      86     17     14  34  21   9  27  18  19  34  18  12
DAL   D    DERIAN HATCHER         75   27    27   90    85      99      5     14  36  18  21  32  21  22  40  19  16
TB    R    MARTIN ST. LOUIS       70    7   -19   78    54      67     24     13  29  20  21  16  22  32  25  17  28
COL   L    ALEX TANGUAY           69   38    10   88    76      89     12     13  35  18  16  31  14  24  38  13  18
DET   D    NICKLAS LIDSTROM       80   36     2   95    84      97     11     13  37  21  22  31  22  27  40  17  23
NJ    R    JAMIE LANGENBRUNNER    59   16    14   66    62      75      4     13  25  16  18  22  18  19  32  11  16
How to read the table:
Forsberg was +55 during ES; without him, Colorado was +1 during ES. Forsberg was 40-11-17 (W-L-D) during ES. That would have resulted in 97 pts in 68 games. It seems he (and his units) helped get Colorado 30 more "ES points" (91 instead of 61). Again, everything is ES only, and only games where the player was on the ice for a goal are counted.
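The point columns follow directly from the W-D-L records (win=2, draw=1, loss=0). Here is a small Python sketch using Forsberg's row from the table above; the function and variable names are made up for illustration.

```python
def es_points(wins, draws, losses=0):
    """Points from an ES-only record: win=2, draw=1, loss=0."""
    return 2 * wins + draws


# Forsberg, 2002-03, ES only: (W, D, L) with him, without him, and in total.
with_player = (40, 17, 11)
without_player = (21, 19, 28)
total = (39, 13, 16)

pts = es_points(*with_player)      # Pts
x_pts = es_points(*without_player) # xPts
tot_pts = es_points(*total)        # totPts

print(pts, x_pts, tot_pts)            # 97 61 91
print(pts - x_pts, tot_pts - x_pts)   # 36 30
```

The last line gives the Pts-x and tot-x columns; tot-x (30) is the "30 more ES points" mentioned above.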
Sorted another way, it would look like:
Team  Pos  Name                 Game  +/-  x+/-  Pts  xPts  totPts  Pts-x  tot-x   W   D   L  xW  xD  xL  totW  totD  totL
CBJ   R    DAVID VYBORNY          59   15   -63   71    32      37     39      5  24  23  12   9  14  36    14     9    36
COL   C    PETER FORSBERG         68   55     1   97    61      91     36     30  40  17  11  21  19  28    39    13    16
LA    R    ZIGMUND PALFFY         71   24   -26   86    53      71     33     18  30  26  15  17  19  35    25    21    25
CAR   D    SEAN HILL              66    5   -44   69    42      52     27     10  22  25  19  11  20  35    20    12    34
COL   R    MILAN HEJDUK           73   49    -1   98    72      94     26     22  37  24  12  27  18  28    41    12    20
MTL   D    ANDREI MARKOV          71   18   -23   88    62      66     26      4  32  24  15  20  22  29    25    16    30
TB    R    MARTIN ST. LOUIS       70    7   -19   78    54      67     24     13  29  20  21  16  22  32    25    17    28
CBJ   L    GEOFF SANDERSON        71   -1   -56   70    47      42     23     -5  20  30  21  13  21  37    14    14    43
LA    L    ALEXANDER FROLOV       58   16   -16   69    48      59     21     11  28  13  17  19  10  29    22    15    21
TB    C    VACLAV PROSPAL         70    7   -21   81    60      67     21      7  27  27  16  16  28  26    25    17    28
The difference between Vyborny on the ice and Vyborny off the ice is huge: +15 with, -63 without. With him, 71 points in 59 games; without him, only 32 points in 59 games. His contributions, however, only helped Columbus get 5 more points than if he had been +/- 0 in every game. Same data and limitations as above.
Just an example. Just intended as something to add to the debate.
I know important stats are missing for older data. Again, just an example.
Pts, xPts and totPts can be used to tell us about pts per game:
Team  Pos  Name                 Game  ES+/-  xES+/-    Pts   xPts  totPts  Pts-xPts  totPts-xPts
ATL   C    KAMIL PIROS             2      4      -2  2.000  0.000   1.500     2.000        1.500
WAS   D    MICHAEL FARRELL         1      1      -1  2.000  0.000   1.000     2.000        1.000
NAS   D    TOMAS KLOUCEK           1      1      -1  2.000  0.000   1.000     2.000        1.000
VAN   R    PAT KAVANAGH            2      2      -2  2.000  0.000   1.000     2.000        1.000
I personally find totals more useful.
Results do seem more useful if, for example, only taking those with a minimum of 41 games:
Team  Pos  Name                 Game  ES+/-  xES+/-    Pts   xPts  totPts  Pts-xPts  totPts-xPts
COL   C    PETER FORSBERG         68     55       1  1.426  0.897   1.338     0.529        0.441
COL   R    MILAN HEJDUK           73     49      -1  1.342  0.986   1.288     0.356        0.301
LA    R    ZIGMUND PALFFY         71     24     -26  1.211  0.746   1.000     0.465        0.254
NJ    R    JAMIE LANGENBRUNNER    59     16      14  1.119  1.051   1.271     0.068        0.220
DAL   R    JERE LEHTINEN          64     33      21  1.391  1.125   1.344     0.266        0.219
PHO   L    LADISLAV NAGY          62     24     -10  1.210  0.952   1.145     0.258        0.194
LA    L    ALEXANDER FROLOV       58     16     -16  1.190  0.828   1.017     0.362        0.190
COL   L    ALEX TANGUAY           69     38      10  1.275  1.101   1.290     0.174        0.188
DAL   D    DERIAN HATCHER         75     27      27  1.200  1.133   1.320     0.067        0.187
STL   R    ERIC BOGUNIECKI        43     11     -10  1.186  0.721   0.907     0.465        0.186
I prefer the totals (first two tables in this post) more than the averages.
Perhaps, though, "Pts as a total" might be wisely combined with "xPts per game".
Dividing, or dividing with square root, would give:
I'm not saying this last table above gives the best results, just that it uses another way of combining "with" and "without".
Maybe the results of the "win formula" and/or overpass' method would be fairly similar (or not).
I think this is an interesting way to look at things. I know one needs some data that are not available for "old" seasons, but still. And the results here may be compared to the other methods in this thread, to see how they correspond to each other.
Edit:
The above should basically work for all eras, no matter how high or low scoring. The only thing to adjust for would be GP per season (for teams, i.e. 82 nowadays, 80 during Gretzky's prime).
It should also take care of injuries. If a player misses a game, he simply gets 0 pts for that game. It should be very easy to aggregate different seasons to get career totals (if it wasn't for necessary data missing).
Last edited by plusandminus: 08-23-2011 at 07:47 AM.
Reason: spelling
OK.
Quote:
I've studied correlations between ES icetime and ESGF+ESGA and they don't necessarily correspond very well, perhaps especially regarding forwards. For example, during 2002-03, a guy like Kovalchuk (who had a reputation as an offensive-minded player) had about 1.5 times more ESGF+ESGA than the average player.
So here we have our first bias. ESGF=60 and ESGA=50 will give a higher "ice time" than ESGF=50 and ESGA=40, even if the real ice time is the same.
I'm surprised CYM didn't mention/ask this, but... did you mean he had about 1.5 times the "average" player, or the average of his own team? There are only so many minutes to go around, and I find it unlikely that he would be that far off of his own team's average. If you compare him to another team, sure, his icetime figures would look wonky. But that's not what a model that uses GF & GA to calculate icetime does.
Also, Kovalchuk is about the most extreme example of a "high risk, high reward" type player. Watching the guy play it is clear he is going to cause both more goals for and more goals against while on the ice. If there was ever a player who would be an outlier whose GF/GA figures might cause an estimation model to lie, it would be him. With that said, that's no reason to throw out such a model.
Quote:
If I was asked to rank the following seasonal ES stats for players, without paying any attention at all to context, I would rank them as follows (with a tie for 2nd best):
Comments:
1. The guy with a GF/GA of 3.000 looks far too good compared to the others.
2. The lower the numbers, the more extreme the GF/GA. (It's a bit like pts per game. The fewer games played, the more extreme the points per game.)
That's why I generally think one should be careful with using GF/GA.
I have an even more compelling reason to throw out the bottom player - any calculation done on him would be based on an obscenely low number of game situations. It's a poor sample size and is practically meaningless. Of course, if it got further in the season and he was 40-13, then we'd have something to talk about, and whether he was outperforming the 72-50 player would be a worthy question to ask.
Quote:
I'm surprised CYM didn't mention/ask this, but... did you mean he had about 1.5 times the "average" player, or the average of his own team? There are only so many minutes to go around, and I find it unlikely that he would be that far off of his own team's average. If you compare him to another team, sure, his icetime figures would look wonky. But that's not what a model that uses GF & GA to calculate icetime does.
Thanks for asking. I compare only with his team's totals.
Below is what it looks like. I have only included players with GF+GA >= 20. (Expect even larger differences for the other players.)
I think the raw data I assembled and corrected should be near 100 % correct.
The rightmost column shows GF+GA per 60 minutes. I have divided all those values by league average, to make it easier to compare.
Highest:
Team  Name                  pos  ESTOI  EStoi%  ESGFGA%    diff  ESGFGA  normESGFGAper60min
CAR   BRUNO ST. JACQUES     D      297   0.081    0.138   0.057      35               1.577
CHI   BURKE HENRY           D      250   0.065    0.094   0.029      29               1.552
ATL   ILYA KOVALCHUK        F     1153   0.305    0.379   0.074     130               1.509
NYR   REM MURRAY            F      398   0.105    0.155   0.050      44               1.479
EDM   MIKE COMRIE           F      910   0.241    0.328   0.086     100               1.470
CHI   ERIC DAZE             F      722   0.188    0.257   0.070      79               1.464
LA    ADAM DEADMARSH        F      265   0.071    0.106   0.035      29               1.464
COL   PETER FORSBERG        F     1091   0.288    0.419   0.131     119               1.459
BOS   JOE THORNTON          F     1285   0.340    0.433   0.093     140               1.458
ATL   KIRILL SAFRONOV       D      414   0.110    0.131   0.022      45               1.454
TOR   ALEXANDER MOGILNY     F      943   0.258    0.348   0.090     102               1.447
TOR   KAREL PILAR           D      245   0.067    0.089   0.022      26               1.420
BOS   GLEN MURRAY           F     1400   0.371    0.458   0.087     148               1.414
ATL   DANY HEATLEY          F     1164   0.308    0.359   0.051     123               1.414
NYI   ROMAN HAMRLIK         D     1281   0.355    0.460   0.106     134               1.400
PIT   MARIO LEMIEUX         F     1082   0.286    0.387   0.101     113               1.397
MTL   NIKLAS SUNDSTROM      F      384   0.098    0.129   0.031      40               1.394
PHO   LANDON WILSON         F      328   0.089    0.125   0.036      34               1.387
SJ    NICHOLAS DIMITRAKOS   F      244   0.065    0.089   0.023      25               1.371
BOS   MIKE KNUBLE           F     1061   0.281    0.334   0.053     108               1.362
Lowest:
Team  Name                  pos  ESTOI  EStoi%  ESGFGA%    diff  ESGFGA  normESGFGAper60min
WAS   BRIAN SUTHERBY        F      554   0.146    0.096   -0.05      27               0.652
PIT   IAN MORAN             F     1053   0.278    0.175   -0.10      51               0.648
CGY   STEVE BEGIN           F      436   0.118    0.078   -0.04      21               0.644
STL   TYSON NASH            F      532   0.142    0.087   -0.05      25               0.629
TOR   RICHARD JACKMAN       D      535   0.146    0.085   -0.06      25               0.625
FLA   IGOR ULANOV           D      823   0.215    0.146   -0.07      38               0.618
MIN   JASON MARSHALL        F      462   0.119    0.084   -0.04      21               0.608
TB    ALEXANDER SVITOV      F      511   0.132    0.083   -0.05      23               0.602
CAR   JAROSLAV SVOBODA      F      549   0.150    0.095   -0.05      24               0.585
TOR   PAUL HEALEY           F      458   0.125    0.068   -0.06      20               0.584
BUF   VACLAV VARADA         F      525   0.139    0.082   -0.06      22               0.561
CGY   STEVE MONTADOR        D      627   0.170    0.096   -0.07      26               0.555
STL   SHJON PODEIN          F      540   0.144    0.073   -0.07      21               0.520
COL   SERGE AUBIN           F      700   0.185    0.095   -0.09      27               0.516
PIT   IAN MORAN             D     1053   0.278    0.134   -0.14      39               0.496
Biggest differences between real ES%, and ES% estimated by GF+GA:
Team  Name                  pos  ESTOI  EStoi%  ESGFGA%    diff  ESGFGA  normESGFGAper60min
PIT   IAN MORAN             D     1053   0.278    0.134   -0.14      39               0.496
COL   PETER FORSBERG        F     1091   0.288    0.419   0.131     119               1.459
NYI   MATTIAS TIMANDER      D     1159   0.321    0.213   -0.11      62               0.716
NYI   ROMAN HAMRLIK         D     1281   0.355    0.460   0.106     134               1.400
PIT   MARIO LEMIEUX         F     1082   0.286    0.387   0.101     113               1.397
PHO   OSSI VAANANEN         D     1056   0.287    0.192   -0.10      52               0.659
PIT   IAN MORAN             F     1053   0.278    0.175   -0.10      51               0.648
STL   PAVOL DEMITRA         F     1151   0.308    0.406   0.098     116               1.348
TB    VINCENT LECAVALIER    F     1134   0.293    0.390   0.097     108               1.274
BOS   JOE THORNTON          F     1285   0.340    0.433   0.093     140               1.458
WAS   SERGEI GONCHAR        D     1584   0.417    0.509   0.092     143               1.208
TOR   ALEXANDER MOGILNY     F      943   0.258    0.348   0.090     102               1.447
CGY   STEPHANE YELLE        F     1054   0.285    0.200   -0.09      54               0.686
COL   SERGE AUBIN           F      700   0.185    0.095   -0.09      27               0.516
BOS   GLEN MURRAY           F     1400   0.371    0.458   0.087     148               1.414
EDM   MIKE COMRIE           F      910   0.241    0.328   0.086     100               1.470
COL   MILAN HEJDUK          F     1219   0.322    0.405   0.083     115               1.262
TB    MARTIN ST. LOUIS      F     1123   0.290    0.372   0.082     103               1.227
FLA   OLLI JOKINEN          F     1168   0.305    0.385   0.080     100               1.146
BOS   SEAN O'DONNELL        D     1150   0.305    0.229   -0.08      74               0.861
VAN   TODD BERTUZZI         F     1223   0.334    0.413   0.079     117               1.280
DET   MATHIEU DANDENAULT    D     1203   0.314    0.392   0.078     122               1.357
PHO   PAUL MARA             D     1129   0.307    0.384   0.077     104               1.233
FLA   IVAN NOVOSELTSEV      F      960   0.250    0.327   0.077      85               1.185
MIN   MARIAN GABORIK        F     1078   0.278    0.355   0.077      89               1.105
ATL   ILYA KOVALCHUK        F     1153   0.305    0.379   0.074     130               1.509
This makes, in my opinion, GF+GA a bit unreliable when using it to determine ice time. Maybe there are steps involved in the calculations that take care of that, so that it really doesn't matter that much. But I felt a need to point it out.
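For anyone who wants to reproduce columns like the ones in these tables, here is a rough Python sketch of how they could be derived. The function name, the dictionary keys and all input numbers below are hypothetical; the team totals and the league-average rate are not given in the post and would have to come from your own data.

```python
def icetime_vs_goal_events(player_toi, team_toi, player_gfga, team_gfga, league_gfga_per60):
    """Compare a player's real ES ice time share with his share of ES goal events."""
    toi_share = player_toi / team_toi          # like EStoi%
    event_share = player_gfga / team_gfga      # like ESGFGA%
    per60 = player_gfga * 60 / player_toi      # GF+GA per 60 ES minutes
    return {
        "toi_share": toi_share,
        "event_share": event_share,
        "diff": event_share - toi_share,       # like the diff column
        "norm_per60": per60 / league_gfga_per60,
    }


# Illustrative inputs only: 1000 of 4000 team ES minutes, 100 of 300 goal events,
# and an assumed league average of 5.0 GF+GA per 60 minutes.
example = icetime_vs_goal_events(1000, 4000, 100, 300, 5.0)
for key, value in example.items():
    print(f"{key}: {value:.3f}")
```

A positive diff means GF+GA overstates the player's ice time, and a negative diff means it understates it.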
Quote:
Also, Kovalchuk is about the most extreme example of a "high risk, high reward" type player. Watching the guy play it is clear he is going to cause both more goals for and more goals against while on the ice. If there was ever a player who would be an outlier whose GF/GA figures might cause an estimation model to lie, it would be him. With that said, that's no reason to throw out such a model.
Agree about Kovalchuk. But there were lots of other players like him. And plenty of the opposite kind too, i.e. players with "low risk of getting scored on, but also low risk of their opponents being scored on".
Quote:
I have an even more compelling reason to throw out the bottom player - any calculation done on him would be based on an obscenely low number of game situations. It's a poor sample size and is practically meaningless. Of course, if it got further in the season and he was 40-13, then we'd have something to talk about, and whether he was outperforming the 72-50 player would be a worthy question to ask.
I agree. Unfortunately, the lower the numbers, the more "extreme" the results, and the higher the numbers, the less "extreme" the results. The problem is that this rule continues to hold even for high numbers. The 40-13 case will still be slightly more vulnerable than the 72-50 case (I think, though in that case it might be negligible).
Anyway, I don't like dividing GF by GA, because to me it only makes sense when GF+GA is equally big for every player.
(But maybe there is something built into overpass' method that takes care of that.)
I look at it like this:
A: Player being 40-20 actually helped his team improve by +20.
B: Player being 30-15 actually helped his team improve by +15.
Both have the same GF/GA (2.0). But I think what matters is GF-GA.
Let's say B was 30-12 instead, raising his GF/GA to 2.5. That may make him look like a better player when looking at GF/GA. But he would have helped his team improve by +18.
During the whole season, I think a player's goal is to help their team improve by as many goals as possible, by getting as good GF-GA as possible.
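To make the ratio-versus-difference point concrete, here is a small Python sketch applied to the three made-up records above. The helper name `summarize` is only for illustration and is not part of anyone's method in this thread.

```python
def summarize(gf, ga):
    """Return (GF/GA ratio, GF-GA difference) for an on-ice record."""
    return gf / ga, gf - ga


# The made-up records from the post: A, B, and the improved B.
records = {
    "A  (40-20)": (40, 20),
    "B  (30-15)": (30, 15),
    "B' (30-12)": (30, 12),
}

for label, (gf, ga) in records.items():
    ratio, diff = summarize(gf, ga)
    print(f"{label}: GF/GA = {ratio:.2f}, GF-GA = {diff:+d}")
```

The ratio ranks B' first, while the difference still ranks A first, which is the disagreement being described.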
Last edited by plusandminus: 08-23-2011 at 02:21 PM.
I had to edit the part about (and with) the tables in my last post.
(I made a change before posting on the board so that you could see whether a player played L, C or R, but that flawed the results, since some forwards played more than one position, so I had to call all forwards F. If some guy played both D and F, which is rare, the stats for that guy may be unreliable.)
Last edited by plusandminus: 08-23-2011 at 02:34 PM.
plusminus, I'll try to address your concerns about the use of GF/GA ratios.
First, my adjusted plus-minus metric does not use the player's on-ice GF/GA ratio (R-ON) in the calculations. Only the difference is used, as in regular plus-minus.
The R-OFF ratio is used to calculate the team adjustment component, in conjunction with the sum of GA and GF. So the adjustment scales with the number of GF and GA, just as with regular plus-minus.
I really don't think your concerns about the use of ratios apply to this metric. The ratios are useful for a quick overview of the on/off ice relationship of the player and team, but are incomplete without a measure of quantity. Adjusted plus-minus includes a measure of quantity.
Quote:
plusminus, I'll try to address your concerns about the use of GF/GA ratios.
First, my adjusted plus-minus metric does not use the player's on-ice GF/GA ratio (R-ON) in the calculations. Only the difference is used, as in regular plus-minus.
Thanks for clarifying.
Quote:
The R-OFF ratio is used to calculate the team adjustment component, in conjunction with the sum of GA and GF. So the adjustment scales with the number of GF and GA, just as with regular plus-minus.
I think I may understand. I think I perhaps need to try it out myself to be sure.
Quote:
I really don't think your concerns about the use of ratios apply to this metric. The ratios are useful for a quick overview of the on/off ice relationship of the player and team, but are incomplete without a measure of quantity. Adjusted plus-minus includes a measure of quantity.
OK.
Sorry if I have gotten things wrong. I think I need to reproduce what you are doing myself, to be sure I've understood things right.
Quote:
Originally Posted by plusandminus
Thanks for asking. I compare only with his team's totals.
Below is what it looks like. I have only included players with GF+GA >= 20. (Expect even larger differences for the other players.)
I think the raw data I assembled and corrected should be close to 100% correct.
The rightmost column shows GF+GA per 60 minutes. I have divided all those values by league average, to make it easier to compare.
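As a rough sketch of how the columns in the tables below appear to be derived, here is some Python. The function names are made up for illustration, and the team ES ice time and the league-average rate are not listed in the post, so they are left as inputs rather than guessed. Atlanta's team total of 343 ES goals (146 for, 197 against) is taken from the GF/GA figures posted later in the thread.

```python
def toi_share(player_toi, team_toi):
    """EStoi%: the player's share of his team's ES ice time."""
    return player_toi / team_toi


def goal_event_share(player_gfga, team_gfga):
    """ESGFGA%: the player's share of his team's ES goals for plus against."""
    return player_gfga / team_gfga


def share_diff(player_toi, team_toi, player_gfga, team_gfga):
    """diff: goal-event share minus ice-time share."""
    return goal_event_share(player_gfga, team_gfga) - toi_share(player_toi, team_toi)


def normalized_rate(player_gfga, player_toi, league_rate_per60):
    """normESGFGAper60min: GF+GA per 60 ES minutes, divided by league average."""
    return (player_gfga / player_toi * 60) / league_rate_per60


# Kovalchuk: 130 ES goal events in 1153 ES minutes, on a team with 343.
print(round(goal_event_share(130, 343), 3))
```

A player whose goal-event share is well above his ice-time share (positive diff) is exactly the kind of player whose ice time would be overestimated from GF+GA alone.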
Highest:
Team | Name | pos | ESTOI | EStoi% | ESGFGA% | diff | ESGFGA | normESGFGAper60min
CAR | BRUNO ST. JACQUES | D | 297 | 0.081 | 0.138 | 0.057 | 35 | 1.577
CHI | BURKE HENRY | D | 250 | 0.065 | 0.094 | 0.029 | 29 | 1.552
ATL | ILYA KOVALCHUK | F | 1153 | 0.305 | 0.379 | 0.074 | 130 | 1.509
NYR | REM MURRAY | F | 398 | 0.105 | 0.155 | 0.050 | 44 | 1.479
EDM | MIKE COMRIE | F | 910 | 0.241 | 0.328 | 0.086 | 100 | 1.470
CHI | ERIC DAZE | F | 722 | 0.188 | 0.257 | 0.070 | 79 | 1.464
LA | ADAM DEADMARSH | F | 265 | 0.071 | 0.106 | 0.035 | 29 | 1.464
COL | PETER FORSBERG | F | 1091 | 0.288 | 0.419 | 0.131 | 119 | 1.459
BOS | JOE THORNTON | F | 1285 | 0.340 | 0.433 | 0.093 | 140 | 1.458
ATL | KIRILL SAFRONOV | D | 414 | 0.110 | 0.131 | 0.022 | 45 | 1.454
TOR | ALEXANDER MOGILNY | F | 943 | 0.258 | 0.348 | 0.090 | 102 | 1.447
TOR | KAREL PILAR | D | 245 | 0.067 | 0.089 | 0.022 | 26 | 1.420
BOS | GLEN MURRAY | F | 1400 | 0.371 | 0.458 | 0.087 | 148 | 1.414
ATL | DANY HEATLEY | F | 1164 | 0.308 | 0.359 | 0.051 | 123 | 1.414
NYI | ROMAN HAMRLIK | D | 1281 | 0.355 | 0.460 | 0.106 | 134 | 1.400
PIT | MARIO LEMIEUX | F | 1082 | 0.286 | 0.387 | 0.101 | 113 | 1.397
MTL | NIKLAS SUNDSTROM | F | 384 | 0.098 | 0.129 | 0.031 | 40 | 1.394
PHO | LANDON WILSON | F | 328 | 0.089 | 0.125 | 0.036 | 34 | 1.387
SJ | NICHOLAS DIMITRAKOS | F | 244 | 0.065 | 0.089 | 0.023 | 25 | 1.371
BOS | MIKE KNUBLE | F | 1061 | 0.281 | 0.334 | 0.053 | 108 | 1.362
Lowest:
Team | Name | pos | ESTOI | EStoi% | ESGFGA% | diff | ESGFGA | normESGFGAper60min
WAS | BRIAN SUTHERBY | F | 554 | 0.146 | 0.096 | -0.05 | 27 | 0.652
PIT | IAN MORAN | F | 1053 | 0.278 | 0.175 | -0.10 | 51 | 0.648
CGY | STEVE BEGIN | F | 436 | 0.118 | 0.078 | -0.04 | 21 | 0.644
STL | TYSON NASH | F | 532 | 0.142 | 0.087 | -0.05 | 25 | 0.629
TOR | RICHARD JACKMAN | D | 535 | 0.146 | 0.085 | -0.06 | 25 | 0.625
FLA | IGOR ULANOV | D | 823 | 0.215 | 0.146 | -0.07 | 38 | 0.618
MIN | JASON MARSHALL | F | 462 | 0.119 | 0.084 | -0.04 | 21 | 0.608
TB | ALEXANDER SVITOV | F | 511 | 0.132 | 0.083 | -0.05 | 23 | 0.602
CAR | JAROSLAV SVOBODA | F | 549 | 0.150 | 0.095 | -0.05 | 24 | 0.585
TOR | PAUL HEALEY | F | 458 | 0.125 | 0.068 | -0.06 | 20 | 0.584
BUF | VACLAV VARADA | F | 525 | 0.139 | 0.082 | -0.06 | 22 | 0.561
CGY | STEVE MONTADOR | D | 627 | 0.170 | 0.096 | -0.07 | 26 | 0.555
STL | SHJON PODEIN | F | 540 | 0.144 | 0.073 | -0.07 | 21 | 0.520
COL | SERGE AUBIN | F | 700 | 0.185 | 0.095 | -0.09 | 27 | 0.516
PIT | IAN MORAN | D | 1053 | 0.278 | 0.134 | -0.14 | 39 | 0.496
Biggest differences between real ES%, and ES% estimated by GF+GA:
Team | Name | pos | ESTOI | EStoi% | ESGFGA% | diff | ESGFGA | normESGFGAper60min
PIT | IAN MORAN | D | 1053 | 0.278 | 0.134 | -0.14 | 39 | 0.496
COL | PETER FORSBERG | F | 1091 | 0.288 | 0.419 | 0.131 | 119 | 1.459
NYI | MATTIAS TIMANDER | D | 1159 | 0.321 | 0.213 | -0.11 | 62 | 0.716
NYI | ROMAN HAMRLIK | D | 1281 | 0.355 | 0.460 | 0.106 | 134 | 1.400
PIT | MARIO LEMIEUX | F | 1082 | 0.286 | 0.387 | 0.101 | 113 | 1.397
PHO | OSSI VAANANEN | D | 1056 | 0.287 | 0.192 | -0.10 | 52 | 0.659
PIT | IAN MORAN | F | 1053 | 0.278 | 0.175 | -0.10 | 51 | 0.648
STL | PAVOL DEMITRA | F | 1151 | 0.308 | 0.406 | 0.098 | 116 | 1.348
TB | VINCENT LECAVALIER | F | 1134 | 0.293 | 0.390 | 0.097 | 108 | 1.274
BOS | JOE THORNTON | F | 1285 | 0.340 | 0.433 | 0.093 | 140 | 1.458
WAS | SERGEI GONCHAR | D | 1584 | 0.417 | 0.509 | 0.092 | 143 | 1.208
TOR | ALEXANDER MOGILNY | F | 943 | 0.258 | 0.348 | 0.090 | 102 | 1.447
CGY | STEPHANE YELLE | F | 1054 | 0.285 | 0.200 | -0.09 | 54 | 0.686
COL | SERGE AUBIN | F | 700 | 0.185 | 0.095 | -0.09 | 27 | 0.516
BOS | GLEN MURRAY | F | 1400 | 0.371 | 0.458 | 0.087 | 148 | 1.414
EDM | MIKE COMRIE | F | 910 | 0.241 | 0.328 | 0.086 | 100 | 1.470
COL | MILAN HEJDUK | F | 1219 | 0.322 | 0.405 | 0.083 | 115 | 1.262
TB | MARTIN ST. LOUIS | F | 1123 | 0.290 | 0.372 | 0.082 | 103 | 1.227
FLA | OLLI JOKINEN | F | 1168 | 0.305 | 0.385 | 0.080 | 100 | 1.146
BOS | SEAN O'DONNELL | D | 1150 | 0.305 | 0.229 | -0.08 | 74 | 0.861
VAN | TODD BERTUZZI | F | 1223 | 0.334 | 0.413 | 0.079 | 117 | 1.280
DET | MATHIEU DANDENAULT | D | 1203 | 0.314 | 0.392 | 0.078 | 122 | 1.357
PHO | PAUL MARA | D | 1129 | 0.307 | 0.384 | 0.077 | 104 | 1.233
FLA | IVAN NOVOSELTSEV | F | 960 | 0.250 | 0.327 | 0.077 | 85 | 1.185
MIN | MARIAN GABORIK | F | 1078 | 0.278 | 0.355 | 0.077 | 89 | 1.105
ATL | ILYA KOVALCHUK | F | 1153 | 0.305 | 0.379 | 0.074 | 130 | 1.509
This makes, in my opinion, GF+GA a bit unreliable when using it to determine ice time. Maybe there are steps involved in the calculations that take care of that, so that it really doesn't matter that much. But I felt a need to point it out.
To be honest, I think this reveals how effective a model that uses GF/GA can be.
They actually do include an adjustment based on the line or pairing the player was on, because, as you can see, goals are more frequent on a per-minute basis when the higher-minute players are out there. So this will partially mitigate the effect that you're seeing, an effect that is pretty small to begin with.
I mean, if you want to try to approximate icetime across the board from 1967-onwards, be my guest. But can you improve on what exists enough that it is worth the time it would take?
As for Kovalchuk, considering two other Thrashers are in your top-20, he doesn't appear to really be that much of an outlier on his team. He just barely makes the bottom list that you provided, and only because it is based on a "raw" difference and not a percentage difference (if it were based on that, he would not be anywhere near the top).
Quote:
I agree. Unfortunately, the lower the numbers, the more "extreme" the results, and the higher the numbers, the less "extreme" the results. But the problem is that this rule continues to hold even for high numbers. The 40-13 case will still be slightly more vulnerable than the 72-50 case (I think, though in that case it might be negligible).
Anyway, I don't like dividing GF by GA, because to me it only makes sense when GF+GA is equally big for every player.
(But maybe there is something built into overpass' method that takes care of that.)
I look at it like this:
A: Player being 40-20 actually helped his team improve by +20.
B: Player being 30-15 actually helped his team improve by +15.
Both have the same GF/GA (2.0). But I think what matters is GF-GA.
Let's say B was 30-12 instead, raising his GF/GA to 2.5. That may make him look like a better player when looking at GF/GA. But he would have helped his team improve by +18.
During the whole season, I think a player's goal is to help their team improve by as many goals as possible, by getting as good GF-GA as possible.
Keep in mind that generally when we talk about R-ON and R-OFF we are talking about very large sample sizes: seasons at the very least, and generally blocks of seasons, or careers. Like you, I don't see any value in the micro aspect, these 10-20-game segments and single-game scenarios you are providing.
overpass, as usual, did a much better job than I could have done, fuddling my way through descriptions of statistics in "normal people" words and terms.
SFrac: Season Fraction. 1.00 is a full season. I prefer it to games played because it gives a 48 game season, a 74 game season, an 80 game season or an 82 game season the same weight.
$ESGF: Even-strength goals for, normalized to a 200 ESG scoring environment and with estimated SH goals removed.
$ESGA: Even-strength goals against, normalized to a 200 ESG scoring environment and with estimated SH goals removed.
R-ON: Even-strength GF/GA ratio when the player is on the ice.
R-OFF: Even-strength GF/GA ratio when the player is off the ice.
XEV+/-: Expected even-strength plus-minus, which is an estimate of the plus-minus that an average player would post with the same teammates. The calculation is described above.
EV+/-: Even-strength plus-minus, which is simply plus-minus with estimated shorthanded goals removed and normalized to a 200 ESG environment.
AdjEV+/-: Adjusted even-strength plus-minus, which is even-strength plus-minus minus expected even-strength plus-minus. This is the final number.
The following three stats evaluate special teams play and are not related to adjusted plus-minus. I'm including them in the table for a quick reference to the player's contributions outside of even-strength play.
PP%: The % of the team's power play goals for that the player was on the ice for.
SH%: The % of the team's power play goals against that the player was on the ice for.
$PPP/G: Power play points per game, normalized to a 70 PPG environment and with pre-1988 PP assists estimated.
Results
Here are the top 60 in career adjusted even-strength plus-minus, as well as the players in the HOH Top 100 and several others who were strongly considered for voting.
Rk | Player | SFrac | $ESGF/G | $ESGA/G | R-ON | R-OFF | XEV+/- | EV+/- | AdjEV+/- | /Season | PP% | $PPP/G | SH%
1 | Ray Bourque | 20.30 | 1.17 | 0.85 | 1.37 | 0.96 | -62 | 524 | 586 | 29 | 88% | 0.45 | 58%
Sorry, but I'm still a bit confused.
The text says $ESGF and $ESGA are adjusted to 200 ESG scoring environment.
The $ prefix makes me think "adjusted to 200 ESG scoring environment".
The text says EV+/- is also adjusted to 200 ESG scoring environment. ?
If it had been named "$EV+/-", I would have been pretty sure it's $ESGF-$ESGA, because $ tells me that it's "adjusted to 200 ESG scoring environment".
My question is... Is EV+/- = $ESGF-$ESGA ?
The text also says:
Quote:
To calculate the adjusted plus-minus, I take the player’s on-ice total goals for and against as given. I calculate an expected plus-minus for the player, based on his team’s off-ice performance.
"As given", does that mean they are not "adjusted to 200 ESG scoring environment"?
Example, using made up seasonal stats for one player and his team.
Everything is ES only.
"without" or "w" means when player was off the ice.
GD (goal difference) = GF-GA.
Lge aver ESGF per team | teamGD | teamGF | teamGA | playerGD | playerGF | playerGA | withoutGD | withoutGF | withoutGA | pGF/pGA | wGF/wGA
100 | +20 | 60 | 40 | +10 | 24 | 14 | +10 | 36 | 26 | 1.714 | 1.385

Lge aver ESGF per team | $teamGD | $teamGF | $teamGA | $playerGD | $playerGF | $playerGA | $withoutGD | $withoutGF | $withoutGA | pGF/pGA | wGF/wGA
200 | +40 | 120 | 80 | +20 | 48 | 28 | +20 | 72 | 52 | 1.714 | 1.385
Are the $ values above correct? Is that how the "adjusted to 200 ESG scoring environment" values are calculated?
If the above is correct, then how do we use the different variables to calculate the missing columns?
R-On? R-Off?
Can someone write down the formulas for calculating the values, using the variables of the tables above?
And can we write "rOn" and "rOff" instead of "R-On" and "R-Off", to make it easier to understand the formulas? ("-" can otherwise be interpreted as a minus sign. But we don't take R minus On, and R minus Off, right? If so, I'm even more confused.)
Quote:
The expected plus-minus is calculated using the off-ice performance regressed partially to even, as a player should be expected to play somewhat better than a set of bad teammates or worse than a set of good teammates.
I understand the regression thing. But I'm confused about other things, including what exactly to regress (I guess it's "wGF/wGA"). ? Is rOff = "wGF/wGA" regressed to even?
Quote:
I then calculate an actual plus-minus, which differs from official NHL plus-minus in that it is normalized to a scoring environment of 200 even-strength goal per season and does not include shorthanded goals. I subtract the “expected plus-minus” from the “actual plus-minus” to generate an adjusted plus-minus number.
Quote:
Originally Posted by plusandminus
Sorry, but I'm still a bit confused.
The text says $ESGF and $ESGA are adjusted to 200 ESG scoring environment.
The $ prefix makes me think "adjusted to 200 ESG scoring environment".
The text says EV+/- is also adjusted to 200 ESG scoring environment. ?
If it had been named "$EV+/-", I would have been pretty sure it's $ESGF-$ESGA, because $ tells me that it's "adjusted to 200 ESG scoring environment".
My question is... Is EV+/- = $ESGF-$ESGA ?
The text also says:
"As given", does that mean they are not "adjusted to 200 ESG scoring environment"?
All numbers are adjusted to the 200 ESG per team-season scoring environment. Yes, EV+/- is simply $ESGF-$ESGA. "As given" still means scoring-adjusted numbers.
Quote:
Originally Posted by plusandminus
Example, using made up seasonal stats for one player and his team.
Everything is ES only.
"without" or "w" means when player was off the ice.
GD (goal difference) = GF-GA.
Lge aver ESGF per team | teamGD | teamGF | teamGA | playerGD | playerGF | playerGA | withoutGD | withoutGF | withoutGA | pGF/pGA | wGF/wGA
100 | +20 | 60 | 40 | +10 | 24 | 14 | +10 | 36 | 26 | 1.714 | 1.385

Lge aver ESGF per team | $teamGD | $teamGF | $teamGA | $playerGD | $playerGF | $playerGA | $withoutGD | $withoutGF | $withoutGA | pGF/pGA | wGF/wGA
200 | +40 | 120 | 80 | +20 | 48 | 28 | +20 | 72 | 52 | 1.714 | 1.385
Are the $ values above correct? Is that how the "adjusted to 200 ESG scoring environment" values are calculated?
All correct. I don't scoring-level adjust the off-ice numbers, since I only use them to calculate a ratio, but if you did want to adjust for scoring level that would be correct.
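For anyone following along, the scaling confirmed here can be sketched in a few lines of Python. The function name `normalize_to_environment` is invented for this example; the only thing taken from the thread is the idea of scaling goal totals by 200 divided by the league-average ESGF per team.

```python
def normalize_to_environment(goals, league_avg_esgf, target=200):
    """Scale a raw ES goal total to the target scoring environment."""
    return goals * target / league_avg_esgf


# The made-up player from the tables above, in a 100 ESGF-per-team league.
player_gf, player_ga = 24, 14
adj_gf = normalize_to_environment(player_gf, 100)
adj_ga = normalize_to_environment(player_ga, 100)

print(adj_gf, adj_ga, adj_gf - adj_ga)  # scaled GF, GA and plus-minus
print(player_gf / player_ga, adj_gf / adj_ga)  # the ratio is unchanged
```

Because GF and GA are multiplied by the same factor, the GF/GA ratio is unaffected, which is why the off-ice numbers do not need the adjustment when they are only used as a ratio.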
Quote:
Originally Posted by plusandminus
If the above is correct, then how do we use the different variables to calculate the missing columns?
R-On? R-Off?
Can someone write down the formulas for calculating the values, using the variables of the tables above?
And can we write "rOn" and "rOff" instead of "R-On" and "R-Off", to make it easier to understand the formulas? ("-" can otherwise be interpreted as a minus sign. But we don't take R minus On, and R minus Off, right? If so, I'm even more confused.)
R-ON = $ESGF/$ESGA. For single seasons, multiple seasons, or careers.
R-OFF = (TeamESGF-PlayerESGF)/(TeamESGA-PlayerESGA) for a single season. For multiple seasons, take the sum of the XEV+/-, $ESGF, and $ESGA, and calculate by turning around the formula for XEV+/-
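The two single-season ratios can be written as a short Python sketch. The function names `r_on` and `r_off` are just illustrative, and the multi-season case (which works backwards from summed XEV+/-) is not covered here.

```python
def r_on(player_gf, player_ga):
    """On-ice ES goal ratio: goals for divided by goals against."""
    return player_gf / player_ga


def r_off(team_gf, team_ga, player_gf, player_ga):
    """Off-ice ES goal ratio for a single season: team totals minus the player's."""
    return (team_gf - player_gf) / (team_ga - player_ga)


# The made-up example from the tables above: team 60-40, player 24-14.
print(round(r_on(24, 14), 3))
print(round(r_off(60, 40, 24, 14), 3))
```

These correspond to the pGF/pGA and wGF/wGA columns in the example tables.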
I understand the regression thing. But I'm confused about other things, including what exactly to regress (I guess it's "wGF/wGA"). ? Is rOff = "wGF/wGA" regressed to even?
I think I need an example to understand.
Adjusted plus-minus applies the regression to the ratio rOff in the XEV calculation. It's simply rOff^0.65, which regresses rOff toward 1. If I understand your terms correctly, wGF/wGA = rOff as I present it. I have never presented the regressed rOff value alone.
Full example:
Player X has 60 ESGF, 45 ESGA. His team has 140 ESGF, 155 ESGA. League scoring level is 150 ESGA/team.
OK, I post my results for 2002-03 season using overpass' formulas. I have not adjusted to era, since I only look at one season. I used .65 as the "regress to even" number. Let me know if numbers are wrong. (Please don't overfocus on the number of decimals shown. I want them there for verifying purposes.)
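The expected plus-minus formula itself is not written out in this part of the thread. The Python sketch below uses one form that does reproduce the exp+/- and adj+/- columns in the tables that follow, so treat it as an inference from those numbers rather than as a quotation of overpass' formula. The function names are made up, and Forsberg's goal totals (87 GF, 32 GA on the ice; 82 GF, 83 GA off the ice) come from the tables posted later in the thread.

```python
def expected_plus_minus(gf, ga, r_off, exponent=0.65):
    """Plus-minus expected over the player's GF+GA at the regressed off-ice ratio.

    Assumed form: split the player's GF+GA so that GF/GA equals
    r_off ** exponent, then take the difference.
    """
    r = r_off ** exponent  # an exponent below 1 pulls the ratio toward 1
    return (gf + ga) * (r - 1) / (r + 1)


def adjusted_plus_minus(gf, ga, r_off, exponent=0.65):
    """Actual ES plus-minus minus the expected plus-minus."""
    return (gf - ga) - expected_plus_minus(gf, ga, r_off, exponent)


# Forsberg 2002-03: rOff = 82/83.
print(round(expected_plus_minus(87, 32, 82 / 83), 4))
print(round(adjusted_plus_minus(87, 32, 82 / 83), 4))
# exponent=1.0 switches the regression off.
print(round(expected_plus_minus(87, 32, 82 / 83, exponent=1.0), 4))
```

Note how the expectation scales with GF+GA, which is the "measure of quantity" point made earlier in the thread.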
Lowest expected +/-:
Team | Pos | Name | TOIshare | +/- | +/- without player | rOn | rOff | exp+/- | adj+/-
PIT | F | MARIO LEMIEUX | 0.2856 | -15 | -49 | 0.7656 | 0.5702 | -20.4062 | 5.4062
CBJ | D | LUKE RICHARDSON | 0.4065 | -26 | -41 | 0.6709 | 0.6204 | -20.3193 | -5.6806
CBJ | D | JAROSLAV SPACEK | 0.3544 | -19 | -48 | 0.7206 | 0.5966 | -19.4554 | 0.4554
ATL | F | DANY HEATLEY | 0.3079 | -1 | -50 | 0.9839 | 0.6296 | -18.3552 | 17.3552
PIT | F | ALEXEI KOVALEV | 0.2415 | -4 | -60 | 0.9149 | 0.5420 | -17.6831 | 13.6831
PIT | D | DICK TARNSTROM | 0.2631 | -7 | -57 | 0.8600 | 0.5547 | -17.5984 | 10.5984
CBJ | F | DAVID VYBORNY | 0.2859 | 15 | -82 | 1.5172 | 0.4810 | -17.0432 | 32.0432
CBJ | F | GEOFF SANDERSON | 0.2760 | -1 | -66 | 0.9767 | 0.5417 | -16.7163 | 15.7163
PIT | F | MARTIN STRAKA | 0.2428 | -6 | -58 | 0.8723 | 0.5573 | -16.5250 | 10.5250
CAR | D | SEAN HILL | 0.3440 | 5 | -54 | 1.1471 | 0.5385 | -14.4917 | 19.4917
(Comment: Dominated by Pittsburgh and Columbus players.)
Highest expected +/-:
Team | Pos | Name | TOIshare | +/- | +/- without player | rOn | rOff | exp+/- | adj+/-
COL | F | JOE SAKIC | 0.2269 | 9 | 45 | 1.2500 | 1.5696 | 11.7839 | -2.7839
PHI | D | ERIC WEINRICH | 0.3580 | 15 | 30 | 1.3750 | 1.4688 | 11.8073 | 3.1926
PHI | D | KIM JOHNSSON | 0.3605 | 16 | 29 | 1.3902 | 1.4603 | 11.9996 | 4.0004
DAL | D | RICHARD MATVICHUK | 0.2767 | -5 | 57 | 0.8611 | 1.8028 | 12.6784 | -17.6784
COL | D | ROB BLAKE | 0.3652 | 19 | 35 | 1.4524 | 1.4795 | 13.0408 | 5.9591
DAL | D | DERIAN HATCHER | 0.4141 | 27 | 25 | 1.5870 | 1.4098 | 13.2289 | 13.7710
VAN | D | BRENT SOPEL | 0.3782 | -10 | 35 | 0.8246 | 1.4861 | 13.3167 | -23.3168
COL | D | ADAM FOOTE | 0.3842 | 27 | 27 | 1.5510 | 1.4091 | 13.8747 | 13.1252
DAL | D | SERGEI ZUBOV | 0.3819 | 17 | 35 | 1.4048 | 1.5385 | 14.0487 | 2.9512
COL | F | STEVEN REINPRECHT | 0.2496 | -9 | 63 | 0.8125 | 1.9403 | 18.4572 | -27.4572
COL | D | GREG DE VRIES | 0.4051 | 16 | 38 | 1.2759 | 1.6667 | 21.7152 | -5.7152
Comment: Some Colorado, some Philadelphia, some Dallas.
Best adjusted +/-:
Team | Pos | Name | TOIshare | +/- | +/- without player | rOn | rOff | exp+/- | adj+/-
COL | F | PETER FORSBERG | 0.2885 | 55 | -1 | 2.7188 | 0.9880 | -0.4688 | 55.4688
COL | F | MILAN HEJDUK | 0.3223 | 49 | 5 | 2.4848 | 1.0610 | 2.2119 | 46.7881
DET | D | NICKLAS LIDSTROM | 0.3878 | 36 | 1 | 1.7500 | 1.0112 | 0.4793 | 35.5207
LA | F | ZIGMUND PALFFY | 0.3235 | 24 | -28 | 1.6316 | 0.7228 | -10.5125 | 34.5125
PHO | F | LADISLAV NAGY | 0.3002 | 25 | -26 | 1.8929 | 0.7593 | -7.2309 | 32.2309
COL | F | ALEX TANGUAY | 0.3104 | 38 | 16 | 2.1875 | 1.1928 | 5.8372 | 32.1627
CBJ | F | DAVID VYBORNY | 0.2859 | 15 | -82 | 1.5172 | 0.4810 | -17.0432 | 32.0432
DAL | F | JERE LEHTINEN | 0.2803 | 33 | 19 | 2.4348 | 1.2262 | 5.2278 | 27.7722
STL | F | ERIC BOGUNIECKI | 0.2453 | 24 | -10 | 1.8889 | 0.9083 | -2.4386 | 26.4385
BOS | F | MIKE KNUBLE | 0.2811 | 22 | -11 | 1.5116 | 0.9027 | -3.5934 | 25.5934
PHO | F | DAYMOND LANGKOW | 0.3225 | 19 | -20 | 1.5588 | 0.8039 | -6.1608 | 25.1607
MTL | D | ANDREI MARKOV | 0.3408 | 18 | -26 | 1.4865 | 0.7869 | -7.1518 | 25.1517
Comment: Similar to the method I used and posted yesterday (post #184). Forsberg atop here too. Hejduk higher, as is Tanguay, which I think is not so good. Lidstrom higher. Vyborny lower, I think he should be higher, as I think +15 with vs -82 "without" is a huge difference. Palffy, Lehtinen, Boguniecki on my list too.
Martin St. Louis, who was high on my list(s), is missing here. He's 25th here. Perhaps it has to do with how he distributed his ESGF and ESGA game by game.
Best adjusted (if not "regressing R-Off to even"):
Team | Pos | Name | TOIshare | +/- | +/- without player | rOn | rOff | exp+/- | adj+/-
COL | F | PETER FORSBERG | 0.2885 | 55 | -1 | 2.7188 | 0.9880 | -0.7212 | 55.7212
COL | F | MILAN HEJDUK | 0.3223 | 49 | 5 | 2.4848 | 1.0610 | 3.4023 | 45.5976
CBJ | F | DAVID VYBORNY | 0.2859 | 15 | -82 | 1.5172 | 0.4810 | -25.5812 | 40.5812
LA | F | ZIGMUND PALFFY | 0.3235 | 24 | -28 | 1.6316 | 0.7228 | -16.0919 | 40.0919
PHO | F | LADISLAV NAGY | 0.3002 | 25 | -26 | 1.8929 | 0.7593 | -11.0842 | 36.0842
DET | D | NICKLAS LIDSTROM | 0.3878 | 36 | 1 | 1.7500 | 1.0112 | 0.7374 | 35.2626
COL | F | ALEX TANGUAY | 0.3104 | 38 | 16 | 2.1875 | 1.1928 | 8.9670 | 29.0330
MTL | D | ANDREI MARKOV | 0.3408 | 18 | -26 | 1.4865 | 0.7869 | -10.9724 | 28.9724
NYI | D | ROMAN HAMRLIK | 0.3548 | 16 | -15 | 1.2712 | 0.8256 | -12.8025 | 28.8025
PHO | F | DAYMOND LANGKOW | 0.3225 | 19 | -20 | 1.5588 | 0.8039 | -9.4565 | 28.4565
STL | F | ERIC BOGUNIECKI | 0.2453 | 24 | -10 | 1.8889 | 0.9083 | -3.7500 | 27.7500
BOS | F | MIKE KNUBLE | 0.2811 | 22 | -11 | 1.5116 | 0.9027 | -5.5256 | 27.5255
ATL | F | DANY HEATLEY | 0.3079 | -1 | -50 | 0.9839 | 0.6296 | -27.9545 | 26.9545
CAR | D | SEAN HILL | 0.3440 | 5 | -54 | 1.1471 | 0.5385 | -21.9000 | 26.9000
TB | D | DAN BOYLE | 0.3600 | 13 | -22 | 1.2889 | 0.7755 | -13.0229 | 26.0229
Comment: At first glance it looks "better", in that linemates appear a bit more separated. But experience has shown me that regressing "when player off the ice" to even usually gives better results. (By the way, 6 Europeans atop.)
Although I like to compare the results to "my" mentioned method (see post #184), my method is not perfect. I need to find a way to include games where the player played but wasn't on the ice on any ES goals either way. Plus think more about it. Good things with it are that it doesn't need to pay attention to ice time, and doesn't have to adjust to "different GPG in different eras".
Now that I may know how overpass has done the calculations, and have been able to (hopefully) reproduce the results, my impression is that overpass' technique gives good results considering how relatively simple it is.
By simple, I mean that it only depends on ESGF and ESGA (and adjustment for ESGF per season), and that basically the only "tricky" part was the formula to calculate the expected +/-. That formula seems to produce interestingly good results.
(I did however not understand how it was done until it was written down in detail.)
The things I "don't like" about it may mostly be present when looking at single seasons. When aggregating seasons, it might become alright (as players will play on other lines, on teams of different strength, etc.). That is also what overpass has said.
Thanks overpass for explaining more in detail. It currently seems as if most of my "suspicions" regarding your method may have been a bit overstated.
Hopefully I will be able to reproduce czechyourmath's method too... eventually.
Last edited by plusandminus: 08-26-2011 at 06:55 PM.
I have an alternative that might be fairer to players on great teams without using somewhat arbitrary regressions to the mean. It's not exactly comparable to adjusted plus-minus, but it uses much of the same methodology. For lack of a better term, I might call it "even strength value". It has two primary components:
1. Player's share of team success at even strength
2. Player's marginal (additional) success at even strength
Once you calculate each component, simply add them together.
Player's share of team success at ES is calculated as:
where Team Expected ES Win % = (ESGF)^N / (ESGF^N + ESGA^N)
this is the pythagorean win formula; N = 2 (or another number if supported by data)
Player's marginal contribution to team success is calculated as follows:
Subtract the player's ESGF and ESGA from the team's totals.
Recalculate the ES Win % from the new numbers (this is ES Win % without player).
Subtract ES Win % without player from Team Exp. ES Win %.
Multiply the difference in Win % by 82 to yield player's marginal contribution.
Then add player's share of team success ES and player's marginal contribution to team success at ES to get "ES Value" (whatever is proper term). The results are in the same ballpark as plus-minus.
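Here is a minimal Python sketch of the marginal-contribution half of the method, under stated assumptions. The formula for the player's share of team success is not shown in the post, so it is left out rather than guessed. The function names are invented, and the difference is taken as team minus without-player, so that a positive number means the team did better with the player (the same orientation as the Diff column in plusandminus' later table). Colorado's ES totals of 169 GF and 115 GA are the sums of Forsberg's on-ice and off-ice figures from that table.

```python
def pythag_win_pct(gf, ga, n=2):
    """Pythagorean expected win %: GF^N / (GF^N + GA^N)."""
    return gf ** n / (gf ** n + ga ** n)


def marginal_contribution(team_gf, team_ga, player_gf, player_ga, n=2, games=82):
    """Change in expected win % from the player's on-ice goals, scaled to games."""
    with_player = pythag_win_pct(team_gf, team_ga, n)
    without_player = pythag_win_pct(team_gf - player_gf, team_ga - player_ga, n)
    return (with_player - without_player) * games


# Forsberg 2002-03: 87 GF / 32 GA on the ice, team 169 GF / 115 GA at ES.
print(round(pythag_win_pct(169, 115), 6))
print(round(marginal_contribution(169, 115, 87, 32), 2))
```

Passing `games=1` gives the raw difference in win %, which is what the Diff column in the later table reports.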
Is the above still true? Or did you come up with changes to improve further?
Anyway, I tried to calculate according to how I interpreted the instructions, and below are my results for the 2002-03 season.
Columns starting with "x" are "without" player, i.e. teamStat - playerStat. GS = goal sum (GF+GA).
I haven't multiplied anything by 82. Should be OK anyway, right?
You are welcome to check my math for errors and/or misunderstandings.
Team | Pos | Name | GF | GA | GS | xGS | xGF | xGA | +/- | x+/- | playerWin | xWin | teamWin | pmargCont | Win
COL | F | PETER FORSBERG | 87 | 32 | 119 | 109 | 82 | 83 | 55 | -1 | 0.262331 | 0.286398 | 0.683506 | 0.419013 | 0.681344
COL | F | MILAN HEJDUK | 82 | 33 | 115 | 103 | 87 | 82 | 49 | 5 | 0.247891 | 0.276771 | 0.683506 | 0.404928 | 0.652819
DAL | D | DERIAN HATCHER | 73 | 46 | 119 | 79 | 86 | 61 | 27 | 25 | 0.204417 | 0.307920 | 0.688292 | 0.447368 | 0.651785
COL | D | ADAM FOOTE | 76 | 49 | 125 | 81 | 93 | 66 | 27 | 27 | 0.194943 | 0.300838 | 0.683506 | 0.440139 | 0.635082
COL | D | GREG DE VRIES | 74 | 58 | 132 | 70 | 95 | 57 | 16 | 38 | 0.168469 | 0.317685 | 0.683506 | 0.464787 | 0.633256
COL | F | ALEX TANGUAY | 70 | 32 | 102 | 92 | 99 | 83 | 38 | 16 | 0.221417 | 0.245484 | 0.683506 | 0.359154 | 0.580571
WAS | D | SERGEI GONCHAR | 80 | 63 | 143 | 32 | 68 | 70 | 17 | -2 | 0.063001 | 0.281536 | 0.553229 | 0.508895 | 0.571896
DET | D | NICKLAS LIDSTROM | 84 | 48 | 132 | 73 | 90 | 89 | 36 | 1 | 0.144899 | 0.262009 | 0.617310 | 0.424436 | 0.569335
DAL | D | PHILIPPE BOUCHER | 60 | 38 | 98 | 74 | 99 | 69 | 22 | 30 | 0.191479 | 0.253581 | 0.688292 | 0.368420 | 0.559899
DAL | D | SERGEI ZUBOV | 59 | 42 | 101 | 69 | 100 | 65 | 17 | 35 | 0.178541 | 0.261343 | 0.688292 | 0.379697 | 0.558238
PHI | D | KIM JOHNSSON | 57 | 41 | 98 | 61 | 92 | 63 | 16 | 29 | 0.162122 | 0.260459 | 0.672411 | 0.387350 | 0.549472
COL | D | ROB BLAKE | 61 | 42 | 103 | 73 | 108 | 73 | 19 | 35 | 0.175689 | 0.247891 | 0.683506 | 0.362675 | 0.538364
PHI | D | ERIC WEINRICH | 55 | 40 | 95 | 60 | 94 | 64 | 15 | 30 | 0.159465 | 0.252486 | 0.672411 | 0.375493 | 0.534958
DAL | F | MIKE MODANO | 55 | 30 | 85 | 77 | 104 | 77 | 25 | 27 | 0.199242 | 0.219942 | 0.688292 | 0.319547 | 0.518789
DAL | F | JERE LEHTINEN | 56 | 23 | 79 | 85 | 103 | 84 | 33 | 19 | 0.219942 | 0.204417 | 0.688292 | 0.296991 | 0.516933
PHI | D | ERIC DESJARDINS | 54 | 29 | 83 | 70 | 95 | 75 | 25 | 20 | 0.186042 | 0.220593 | 0.672411 | 0.328062 | 0.514104
OTT | D | WADE REDDEN | 62 | 44 | 106 | 60 | 101 | 77 | 18 | 24 | 0.136208 | 0.240635 | 0.644722 | 0.373238 | 0.509446
BOS | F | GLEN MURRAY | 83 | 65 | 148 | 29 | 84 | 91 | 18 | -7 | 0.047945 | 0.244688 | 0.534016 | 0.458203 | 0.506148
DET | D | MATHIEU DANDENAULT | 70 | 52 | 122 | 55 | 104 | 85 | 18 | 19 | 0.109170 | 0.242160 | 0.617310 | 0.392282 | 0.501452
COL | D | DEREK MORRIS | 55 | 34 | 89 | 75 | 114 | 81 | 21 | 33 | 0.180503 | 0.214197 | 0.683506 | 0.313379 | 0.493882
Forsberg atop here too. But I think there is far too much Colorado dominance at the top. Basically, it seems to list the players with the highest ESGF+ESGA on their teams.
We see some familiar names from the other two methods, like Hejduk, Lehtinen, Tanguay, but also many new.
Have I missed something in my calculations?
While I think "my" method and overpass' method ended up with quite similar results, I think this method gives the most "different" results. That does not necessarily have to be bad, but looking at the table it does not seem to care much about "how good" the player played. Guys like Foote and De Vries don't look special +/- wise when comparing them to how Colorado did when they were off the ice.
The list is very much dominated by defencemen.
The only forwards on the list are: Forsberg-Hejduk-Tanguay, Modano-Lehtinen and G. Murray. Further down, the next forwards include Bertuzzi-Naslund-Morrison (with a few other forwards in between them), all close to each other.
Shouldn't there be some consideration paid to GF-GA, or GF/(GF+GA), or even GF/GA?
Maybe I've missed something?
Edit: By the way, some guys ended up with slightly negative numbers. Is that OK?
Worst:
Team | Pos | Name | GF | GA | GS | xGS | xGF | xGA | +/- | x+/- | playerWin | xWin | teamWin | pmargCont | Win
CBJ | F | KENT MCDONELL | 0 | 1 | 1 | -68 | 120 | 186 | -1 | -66 | -0.064606 | 0.000950 | 0.291681 | 0.003256 | -0.061350
CBJ | F | MATHIEU DARCHE | 0 | 1 | 1 | -68 | 120 | 186 | -1 | -66 | -0.064606 | 0.000950 | 0.291681 | 0.003256 | -0.061350
Edit:
I experimented a bit more.
teamWin = team win formula
xWin = applying the win formula but with "without" stats instead of team stats. ("Without" = team - player.)
Then the difference between the two.
playerWin = applying the win formula but with player stats instead of team stats. Gives strange results for players with low numbers.
The results below look far "better" than the ones above.
One thing I suspect is still missing is something more to add to it. I think the table below shows how much "difference" the player made, but I think something more might need to be added. (Perhaps something to do with (playerGF+playerGA) / (teamGF+teamGA)?) I'm very tired now, but will probably continue tomorrow.
Team | Pos | Name | GF | GA | GS | xGS | xGF | xGA | +/- | x+/- | teamWin | xWin | Diff | Diff% | playerWin
COL | F | PETER FORSBERG | 87 | 32 | 119 | 109 | 82 | 83 | 55 | -1 | 0.683506 | 0.493939 | 0.189567 | 1.383786 | 0.880833
COL | F | MILAN HEJDUK | 82 | 33 | 115 | 103 | 87 | 82 | 49 | 5 | 0.683506 | 0.529559 | 0.153947 | 1.290707 | 0.860616
LA | F | ZIGMUND PALFFY | 62 | 38 | 100 | 20 | 73 | 101 | 24 | -28 | 0.485404 | 0.343142 | 0.142262 | 1.414586 | 0.726928
PHO | F | LADISLAV NAGY | 53 | 28 | 81 | 24 | 82 | 108 | 25 | -26 | 0.496310 | 0.365673 | 0.130637 | 1.357250 | 0.781797
DET | D | NICKLAS LIDSTROM | 84 | 48 | 132 | 73 | 90 | 89 | 36 | 1 | 0.617310 | 0.505586 | 0.111724 | 1.220979 | 0.753846
CBJ | F | DAVID VYBORNY | 44 | 29 | 73 | -52 | 76 | 158 | 15 | -82 | 0.291681 | 0.187898 | 0.103783 | 1.552336 | 0.697155
PHO | F | DAYMOND LANGKOW | 53 | 34 | 87 | 18 | 82 | 102 | 19 | -20 | 0.496310 | 0.392573 | 0.103737 | 1.264248 | 0.708448
NAS | D | JASON YORK | 49 | 35 | 84 | 4 | 72 | 96 | 14 | -24 | 0.460379 | 0.360000 | 0.100379 | 1.278830 | 0.662162
NYI | D | ROMAN HAMRLIK | 75 | 59 | 134 | 17 | 71 | 86 | 16 | -15 | 0.503436 | 0.405322 | 0.098114 | 1.242064 | 0.617724
STL | F | ERIC BOGUNIECKI | 51 | 27 | 78 | 38 | 99 | 109 | 24 | -10 | 0.548834 | 0.452033 | 0.096801 | 1.214145 | 0.781081
COL | F | ALEX TANGUAY | 70 | 32 | 102 | 92 | 99 | 83 | 38 | 16 | 0.683506 | 0.587237 | 0.096269 | 1.163935 | 0.827143
TB | D | DAN BOYLE | 58 | 45 | 103 | 4 | 76 | 98 | 13 | -22 | 0.467543 | 0.375552 | 0.091991 | 1.244948 | 0.624234
MTL | D | ANDREI MARKOV | 55 | 37 | 92 | 10 | 96 | 122 | 18 | -26 | 0.474210 | 0.382406 | 0.091804 | 1.240069 | 0.688438
MIN | F | PASCAL DUPUIS | 46 | 30 | 76 | 17 | 80 | 95 | 16 | -15 | 0.503984 | 0.414910 | 0.089074 | 1.214682 | 0.701591
CAR | D | SEAN HILL | 39 | 34 | 73 | -44 | 63 | 117 | 5 | -54 | 0.313326 | 0.224770 | 0.088556 | 1.393984 | 0.568173
LA | F | ALEXANDER FROLOV | 49 | 33 | 82 | 12 | 86 | 106 | 16 | -20 | 0.485404 | 0.396951 | 0.088453 | 1.222831 | 0.687965
DAL | F | JERE LEHTINEN | 56 | 23 | 79 | 85 | 103 | 84 | 33 | 19 | 0.688292 | 0.600566 | 0.087726 | 1.146072 | 0.855661
BOS | F | MIKE KNUBLE | 65 | 43 | 108 | 33 | 102 | 113 | 22 | -11 | 0.534016 | 0.448970 | 0.085046 | 1.189424 | 0.695587
TB | F | MARTIN ST. LOUIS | 57 | 46 | 103 | 2 | 77 | 97 | 11 | -20 | 0.467543 | 0.386556 | 0.080987 | 1.209509 | 0.605591
STL | D | AL MACINNIS | 69 | 50 | 119 | 33 | 81 | 86 | 19 | -5 | 0.548834 | 0.470086 | 0.078748 | 1.167518 | 0.655694
The results look much more similar to the other methods (those "by overpass" and "by me").
Dividing instead gives different results, see below. But those above are "better", right?
Team | Pos | Name | GF | GA | GS | xGS | xGF | xGA | +/- | x+/- | teamWin | xWin | Diff | Diff% | playerWin
CBJ | F | DAVID VYBORNY | 44 | 29 | 73 | -52 | 76 | 158 | 15 | -82 | 0.291681 | 0.187898 | 0.103783 | 1.552336 | 0.697155
LA | F | ZIGMUND PALFFY | 62 | 38 | 100 | 20 | 73 | 101 | 24 | -28 | 0.485404 | 0.343142 | 0.142262 | 1.414586 | 0.726928
CAR | D | SEAN HILL | 39 | 34 | 73 | -44 | 63 | 117 | 5 | -54 | 0.313326 | 0.224770 | 0.088556 | 1.393984 | 0.568173
COL | F | PETER FORSBERG | 87 | 32 | 119 | 109 | 82 | 83 | 55 | -1 | 0.683506 | 0.493939 | 0.189567 | 1.383786 | 0.880833
PHO | F | LADISLAV NAGY | 53 | 28 | 81 | 24 | 82 | 108 | 25 | -26 | 0.496310 | 0.365673 | 0.130637 | 1.357250 | 0.781797
COL | F | MILAN HEJDUK | 82 | 33 | 115 | 103 | 87 | 82 | 49 | 5 | 0.683506 | 0.529559 | 0.153947 | 1.290707 | 0.860616
CBJ | F | GEOFF SANDERSON | 42 | 43 | 85 | -68 | 78 | 144 | -1 | -66 | 0.291681 | 0.226845 | 0.064836 | 1.285816 | 0.488236
PIT | F | ALEXEI KOVALEV | 43 | 47 | 90 | -68 | 71 | 131 | -4 | -60 | 0.290868 | 0.227051 | 0.063817 | 1.281069 | 0.455643
NAS | D | JASON YORK | 49 | 35 | 84 | 4 | 72 | 96 | 14 | -24 | 0.460379 | 0.360000 | 0.100379 | 1.278830 | 0.662162
FLA | D | ANDREAS LILJA | 36 | 29 | 65 | -31 | 75 | 120 | 7 | -45 | 0.356902 | 0.280898 | 0.076004 | 1.270575 | 0.606457
PHO | F | DAYMOND LANGKOW | 53 | 34 | 87 | 18 | 82 | 102 | 19 | -20 | 0.496310 | 0.392573 | 0.103737 | 1.264248 | 0.708448
ATL | F | DANY HEATLEY | 61 | 62 | 123 | -52 | 85 | 135 | -1 | -50 | 0.354528 | 0.283889 | 0.070639 | 1.248826 | 0.491870
Last edited by plusandminus: 08-24-2011 at 05:47 PM.
Reason: adding more text
This is amazing stuff, very insightful! Do you have / are you willing to share the year-to-year spreadsheets? A friend of mine and I are trying to rank the best players since 1990, and this information would be very useful.
Last edited by Sixbladeknife: 10-01-2012 at 09:37 AM.
Reason: Typo
Quote:
This is amazing stuff, very insightful! Do you have / are you willing to share the year-to-year spreadsheets? A friend of mine and I are trying to rank the best players since 1990, and this information would be very useful.
I don't fully endorse the single-season plus-minus ratings as significant. There's still a lot of random variation at the season level. But I do find it useful for looking at groups of seasons like, say, Gretzky's Edmonton years compared to his LA years. Or for looking at peak seasons (over several years) for any player.