HFBoards

By The Numbers Hockey Analytics... the Final Frontier. Explore strange new worlds, to seek out new algorithms, to boldly go where no one has gone before.

Adjusted stats - how valuable?

09-25-2012, 04:23 AM
  #51
habsfanatics*
 
Join Date: May 2012
Posts: 4,984
vCash: 500
I never really liked adjustment stats, for the reason Dalton suggests and many others. I believe the outliers definitely throw the use of means into a grey area.

If it gets tougher to score across the league, does that hurt Wayne/Mario in the same way it hurts a Craig Simpson or others? I don't think so. My problem isn't the stats themselves, it's the mischaracterization and misrepresentation I've seen on these boards.

Iain says it doesn't happen, but it most certainly does. I've debated in the "How many points would Gretzky score today" threads, and posters have told me he would score precisely 158 points or whatever the number was, because there was a formula for that, LOL. Not only that, but certain posters use these numbers in every thread as their only source of reasoning. I won't mention his name, but one poster in particular uses them without understanding them at all.

For an adjustment that is supposed to be used for putting things in context, it often lacks context itself.

My opinion.

09-25-2012, 08:12 AM
  #52
ssh
Registered User
 
Join Date: May 2008
Posts: 94
vCash: 500
Quote:
Originally Posted by habsfanatics View Post
If it gets tougher to score across the league, does that hurt Wayne/Mario in the same way it hurts a Craig Simpson or others? I don't think so.
Of course every player is affected differently, depending on how effective their individual abilities are in a different environment, i.e. team strategies/player usage, goaltending (size, skills/technique, equipment etc.), opposing skaters (size, physical abilities/conditioning, skills etc.), teammates and so on.

It would be silly to argue, though, that any player, including Gretzky, Lemieux and Orr couldn't be slowed down at all by any improvements in team defenses.

Quote:
I've debated in the "How many points would Gretzky score today" threads, and posters have told me he would score precisely 158 points or whatever the number was, because there was a formula for that, LOL.
Many people also give estimates, both higher and lower, without any kind of reasoning beyond "look at how many points he scored!" and "Gretzky was small and weak and players are so much better now".

I do agree that many people mischaracterize "adjusted" stats.

09-25-2012, 11:08 AM
  #53
Iain Fyffe
Hockey fact-checker
 
Iain Fyffe's Avatar
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Quote:
Originally Posted by habsfanatics View Post
My problem isn't the stats themselves, it's the mischaracterization and misrepresentation I've seen on these boards.

Iain says it doesn't happen, but it most certainly does.
No, that's not what I said. (So, ironically, you've mischaracterized my posts.)

I said that's not what adjusted stats mean, and therefore agree that it's a mischaracterization.

But when someone mischaracterizes the results of something, do you blame the something, or do you blame the mischaracterizer? If someone doesn't understand adjusted stats, they shouldn't be using them. But that's not the fault of the adjustment.


Last edited by Iain Fyffe: 09-25-2012 at 11:16 AM.
09-28-2012, 10:56 AM
  #54
Dalton
Registered User
 
Dalton's Avatar
 
Join Date: Aug 2009
Location: Ho Chi Minh City
Country: Vietnam
Posts: 2,096
vCash: 500
Quote:
Originally Posted by Iain Fyffe View Post
With respect to adjusted scoring being mean-based, I don't think that's actually accurate. References to bell curves are out of place as well. Adjusted scoring does not care about the distribution of goals among players. If it's a bell curve or a power-law curve, the same adjustment will be applied.

For instance:



Here the actual results are in blue. The red is the curve transformed by traditional adjustment (assuming the scoring level is adjusted downward). The purple curve is what should happen with percentile-based adjustment, where the best players are less affected than other players.

Adjusted scoring does not assume a bell curve, and it doesn't use a bell curve. Its adjustments are distribution-neutral.
I am not claiming that I have the method and nothing else works. You and others have made reference to bad methods. I still have no idea what particular method(s) you are using to calculate the adjusted data on your graph. I have no idea what you used to create your curves. Well I don't know explicitly.

I have said that the study only applies when one uses means in their calculations. If you aren't then we have no debate. If you are arguing for means then you need to take it up with the study's authors. They have shown that using means leads to errors of several magnitudes beyond not using means. They used hockey data for one case. Prove that wrong and be somewhat famous.

It appears that your adjusted curve has a different rate of change than the raw data or percentile data. I'm reminded of the differences between predictions of planetary positions from Kepler (hand breadth) to Newton (finger tip) to Einstein (equipment doesn't exist). The adjusted data looks out of place on the graph. Perhaps if you took account of the outliers your curve would better mirror the other two curves. The real data in particular. Maybe that's an optical illusion because you haven't labelled your axes or even defined them.

Not good math.

Without knowing how you calculated your curves I can only make the following comments.

The fact the original data fits a power curve explains why the adjusted data does. It doesn't prove anything about using means.

The fact your adjusted curve converges towards the real data and then diverges from it seems to suggest a 'means' effect. You have a bell curve added to a power curve?

It seems that using percentages to mine data has led to some observations comparing the two seasons. Revealing different strategies, roles or reflecting injuries?

Not labelling your graph leaves it open to many interpretations. I'm reminded of the 'propensity to spend' graph from my lone economics course many years ago. But at least the axes were labelled.

I'll try to add a third season tomorrow (my time) and see what pops up.

In no way am I trying to say I've come up with Shakespeare. Just a different POV that I think maintains the integrity of the raw data. Normalization fails. Read the study. I'm just suggesting a way to avoid it.

Looking at other factors such as TOI, gp or position may be more revealing but let's focus on one variable at a time at first.

09-28-2012, 11:13 AM
  #55
TheDevilMadeMe
Global Moderator
 
TheDevilMadeMe's Avatar
 
Join Date: Aug 2006
Location: Brooklyn
Country: United States
Posts: 43,222
vCash: 500
Quote:
Originally Posted by Dalton View Post
I am not claiming that I have the method and nothing else works. You and others have made reference to bad methods. I still have no idea what particular method(s) you are using to calculate the adjusted data on your graph. I have no idea what you used to create your curves. Well I don't know explicitly.
I assume his curve is just a graph of all the point totals of all the players in the NHL, and the curves next to it show what the graph would look like if you use the "adjustment" methods discussed in this thread.

Edit: Actually I don't know the source of the curve; he doesn't really say. I think his point was just to show that if you apply adjustments to a power curve, it doesn't change the shape of the curve.

Quote:
I have said that the study only applies when one uses means in their calculations. If you aren't then we have no debate. If you are arguing for means then you need to take it up with the study's authors. They have shown that using means leads to errors of several magnitudes beyond not using means. They used hockey data for one case. Prove that wrong and be somewhat famous.
Here is how traditional adjusted stats are calculated:
http://www.hockey-reference.com/abou...ted_stats.html


Last edited by TheDevilMadeMe: 09-28-2012 at 11:21 AM.
09-28-2012, 01:01 PM
  #56
Dalton
Registered User
 
Dalton's Avatar
 
Join Date: Aug 2009
Location: Ho Chi Minh City
Country: Vietnam
Posts: 2,096
vCash: 500
Quote:
Originally Posted by TheDevilMadeMe View Post
I assume his curve is just a graph of all the point totals of all the players in the NHL, and the curves next to it show what the graph would look like if you use the "adjustment" methods discussed in this thread.

Edit: Actually I don't know the source of the curve; he doesn't really say. I think his point was just to show that if you apply adjustments to a power curve, it doesn't change the shape of the curve.



Here is how traditional adjusted stats are calculated:
http://www.hockey-reference.com/abou...ted_stats.html
Thank you. I have seen that. It's what set me off.

I don't like the 82/70. I prefer the fraction of the league's goals he scored. I like to compare that to his peers in increasing groups. I think that better reflects his performance. I don't have stats for that year. I'll look for them and make a comparison. I'll put them up here.

I think using fractions does away with the very, very dubious roster adjustment and the demonstrably failed normalization to 6 gpg.

I dl'd 2009/10 and 2008/09

Season    Total goals   Top 5%   Top 10%   10-20%   20-30%   30-40%
2008/09   7006          0.212    0.375     0.231    0.158    0.099
2009/10   6803          0.215    0.366     0.232    0.154    0.107
2010/11   6721          0.214    0.368     0.232    0.160    0.103
2011/12   6542          0.229    0.390     0.238    0.150    0.092

I looked at gp as a fraction as well since someone brought it up with good reasoning.

GP as pct of total.
Season    Total GP      Top 5%   Top 10%   10-20%   20-30%   30-40%
2008/09   44258         0.076    0.157     0.149    0.142    0.130
2009/10   44271         0.078    0.154     0.150    0.142    0.128
2010/11   44262         0.079    0.155     0.152    0.143    0.137
2011/12   44268         0.082    0.160     0.154    0.143    0.134
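
Banded shares like the ones in the two tables above can be computed with a short script. This is only a sketch: it assumes a list of per-player records with `goals` and `gp` fields (hypothetical names), and it assumes the bands are formed by ranking players on goals, since the exact grouping rules behind the tables are not stated. The 20-player league is made up.

```python
def band_shares(players, stat, bands=((0, 5), (0, 10), (10, 20), (20, 30), (30, 40))):
    """Share of the league total of `stat` held by each percentile band.

    Players are ranked by goals, highest first. Each band is a (low, high)
    pair of percentages of the ranked list, so (0, 5) is the top 5% of
    scorers and (10, 20) is the next tier down.
    """
    ranked = sorted(players, key=lambda p: p["goals"], reverse=True)
    n = len(ranked)
    total = sum(p[stat] for p in ranked)
    shares = {}
    for low, high in bands:
        # Integer arithmetic avoids float rounding at the band edges.
        start, end = n * low // 100, n * high // 100
        shares[(low, high)] = sum(p[stat] for p in ranked[start:end]) / total
    return shares


# Hypothetical 20-player league, for illustration only.
league = [{"goals": g, "gp": 80} for g in (40, 30, 20, 10)]
league += [{"goals": 0, "gp": 80} for _ in range(16)]

goal_shares = band_shares(league, "goals")
gp_shares = band_shares(league, "gp")
```

Passing `"gp"` instead of `"goals"` gives the games-played share for the same goal-ranked groups, which is how the second table appears to be built.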

I see decreasing goals scored league wide with increasing games played by the top 5% and 10-20% groupings. The 20-30% group has changed little, while the 30-40% group seems erratic.

GS: I see last year's increase among the top 5%, top 10% and 10-20% groups of scorers. The 20-30% and 30-40% groups were down last year.

There is some correlation between GP and GS, but it's not a linearly proportional increase among groupings of 10%.

What I really see is that using league gpg to calculate a guy's value is just too simplistic. There are variations in the distributions of goals amongst the 10% groups I've chosen. Surely there are variations amongst the members. I've tried to allude to that by including the 5% group.

I don't think 65 goals is a true measure of Howe's value in the example of that site. I think we have to compare him to his peers. I think his true value is a curve not a point and it depends on how fine you wish to be. He scored a certain percentage of goals amongst the top 5, 10 or 20% of his peers in his day. I think his value in today's game is better determined by deciding what size group you're comparing him to. I think the size needs to be small enough to attempt to maximize outliers. Perhaps we might even say that among those who've scored say 25% of all the goals Howe has a value of...

This isn't my method or the method. I'm just using fractions. It's just math. I think it works better.



Edit: Reflecting on this makes me think that we should be talking in terms of probabilities rather than constants. I added 2008/09 season and so had to edit my comments. Better you draw your own conclusions.


Last edited by Dalton: 09-28-2012 at 02:16 PM.
09-28-2012, 03:05 PM
  #57
Trebek
Mod Supervisor
 
Trebek's Avatar
 
Join Date: Sep 2005
Posts: 2,945
vCash: 500
Quote:
Originally Posted by Dalton View Post
Edit: Reflecting on this makes me think that we should be talking in terms of probabilities rather than constants.
Can you tie this statement in with everything? I'm not following.

09-28-2012, 03:23 PM
  #58
Iain Fyffe
Hockey fact-checker
 
Iain Fyffe's Avatar
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Quote:
Originally Posted by Dalton View Post
I am not claiming that I have the method and nothing else works. You and others have made reference to bad methods. I still have no idea what particular method(s) you are using to calculate the adjusted data on your graph. I have no idea what you used to create your curves. Well I don't know explicitly.
They're for illustrative purposes only. It shows hypothetical raw numbers that form a power-law curve, then the transformation that a traditional adjustment (adjusting points based on the ratio of the league goals per game to some arbitrary number) creates, and also the transformation that a percentage adjustment, as you suggest, would presumably result in.

The point is, you criticize traditional adjusted scoring for being a mean-based analysis, but it's not. Not any more than your percentage adjustment would be, so far as I can tell.

Quote:
Originally Posted by Dalton View Post
I have said that the study only applies when one uses means in their calculations. If you aren't then we have no debate. If you are arguing for means then you need to take it up with the study's authors. They have shown that using means leads to errors of several magnitudes beyond not using means. They used hockey data for one case. Prove that wrong and be somewhat famous.
Look at the image on the first page of that study you linked to:



They're arguing that if an analysis of hockey players assumes that performance clusters around a mean (ie, has a normal distribution or a bell curve), then it will be flawed.

Using a mean to make a calculation is not the same as assuming the results cluster around said mean. As my graph was intended to illustrate, adjusted scoring makes no such assumption. The results of adjusted scoring follow a power-law curve, not a normal distribution. Therefore this study is not relevant, because no normal distribution is assumed.

Do you see the difference now?

Quote:
Originally Posted by Dalton View Post
Maybe that's an optical illusion because you haven't labelled your axes or even defined them.
Or maybe it was a quick illustration, and not a calculation.

Quote:
Originally Posted by Dalton View Post
The fact the original data fits a power curve explains why the adjusted data does. It doesn't prove anything about using means.
This suggests you don't understand the studies you're linking to, or that you don't understand adjusted scoring. Adjusted scoring does not assume that performance clusters around a mean.

If you ran an adjusted scoring study that resulted in a normal distribution of adjusted points, you would have a problem, as the study points out. But adjusted scoring does not result in a normal distribution.

Quote:
Originally Posted by Dalton View Post
In no way am I trying to say I've come up with Shakespeare. Just a different POV that I think maintains the integrity of the raw data. Normalization fails. Read the study. I'm just suggesting a way to avoid it.
Adjusted scoring does not normalize. It transforms. Normalizing suggests, essentially, regressing scoring totals to the mean, to make the curve closer to a normal distribution. That's not what adjusted scoring does. It applies essentially the same multiplier to everyone's scoring totals. Look at HR's explanation. Every player gets the same games played adjustment, the same roster size adjustment. The only small difference is that it removes the individual's totals in calculating the league average, which actually means that the better players are adjusted less than other players (on a downward adjustment), just as you suggest they should be.
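
A minimal sketch of that last point, assuming the era factor is a target goals-per-game figure divided by the league goals per game with the player's own goals removed. The function name and every number below are hypothetical, and this is not claimed to be HR's exact formula.

```python
def era_multiplier(player_goals, league_goals, league_games, target_gpg):
    """Era factor with the player's own goals left out of the league average."""
    gpg_without_player = (league_goals - player_goals) / league_games
    return target_gpg / gpg_without_player


# Hypothetical season: 1000 league goals over 200 games (5.0 per game),
# adjusted downward to a 4.0 goals-per-game environment.
star = era_multiplier(50, 1000, 200, 4.0)     # 4.0 / 4.750
grinder = era_multiplier(5, 1000, 200, 4.0)   # 4.0 / 4.975
```

Both factors come out below 1, but the 50-goal scorer's is the larger of the two, so his total is cut by a smaller fraction than the 5-goal scorer's.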

Read the study, as you say. The study criticizes the tendency to assume a normal distribution, and the habit of making adjustments in order to better fit that assumed curve, such as ignoring outliers. But adjusted scoring does not do this, and thus this criticism is moot.

Also, read the method for adjusted scoring. You're criticizing it for things that it does not do.

09-28-2012, 03:25 PM
  #59
Iain Fyffe
Hockey fact-checker
 
Iain Fyffe's Avatar
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Quote:
Originally Posted by TheDevilMadeMe View Post
Edit: Actually I don't know the source of the curve; he doesn't really say. I think his point was just to show that if you apply adjustments to a power curve, it doesn't change the shape of the curve.
This is exactly correct. It's an illustration intended to show the effects of adjustment on the curve of NHL player point totals.

Adjusted scoring does not normalize the curve.

09-28-2012, 03:33 PM
  #60
Iain Fyffe
Hockey fact-checker
 
Iain Fyffe's Avatar
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Quote:
Originally Posted by Dalton View Post
I don't like the 82/70. I prefer the fraction of the league's goals he scored. I like to compare that to his peers in increasing groups. I think that better reflects his performance. I don't have stats for that year. I'll look for them and make a comparison. I'll put them up here.
In 1952/53, Gordie Howe scored 4.87% of all goals in the NHL. In 2011/12, Steven Stamkos scored 0.89% of all goals in the NHL. Where do we go from there?

Quote:
Originally Posted by Dalton View Post
I think using fractions does away with the very, very dubious roster adjustment and the demonstrably failed normalization to 6 gpg.
The roster adjustment is just a scaling factor, and as I explained, there is no normalization.

Quote:
Originally Posted by Dalton View Post
I think his true value is a curve not a point and it depends on how fine you wish to be. He scored a certain percentage of goals amongst the top 5, 10 or 20% of his peers in his day. I think his value in today's game is better determined by deciding what size group you're comparing him to. I think the size needs to be small enough to attempt to maximize outliers. Perhaps we might even say that among those who've scored say 25% of all the goals Howe has a value of...
I'm going to suggest you try this paragraph again. I have no idea what you're trying to say.

09-30-2012, 08:56 AM
  #61
Dalton
Registered User
 
Dalton's Avatar
 
Join Date: Aug 2009
Location: Ho Chi Minh City
Country: Vietnam
Posts: 2,096
vCash: 500
Quote:
Originally Posted by Taco MacArthur View Post
Can you tie this statement in with everything? I'm not following.
Not really. It was just a notion I had after re-reading everything.

I would think that since we can't specifically state how many goals a player would score that we might be able to state a range he might score within.

If pressed I would offer that the boundaries would be based on pct of goals scored against everyone and the pct of goals scored based on a very small sample like the top 5%. I don't see this as a proper representation of probability but it should suffice for a pub conversation.

I think it's worth exploring so I mentioned it.

I would expect that using probabilities would be the eventual outcome of the best methods to compare players or gs value for example. 0.001% chance of scoring 100 goals today, 5% of 60 gs, 40% of 50 gs, 80% of 40 goals, 85% of 30gs, 90% of 20gs, 0.001% of 0 goals barring injury. Graphically. Systems of equations. Differential equations. Not pub math lol.


Last edited by Dalton: 09-30-2012 at 11:09 AM.
09-30-2012, 10:47 AM
  #62
Dalton
Registered User
 
Dalton's Avatar
 
Join Date: Aug 2009
Location: Ho Chi Minh City
Country: Vietnam
Posts: 2,096
vCash: 500
Quote:
Originally Posted by Iain Fyffe View Post
They're for illustrative purposes only. It shows hypothetical raw numbers that form a power-law curve, then the transformation that a traditional adjustment (adjusting points based on the ratio of the league goals per game to some arbitrary number) creates, and also the transformation that a percentage adjustment, as you suggest, would presumably result in.

The point is, you criticize traditional adjusted scoring for being a mean-based analysis, but it's not. Not any more than your percentage adjustment would be, so far as I can tell.


Look at the image on the first page of that study you linked to:



They're arguing that if an analysis of hockey players assumes that performance clusters around a mean (ie, has a normal distribution or a bell curve), then it will be flawed.

Using a mean to make a calculation is not the same as assuming the results cluster around said mean. As my graph was intended to illustrate, adjusted scoring makes no such assumption. The results of adjusted scoring follow a power-law curve, not a normal distribution. Therefore this study is not relevant, because no normal distribution is assumed.

Do you see the difference now?


Or maybe it was a quick illustration, and not a calculation.


This suggests you don't understand the studies you're linking to, or that you don't understand adjusted scoring. Adjusted scoring does not assume that performance clusters around a mean.

If you ran an adjusted scoring study that resulted in a normal distribution of adjusted points, you would have a problem, as the study points out. But adjusted scoring does not result in a normal distribution.


Adjusted scoring does not normalize. It transforms. Normalizing suggests, essentially, regressing scoring totals to the mean, to make the curve closer to a normal distribution. That's not what adjusted scoring does. It applies essentially the same multiplier to everyone's scoring totals. Look at HR's explanation. Every player gets the same games played adjustment, the same roster size adjustment. The only small difference is that it removes the individual's totals in calculating the league average, which actually means that the better players are adjusted less than other players (on a downward adjustment), just as you suggest they should be.

Read the study, as you say. The study criticizes the tendency to assume a normal distribution, and the habit of making adjustments in order to better fit that assumed curve, such as ignoring outliers. But adjusted scoring does not do this, and thus this criticism is moot.

Also, read the method for adjusted scoring. You're criticizing it for things that it does not do.
You still haven't explicitly stated your methods, equations, variables etc.

So what am I arguing against?

I've clearly stated that if you don't use means then we have no argument. So why are you still defending your mysterious calculations?

Your graph and their graph have different purposes. Theirs clearly illustrates the shape of two different functions. You have a different purpose that isn't clear to me. You need to label axes and add scale, they don't.

The ratio of the league goals per game is a mean that doesn't take the impact of outliers into account. You assume a normal distribution. The players score the goals after all and outliers score most of them. More outliers more gpg league wide. I assume high performing outliers. A preponderance of low scoring outliers could lower the league gpg. That is what might have happened last season. More gs among the top 20% yet a lower league gpg. Your average fails to take this into account. It penalizes the top scorers and rewards everyone else from last season.

Your curve visibly converges towards the raw data and then diverges from it. Can you explain this?

I think it's simpler to discard that flawed formula and start anew. I've suggested percentages. Someone more freshly versed in Newton's ways might do better. Percentages have the benefit of more accessibility to an average fan though.

What is with the players per team constant? Clearly adding players to the league would result in a lower gpg. Except for the odd outlier like Errol Thompson, Curtis or St Louis the impact of adding players to the pool would likely be close to 0 goals added or taken away. I doubt that adding 3 players per team would result in an increase of league gs by 1/6. I very much doubt that their playing time would decrease Howe's output by a sixth except for the possible outlier. I don't need a huge non-contested recent theory to critique those calculations.

So we have a formula that presents Howe's gs pace as a constant, an easily contestable extra player constant and a mean (league gpg). At no point does that formula take outliers into account. It's just a normalization.

"For example, Schmidt, Hunter, McKenzie, and Muldrow's (1979) linear homoscedastic model of work productivity “includes the following three assumptions: (a) linearity, (b) equality of variances of conditional distributions, and (c) normality of conditional distributions”"


Last edited by Dalton: 09-30-2012 at 11:29 AM.
09-30-2012, 12:10 PM
  #63
barneyg
HFBoards Sponsor
 
Join Date: Apr 2007
Posts: 2,370
vCash: 500
Quote:
Originally Posted by Dalton View Post
Your graph and their graph have different purposes. Theirs clearly illustrates the shape of two different functions. You have a different purpose that isn't clear to me. You need to label axes and add scale, they don't.
Let me try.

x = number of goals
y = number of players

Power curve = lots of players have very few goals, few players have lots of them.

You go from raw to adjusted by applying a fixed* multiplier on every player's realized totals. The multiplier takes into account season length, roster size and overall league scoring (i.e. the era adjustment).

That adjustment doesn't assume anything about the distribution. If the true curve is a power curve it'll still be a power curve after the adjustment. If it's a bell curve it will still be a bell curve. Every data point will be a bit closer (or further away) from the y axis depending on the scaling factor. Iain's graph shows what happens when you transform a high-scoring season into a lower-scoring one.

I think what you're trying to say is that a mean-based adjustment will distort the shape of the curve when the raw data follows a power curve. That depends on how you define "distortion". You seem to like Iain's percentile-based (purple) adjustment better than the traditional (red) adjustment. The difference between the two is an assumption about how a decrease in overall league scoring would be distributed across players. I don't think you can convincingly make a case for one or the other based on the visual appearance of the graph. I think you could rationalize either one as being right.

* The hockey-reference method linked in some post above has a multiplier that slightly varies across players because the league GPG excludes goals scored by the player for which the adjustment is made. It probably doesn't really matter.
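
The fixed-multiplier description above can be sketched in a few lines. Everything here is an assumption made for illustration: the three-factor form of the multiplier, the function names, the baseline values and the goal totals. It also ignores the small per-player variation mentioned in the footnote.

```python
def multiplier(season_games, roster_size, league_gpg,
               base_games=82, base_roster=18, base_gpg=6.0):
    """One scaling factor shared by every player in a season.

    Assumed form: scale up for shorter seasons, scale down for smaller
    rosters (more ice time per skater), and scale by the ratio of a
    baseline scoring level to the league scoring level.
    """
    return ((base_games / season_games)
            * (roster_size / base_roster)
            * (base_gpg / league_gpg))


def adjust(goals, factor):
    """Apply the same factor to every raw total."""
    return [g * factor for g in goals]


# Hypothetical raw totals with a long tail: many low scorers, few high ones.
raw = [50, 30, 20, 10, 5, 5, 2, 1, 1, 0]
factor = multiplier(season_games=80, roster_size=18, league_gpg=8.0)
adjusted = adjust(raw, factor)
```

Because every total is multiplied by the same number, the ordering of players and the ratio between any two of them are unchanged; only the horizontal scale of the curve moves.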

09-30-2012, 04:14 PM
  #64
Iain Fyffe
Hockey fact-checker
 
Iain Fyffe's Avatar
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Quote:
Originally Posted by Dalton View Post
You still haven't explicitly stated your methods, equations, variables etc.
I've linked to HR's adjusted scoring method at least once, and discussed it specifically.

Quote:
Originally Posted by Dalton View Post
I've clearly stated that if you don't use means then we have no argument.
Please show me an adjusted scoring method that does assume the results should cluster around a mean, which is what the study you linked to discusses. You're making a strawman argument. You're suggesting adjusted scoring doesn't work because it normalizes results to the mean. But it doesn't do that, so your criticism is moot.

Quote:
Originally Posted by Dalton View Post
Your graph and their graph have different purposes. Theirs clearly illustrates the shape of two different functions.
Their graph shows a power-law curve, which they argue fairly represents results of human achievement, and overlays a normal distribution as an illustration of why it is wrong to assume a normal distribution in these cases, because it does not match very well with their ideal curve.

My graph illustrates that adjusted scoring results in a power-law curve, and therefore does not assume a normal distribution. Therefore, your criticism has no basis.

Quote:
Originally Posted by Dalton View Post
You have a different purpose that isn't clear to me. You need to label axes and add scale, they don't.
No scale needed. X-axis is adjusted goals or assists or points, y-axis is the number or proportion of players.

Quote:
Originally Posted by Dalton View Post
The ratio of the league goals per game is a mean that doesn't take the impact of outliers into account.
It does take the impact of outliers into account; it takes the impact of all players into account. It is a mean, but read the study again. The study argues against analyses that assume results cluster around the mean, which adjusted scoring does not do.

Quote:
Originally Posted by Dalton View Post
You assume a normal distribution.
No, I don't, and neither does adjusted scoring.

If you believe this to be the case, please demonstrate it. Plot the adjusted scoring results from HR for a season. See if you get a normal distribution. Compare it to the raw results, and see if it's substantially closer to a normal distribution than the raw results are.
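
One way to run that check numerically rather than by eye is to compare a shape statistic such as skewness before and after adjustment. The sketch below uses made-up goal totals and a made-up constant multiplier, not HR data. Skewness does not change when every value is multiplied by the same positive constant, so a constant-multiplier adjustment cannot move the curve toward a normal distribution by this measure.

```python
from statistics import fmean


def skewness(values):
    """Population skewness: 0 for a symmetric distribution,
    positive when there is a long right tail."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])
    m3 = fmean([(v - mean) ** 3 for v in values])
    return m3 / m2 ** 1.5


# Hypothetical goal totals: many low scorers, a few high ones.
raw = [50, 30, 20, 10, 5, 5, 2, 1, 1, 0]
adjusted = [g * 0.8 for g in raw]   # assumed constant downward multiplier

raw_skew = skewness(raw)
adjusted_skew = skewness(adjusted)
```

With real data, the same comparison could be made between a season's raw goals and its adjusted goals.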

Quote:
Originally Posted by Dalton View Post
What is with the players per team constant?
To include a rough adjustment for ice time. Games are 60 minutes long. If you have 15 skaters to allocate ice time to, they will receive more ice time each than if you have 18 skaters. It's to more fairly compare players from eras that used different roster sizes.

Quote:
Originally Posted by Dalton View Post
It's just a normalization.
It appears that you don't understand what normalization means. Adjusted scoring does not change the basic structure of the curve of the results.

09-30-2012, 04:16 PM
  #65
Iain Fyffe
Hockey fact-checker
 
Iain Fyffe's Avatar
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Quote:
Originally Posted by barneyg View Post
Let me try.

x = number of goals
y = number of players

Power curve = lots of players have very few goals, few players have lots of them.

You go from raw to adjusted by applying a fixed* multiplier on every player's realized totals. The multiplier takes into account season length, roster size and overall league scoring (i.e. the era adjustment).

That adjustment doesn't assume anything about the distribution. If the true curve is a power curve it'll still be a power curve after the adjustment. If it's a bell curve it will still be a bell curve. Every data point will be a bit closer (or further away) from the y axis depending on the scaling factor. Iain's graph shows what happens when you transform a high-scoring season into a lower-scoring one.
This is it precisely.

09-30-2012, 04:55 PM
  #66
Iain Fyffe
Hockey fact-checker
 
Iain Fyffe's Avatar
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Here are the 1952/53 NHL results for actual goals and adjusted goals, using HR's method.



Note the distinct lack of a normal distribution.

10-02-2012, 04:22 PM
  #67
seventieslord
Moderator
 
Join Date: Mar 2006
Location: Regina, SK
Country: Canada
Posts: 26,155
vCash: 500
Quote:
Originally Posted by Dalton View Post
I would think that since we can't specifically state how many goals a player would score that we might be able to state a range he might score within.
Then do it.

If you don't like saying someone had "53 adjusted goals" then determine how confident you are in that number and state it in a range instead. Like "46-60 adjusted goals". No one is stopping you. I will choose to continue to not do that, although I think anyone assumes a certain inherent degree of uncertainty when they view adjusted figures.

10-02-2012, 04:26 PM
  #68
TheDevilMadeMe
Global Moderator
 
Join Date: Aug 2006
Location: Brooklyn
Country: United States
Posts: 43,222
vCash: 500
See my post 10 - the explosion of offense in the 1980s was decidedly NOT driven by outliers. In fact, it is the reverse - the top scorers in the league consistently scored a lower percentage of overall goals and points than at any other time, meaning the explosion of offense was largely driven by lower liners and defensemen.

Other than Gretzky and Lemieux (true outliers in any sense), the point totals of the top scorers in the 80s don't really look different than the point totals of the top scorers in the mid-late 70s.
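
The share-of-scoring check can be sketched as follows. The two leagues are invented for illustration (they are not the 1970s or 1980s numbers), and `top_n_share` is a hypothetical helper name. The top three scorers are identical in both; only the depth scoring differs.

```python
def top_n_share(goals, n):
    """Fraction of all goals scored by the n highest scorers."""
    ranked = sorted(goals, reverse=True)
    return sum(ranked[:n]) / sum(ranked)

# Two invented leagues with the same top three scorers; not real data.
low_scoring = [50, 45, 40, 15, 10, 10, 5, 5]
high_scoring = [50, 45, 40, 30, 25, 25, 15, 10]

print("low-scoring league, top-3 share:", round(top_n_share(low_scoring, 3), 4))
print("high-scoring league, top-3 share:", round(top_n_share(high_scoring, 3), 4))
```

In this toy case the league total rises from 180 to 240 goals while the top-3 share falls from 0.75 to 0.5625, which is the pattern you would expect if the extra offense came from further down the lineup.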

10-11-2012, 09:56 AM
  #69
Dalton
Registered User
 
Join Date: Aug 2009
Location: Ho Chi Minh City
Country: Vietnam
Posts: 2,096
vCash: 500
Quote:
Originally Posted by TheDevilMadeMe View Post
See my post 10 - the explosion of offense in the 1980s was decidedly NOT driven by outliers. In fact, it is the reverse - the top scorers in the league consistently scored a lower percentage of overall goals and points than at any other time, meaning the explosion of offense was largely driven by lower liners and defensemen.

Other than Gretzky and Lemieux (true outliers in any sense), the point totals of the top scorers in the 80s don't really look different than the point totals of the top scorers in the mid-late 70s.
I haven't had the time to visit this thread for a while. I would remind you that outliers aren't just the top goal scorers. They could also be the worst goalies.

I would also point out that the fact that the distribution of goals (or GAA) among the players varies is what shows that using a simple multiplier fails.

How can you say player x scores y goals in a season based on how many he scored in a different season, knowing that the distribution of goals differed between the two seasons?

Say X scores y goals in a season that the top 10 or 20% of scorers dominated, or in which the bottom 10 or 20% of goalies were poor. You can't. You need a more detailed analysis than simply using a multiplier.

I am not saying that using percentages is the answer, but it definitely reveals the flaws in using a scalar.

10-11-2012, 09:57 AM
  #70
Dalton
Registered User
 
Join Date: Aug 2009
Location: Ho Chi Minh City
Country: Vietnam
Posts: 2,096
vCash: 500
Quote:
Originally Posted by seventieslord View Post
Then do it.

If you don't like saying someone had "53 adjusted goals" then determine how confident you are in that number and state it in a range instead. Like "46-60 adjusted goals". No one is stopping you. I will choose to continue to not do that, although I think anyone assumes a certain inherent degree of uncertainty when they view adjusted figures.
I have done that a few times already. Read my posts.

10-11-2012, 10:21 AM
  #71
Dalton
Registered User
 
Join Date: Aug 2009
Location: Ho Chi Minh City
Country: Vietnam
Posts: 2,096
vCash: 500
Quote:
Originally Posted by barneyg View Post
Let me try.

x = number of goals
y = number of players

Power curve = lots of players have very few goals, few players have lots of them.

You go from raw to adjusted by applying a fixed* multiplier on every player's realized totals. The multiplier takes into account season length, roster size and overall league scoring (i.e. the era adjustment).

That adjustment doesn't assume anything about the distribution. If the true curve is a power curve it'll still be a power curve after the adjustment. If it's a bell curve it will still be a bell curve. Every data point will be a bit closer (or further away) from the y axis depending on the scaling factor. Iain's graph shows what happens when you transform a high-scoring season into a lower-scoring one.

I think what you're trying to say is that a mean-based adjustment will distort the shape of the curve when the raw data follows a power curve. That depends on how you define "distortion". You seem to like Iain's percentile-based (purple) adjustment better than the traditional (red) adjustment. The difference between the two is an assumption about how a decrease in overall league scoring would be distributed across players. I don't think you can convincingly make a case for one or the other based on the visual appearance of the graph. I think you could rationalize either one as being right.

* The hockey-reference method linked in some post above has a multiplier that slightly varies across players because the league GPG excludes goals scored by the player for which the adjustment is made. It probably doesn't really matter.
This is the problem. The distribution clearly varies. Player x scores y goals in a season in which his peers scored more or the goalies were worse. So we say he scores z goals in a different season in which his peers scored less or the goalies were better. Clear nonsense.

I am suggesting that we compare how players performed in their peer group rather than just simplifying it to the whole group. Take into account how both ends of the outlier spectrum performed and how the player performed within his peer group.

The player who scored well in a season in which his peers scored well might score less in a season in which his peers underperformed. Is this concept that difficult?


Last edited by Dalton: 10-11-2012 at 10:31 AM.
10-11-2012, 12:39 PM
  #72
Dalton
Registered User
 
Join Date: Aug 2009
Location: Ho Chi Minh City
Country: Vietnam
Posts: 2,096
vCash: 500
Quote:
Originally Posted by Iain Fyffe View Post
Here are the 1952/53 NHL results for actual goals and adjusted goals, using HR's method.



Note the distinct lack of a normal distribution.
Once again I see a bell curve added to a power curve. Note the distinct bump in the green curve around the 24-31 value along the x-axis. The graph clearly moves away from the raw data and then back again. I'd expect that if you removed the tail values you'd have a bell curve. You wouldn't get that with the raw data. IOW as you drop the outliers your adjusted data approaches a bell curve and becomes a bell curve, whereas the raw data remains a power curve.

Try it.

You have a translation and a scaling of the original raw data, with the impact of the normalization of the adjusted data presenting itself as a divergence away from the raw data at the 24-31 value of the x-axis. Note how similar the curves are all along the x-axis except at that region. Your analysis of these curves is obviously flawed or perhaps unconsciously biased towards acceptance of normalization.

I compared Stamkos' 60 to Ovie's 56 and see that 60 has a value in the range of 60-64 in the past season, while Ovie's 56 has a range of 49-52 in the future season. The values depend on whether one looks at their percentages in the top 5% or 100% of NHL GS.

I looked at a 20 goal season as well. 20 in the past has a range of 20.8-21.4 gs last season. 20 gs last season has a range of 18.6-19 in 2008/09.

It appears that while Ovie scores less compared to his 5% peers and the whole league, a 20 goal scorer scores about the same or a bit more compared to his peers in the 10-20% range or subset.

Stamkos scores more in 2008/09 yet a 20 goal scorer scores less by a goal or 2 compared to his peers.

This is a reflection of the fact that while scoring was down almost 500 goals last season the top outliers scored more of those goals than they did in Ovie's 56 goal season.

Your adjustment formula doesn't even look at this aspect.

I would conclude that while Stamkos scored more in a season in which his peers scored more, it was still a better performance compared to peers than the season in which Ovie scored more than his peers in a year that his peers scored fewer of the league's goals, despite the fact that the NHL saw quite a bit more goal scoring.

It's possible that Stamkos outscores Ovie by more than he outscored Malkin in either season.

What do adjusted stats say?

I think comparing to peers (I use fractions) is much richer and gives a context to interpret. Adjusted stats are just flat numbers from which the only context is debating the truth of the variables used.
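
A minimal sketch of the fraction approach, for anyone who wants to try it. It assumes the idea is to hold a player's fraction of a group's goals fixed and carry it to another season, once against the top 5% of scorers and once against the whole league. The season totals below are invented placeholders, NOT the real NHL figures, and `translate_by_share` is a hypothetical name.

```python
def translate_by_share(goals, group_total_from, group_total_to):
    """Keep the player's fraction of a group's goals fixed across seasons."""
    return goals / group_total_from * group_total_to

# Invented goal totals for two seasons; placeholders only.
season_a = {"league": 6500, "top5pct": 1200}
season_b = {"league": 7000, "top5pct": 1250}

player_goals = 60  # goals scored in season A
estimates = [
    translate_by_share(player_goals, season_a[group], season_b[group])
    for group in ("top5pct", "league")
]
low, high = min(estimates), max(estimates)
print("season B equivalent: %.1f to %.1f goals" % (low, high))
```

The two peer groups give the two ends of the range, which is why the answer comes out as an interval rather than a single number. Other peer groups (for example the 10-20% band) would need their own totals.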


Last edited by Dalton: 10-11-2012 at 12:51 PM.
10-12-2012, 03:34 PM
  #73
seventieslord
Moderator
 
Join Date: Mar 2006
Location: Regina, SK
Country: Canada
Posts: 26,155
vCash: 500
Quote:
Originally Posted by Dalton View Post
I have done that a few times already. Read my posts.
You're asking a lot.

10-22-2012, 08:05 PM
  #74
Iain Fyffe
Hockey fact-checker
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Quote:
Originally Posted by Dalton View Post
Once again I see a bell curve added to a power curve. Note the distinct bump in the green curve around the 24-31 value along the x-axis. The graph clearly moves away from the raw data and then back again. I'd expect that if you removed the tail values you'd have a bell curve. You wouldn't get that with the raw data.
This is a joke, right? It has to be. Please tell me it is.

Look at the red line in my graph. That's the actual number of goals scored. And yet it also has this "bell curve" that you see. Notice that in this case, the adjusted stats actually reduce the apparent bell curve. Which is, of course, not a bell curve at all, but a figment created by a relatively small sample size.

Please, please, please explain how the adjusted scoring method used by HR "adds a bell curve to a power curve". Demonstrate how it does that. You keep saying it, without showing how. Tell us. Stop asserting and start proving.

If you see a bell curve, it's apparently because you want to see a bell curve. You accuse my analysis of being biased, but your bias is showing in great big neon letters.

Quote:
Originally Posted by Dalton View Post
IOW as you drop the outliers your adjusted data approaches a bell curve and becomes a bell curve whereas the raw data remains a power curve.
If you drop a very large number of observations at the bottom and a few at the top, yes you probably could as it happens for this set of data. But of course, this would also happen if you did the same to the raw data. As such it has nothing to do with the adjusted scoring method, and would merely be a reflection of intentionally-misleading data manipulation.

If you define "outlier" broadly enough, you can turn any power curve into a bell curve. But that's disingenuous. It's not what "outlier" means. If you think that you can remove a few outliers and transform my graph into a normal distribution, I fear you don't know what outlier means. If you consider the mode of a population (the most frequent observation) to be an outlier, you're going to get wonky results.

A definition of an outlier provided by a statistician is "An outlying observation...is one that appears to deviate markedly from other members of the sample in which it occurs." (Emphasis added). It does not mean a "tail value". That is, you can't just remove the zero values in my graph as outliers, because they clearly do not deviate markedly from other members of the sample - they are in fact the most common member of the sample. The bottom and top values are not automatically outliers.
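
To make the distinction concrete, here is a small sketch. It swaps in Tukey's 1.5 x IQR fences as one common rule of thumb for "deviates markedly"; that rule is an assumption brought in for illustration, not something the quoted definition prescribes, and it is a blunt tool on heavily skewed data. The goal totals are invented, not the 1952/53 figures.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Invented goal totals with many zeros (not the 1952/53 data).
goals = [0] * 8 + [1, 1, 1, 2, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 28, 45]

print("most common value:", statistics.mode(goals))
print("flagged as outliers:", tukey_outliers(goals))
```

In this invented sample the zeros are the most common value and are not flagged. Only the two largest totals fall outside the fences, so being at the bottom of the range does not by itself make a value an outlier.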

As such, your assertion has no merit.

I think we're done here.

10-22-2012, 08:09 PM
  #75
Iain Fyffe
Hockey fact-checker
 
Join Date: Feb 2009
Location: Fredericton, NB
Country: Canada
Posts: 3,079
vCash: 500
Quote:
Originally Posted by Dalton View Post
This is the problem. The distribution clearly varies. Player x scores y goals in a season in which his peers scored more or the goalies were worse. So we say he scores z goals in a different season in which his peers scored less or the goalies were better. Clear nonsense.
That would be nonsense, if that's what adjusted scoring said. But it's not.

Quote:
Originally Posted by Dalton View Post
I am suggesting that we compare how players performed in their peer group rather than just simplifying it to the whole group. Take into account how both ends of the outlier spectrum performed and how the player performed within his peer group.
Then do it already, and be prepared to demonstrate that the results are "better" than adjusted scoring. I'd wager they're simply not going to be much different.

