Adjusted stats - how valuable?
09-23-2012, 02:24 AM
Originally Posted by
Hopefully this forum sheds some light on what is being claimed and what isn't. It is not at all easy to figure out what some of these adjustment methods are saying, or even which methods you are referring to. Perhaps you need to include examples instead of talking in such sweeping terms.
It is, or should be, clear that simply using league gpg to compare players' results across eras is in no way accurate. That is the method I'm debating.
I don't agree that one can state with any confidence that x goals in one era is equivalent to y goals in another. I think it's more accurate to look at players in diminishing subsets of their peers while looking at the performance of each subset within the whole set. My 'number' is actually a curve.
Let me try another example. I've said this before and I'd remind readers that I am not the best one to argue this POV. LOL
I've dl'd 2010-11 stats from NHL.com. Just skaters and all skaters. I'll compare to TSN's 2011-2012 data. I have to assume the data is accurate. I'll risk that my calculations are accurate. I'm 0 for 2 so far when posting spreadsheet results.
The league scored more goals in 2010-11 than last season. 6721 to 6542. Each season had a 50 goal scorer. Using means one would be tempted to lower the value of Perry's 50 since the gpg was higher in 2010-11.
But looking closer we can see that the top 5, 10 and 20% of skaters last season scored a larger share of the league's goals than the same groups did in 2010-11. The top 5% of 2010-11 scored 21.5% of the league's goals. The 2011-12 group scored 23%. For the top 10% of skaters it is 36.8% in 2010-11 against 39.6% in 2011-12, and for the top 20% of skaters 60% against 63.6%. All three favour the 2011-12 season. Stamkos' 60 goals are not responsible on their own but obviously contributed. However, I notice that the difference increases as the groups get larger.
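For anyone who wants to repeat the "top X% share" calculation on their own download, here is a minimal sketch. The function name `top_share` and the ten goal totals are made up for illustration (they are not the NHL.com or TSN data), and rounding the group size up is my assumption, since the post doesn't say how a fractional number of skaters is handled.

```python
import math

def top_share(goals, fraction):
    """Share of all goals scored by the top `fraction` of skaters."""
    ranked = sorted(goals, reverse=True)
    # Assumption: round the group size up, and always include at least one skater.
    k = max(1, math.ceil(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0

# Made-up goal totals for ten skaters, NOT real NHL data.
toy_season = [50, 30, 20, 15, 10, 8, 5, 2, 0, 0]

for frac in (0.05, 0.10, 0.20):
    print(f"top {frac:.0%}: {top_share(toy_season, frac):.3f} of all goals")
```

Running the same function over each season's full skater list would give the 5%, 10% and 20% shares being compared.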
So while scoring went down overall last season, the top 20% of skaters in 2011-12 actually outproduced the top 20% of 2010-11 skaters. In this context the value of Perry's goals is higher than the value of Malkin's. It seems it was easier for the top 20% of skaters to score last season, not the previous season as gpg suggests. In fact Perry's 50 represent .0348 of the goals scored by the top 5%, then .0202 of the top 10% and .0124 of the top 20%. Malkin's 50 represent .0327, .0193 and .0120.
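The fractions can be roughly rechecked from the numbers already in the post. This is only a sketch: the dictionary and function names are mine, and the group shares are the rounded percentages quoted in the post, so the top 5% figures come out slightly different from the spreadsheet values.

```python
# League goal totals and top-group shares as quoted in the post (rounded).
LEAGUE_GOALS = {"2010-11": 6721, "2011-12": 6542}
TOP_SHARE = {
    "2010-11": {"5%": 0.215, "10%": 0.368, "20%": 0.600},
    "2011-12": {"5%": 0.230, "10%": 0.396, "20%": 0.636},
}

def fraction_of_group(player_goals, season, group):
    """Player's goals as a fraction of the goals scored by a top group."""
    group_goals = TOP_SHARE[season][group] * LEAGUE_GOALS[season]
    return player_goals / group_goals

groups = ("5%", "10%", "20%")
perry = [fraction_of_group(50, "2010-11", g) for g in groups]
malkin = [fraction_of_group(50, "2011-12", g) for g in groups]

for g, p, m in zip(groups, perry, malkin):
    print(f"top {g}: Perry {p:.4f}  Malkin {m:.4f}")
```

In every group Perry's fraction comes out larger than Malkin's, which is the point being made.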
To use the mean to adjust 2010-11 downward would not accurately reflect what happened. Perry scored 50 in a season in which his peers scored less overall, while Malkin scored his 50 in a season in which his peers scored more. The value of Perry's 50 should be more than Malkin's, not less.
I would argue that any formula that uses means ignores the effect of outliers and gives unreliable results compared to calculations that take outliers into account. If you are not talking about using means then I'm not sure how to respond to you since that is what I'm talking about.
The fact that Perry didn't score more this year simply reflects the value of taking these comparisons seriously. Scoring x goals in a season really has no bearing on what the player would score in another season or era, no matter how the data is presented. To state that gx goals in season sx is comparable to gy goals in season sy just isn't accurate. To say that player px in season sx compared similarly to his peers as player py in season sy has more meaning IMHO. In my example Perry performed at a higher level compared to his peers than Malkin. It appears that it was somewhat easier to score 50 goals last year than it was in the previous season. For what it's worth, using percentages Perry would score 48.6 goals last season according to the percentage of all the league's goals that he scored, or maybe 53 just looking at his percentage of the goals scored by the top 5% of all skaters. Almost 5 goals difference. Adjusting raw data just doesn't work IMHO.
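The two projections can be sketched the same way. The variable names are mine and the top 5% shares are the rounded 21.5% and 23% quoted in the post, so treat the second number as approximate; with those rounded shares it lands closer to 52 than 53.

```python
# Figures quoted in the post; top-5% shares are rounded.
G_2010_11, G_2011_12 = 6721, 6542
TOP5_2010_11, TOP5_2011_12 = 0.215, 0.230
PERRY_GOALS = 50

# Method 1: keep Perry's share of ALL league goals constant.
by_league = PERRY_GOALS / G_2010_11 * G_2011_12

# Method 2: keep Perry's share of the TOP 5% group's goals constant.
top5_goals_2010_11 = TOP5_2010_11 * G_2010_11
top5_goals_2011_12 = TOP5_2011_12 * G_2011_12
by_top5 = PERRY_GOALS / top5_goals_2010_11 * top5_goals_2011_12

print(f"league-share projection: {by_league:.1f}")
print(f"top-5%-share projection: {by_top5:.1f}")
print(f"gap between methods:     {by_top5 - by_league:.1f}")
```

The gap of several goals between the two methods is what the argument turns on: the choice of reference group changes the "adjusted" total.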
I should also point out that for about 70% of the players using a mean would not be accurate, because as a subgroup of the league they scored less last year than in the previous season. Applying means to adjust 2010-11 players to the lower-scoring season of 2011-12 would not be accurate. Their results would be inflated compared to the 2011-12 reality, just as Perry's results would be deflated. Means just don't work IMHO.
Is there any rational basis for why it would be easier for elite players to score in 2011-12 (as compared to 2010-11), and yet more difficult for players as a whole?
If not, it's probably just randomness.
Last edited by Master_Of_Districts: 09-23-2012 at