Improving Adjusted Scoring and Comparing Scoring of Top Tier Players Across Eras
I've done a study of a large, but fixed, group of players over several decades.
GOAL: To determine which seasons were most and least difficult for top line forwards and top offensive defensemen to score points.
METHODOLOGY: The concept of the study is simple. It's essentially similar to the method others have used in developing "league equivalencies", which has been done for the NHL vs. WHA, NHL vs. minor leagues, and NHL vs. foreign leagues, for various periods. So it's considering "Year X NHL" and "Year X+1 NHL" to basically be different leagues, then examining how players who participated in both performed. It does not necessarily have to be restricted to a certain group with a minimum quality threshold, but some possible reasons for restricting it to higher scoring forwards and defensemen include:
- Such players may be expected to have longer careers, so for each player included there will be more total "player-year"s in each study (decreasing the statistical error in the study).
- Such players produce at a higher level, which is less influenced by random error (variation).
- Such players often have less variation in opportunity (ice time, PP time, etc.), so their results should be less influenced by this.
- Such players are most frequently the subject of comparison. Since the results are derived from such players, the results should be especially applicable to such players, and so particularly useful.
I have continued to add players to the study. The number of players in each season pair (which will usually be less than the number of players in the fixed group which were active at that time):
- From the '47-8 pair of seasons to the '67-8 pair of seasons, it's 32-45 players and an average of ~37 players per pair of seasons. That's an average of about 6 players per team.
- For the '68-9 seasons to the '71-2 seasons, it's 54-62 players and an average of ~53 players or ~4.5 players per team.
- For the '72-3 seasons to '81-82 seasons, it's 72-82 players and an average of ~78 players per pair of seasons or ~4.2 players per team.
- For the '82-83 to '03-4 seasons, it was 79-91 players and an average of ~84 players per pair of seasons or ~3.4 players per team.
Each pair of consecutive seasons was examined separately, with the results sorted by % change in adjusted PPG. I looked at several ways to measure the effect, and the results differ significantly depending on which metric is used. I used the median half of players in terms of % change in PPG, since it includes a full half of the players participating in each season pair and still removes many outliers in both directions that can be caused by irrelevant factors or random error. I believe using the median (middle) half of players to be a good method for obtaining reliable results. This discards the top 1/4 and bottom 1/4 of players in terms of % change in PPG, so that factors such as injury, change in opportunity, change in team or linemates, improvement or decline due to age, etc. are kept from substantially distorting the results.
The method of linking each "year over year" to produce numbers which may be used to compare across longer spans of time is relatively simple math.
Here's what the effective league GPG was for top tiers of players, based on the results of this study:
Example of how results were calculated: 1966-7 & 1967-8 seasons
Before I show the results of a pair of consecutive seasons, I want to mention a couple of things about the data I used. I calculated adjusted PPG differently than other sources, such as HR.com, may. There is no adjustment for roster sizes, nor are players' individual stats deducted from league totals beforehand. I used each season's gpg and assist-per-goal ratios in the calculations. I normalized to a fixed gpg (the number chosen is irrelevant, but it was 8.00) and used an apg ratio of 5/3 or 1.667 (this affects goal-scorers and playmakers slightly differently). If someone were to repeat a study similar to this, it's probably easier to use "raw" PPG, but this shouldn't change the results significantly.
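The exact formula isn't spelled out above, so the following is only a minimal sketch of one plausible reading: goals are scaled by the ratio of the fixed gpg (8.00) to the season's league gpg, and assists get that same scaling plus a correction from the season's assists-per-goal ratio to 5/3. The function name, its arguments, and the sample season are hypothetical, not taken from the study's data.

```python
TARGET_GPG = 8.00        # fixed league goals-per-game used for normalizing
TARGET_APG = 5.0 / 3.0   # fixed assists-per-goal ratio (about 1.667)


def adjusted_ppg(goals, assists, games, league_gpg, league_apg):
    """Return points per game rescaled to the fixed gpg and apg ratio.

    Assumed formula (the post does not give one): goals scale with
    TARGET_GPG / league_gpg, and assists additionally scale with
    TARGET_APG / league_apg.
    """
    scoring_factor = TARGET_GPG / league_gpg
    adj_goals = goals * scoring_factor
    adj_assists = assists * scoring_factor * (TARGET_APG / league_apg)
    return (adj_goals + adj_assists) / games


# Hypothetical season: 30 goals and 40 assists in 70 games,
# in a league averaging 6.00 gpg and 1.60 assists per goal.
example = adjusted_ppg(30, 40, 70, league_gpg=6.00, league_apg=1.60)
```

Because assists get the extra apg correction and goals do not, a playmaker and a goal-scorer with the same raw points come out slightly differently, which matches the remark above.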
So let's look at one hypothetical player: Joe has an adjusted PPG of 1.00 in season Y and 1.05 in season Y+1. So his adjusted PPG increased by 5.0% from season Y to season Y+1. This +5.0% (or +.05) is the basis of all further calculation in the study, once it was determined that an average of % change in PPG of the median half of players (in terms of % change in PPG for the pair of seasons) appeared to be the most reliable metric I had found thus far.
Also, I did delete a season or two for many players. I did this as sparingly as possible, since deleting one season resulted in losing data for two pairs of seasons (e.g. if we deleted Joe's 1957 season, we lose his data for '56 vs. '57 as well as '57 vs. '58). In most instances, the seasons deleted were at the very beginning and/or very end of the player's career, when it seemed very obvious that:
- the player was not yet near his prime and/or not getting full playing time
- the player was quite past his prime and no longer getting full playing time or often injured
- the player was injured in the middle of his career and this very dramatically affected his PPG
Most players had all of their full seasons included, while some had one or two omitted from the beginning and/or end of their careers.
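To illustrate why deleting one season costs two season pairs, here is a small sketch. The helper name `usable_pairs` and the list of seasons for "Joe" are made up for illustration; seasons are labeled by a single year as in the '56 vs. '57 example.

```python
def usable_pairs(seasons_played):
    """Return (year, year + 1) pairs where the player has both seasons."""
    years = set(seasons_played)
    return sorted((y, y + 1) for y in years if y + 1 in years)


# Hypothetical career for Joe, then the same career with 1957 deleted.
joe = [1955, 1956, 1957, 1958, 1959]
before = usable_pairs(joe)
after = usable_pairs([y for y in joe if y != 1957])

# Pairs that disappear once 1957 is removed.
lost = sorted(set(before) - set(after))
# lost == [(1956, 1957), (1957, 1958)]
```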
Here are the 44 players for the season pair 1966-7 & 1967-8, sorted by % change in adjusted PPG, and only showing adjusted PPG for each season and % change in adjusted PPG:
Calculation of % change in adjusted PPG for the median half of players was done as follows:
The median half in this case are players ranked 12 through 33 in terms of % change in adjusted PPG for this pair of seasons. In the previous post, these are the players starting with Esposito (+38%) and ending with Nesterenko (+1%). Once the median half was selected, the calculation is rather simple. Add the 22 percentages (or decimals, such as +0.38.... to +.01), then divide by 22 to get an average.
In this case, the result is .146 or +14.6%.
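The median-half average described above could be sketched roughly as follows. The function names and the sample values are hypothetical, and dropping `n // 4` players from each end is an assumption for group sizes that don't divide evenly by four (the post only shows the 44-player case, where 11 are dropped from each end and ranks 12 through 33 are kept).

```python
def pct_change(ppg_prev, ppg_next):
    """Fractional change in adjusted PPG between two seasons (0.05 == +5%)."""
    return ppg_next / ppg_prev - 1.0


def middle_half_mean(pct_changes):
    """Average the middle half of the changes, discarding both tails."""
    ordered = sorted(pct_changes)
    n = len(ordered)
    # Assumption: drop n // 4 values from each end (11 of 44 in the post).
    drop = n // 4
    kept = ordered[drop:n - drop]
    return sum(kept) / len(kept)


# Made-up changes for eight players in one season pair.
sample = [-0.30, -0.10, 0.00, 0.05, 0.10, 0.15, 0.20, 0.60]
result = middle_half_mean(sample)  # averages 0.00, 0.05, 0.10, 0.15
```

Note how the -30% and +60% players (say, an injury and a big jump in ice time) have no influence on the result at all.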
What does this result suggest? It suggests if a first liner or top producing d-man scored at 1.00 adjusted PPG in '67, a decent approximation of his expected adjusted PPG in '68 would be 1.146.
The number 1.146 would then be used for the year 1968 (although it's actually a pair of seasons, '67 & '68) in comparison to 1967.
So, using this method, a % change number for each pair of seasons from '46 & '47 until '06 & '07 was produced. Here is the list of the average % change in adjusted PPG from one season to the next (season listed is the second in the pair):
As you can see, 1968 shows 14.6%, meaning a +14.6% expected change in adjusted PPG from '67 to '68. So we have some approximation of how the adjusted PPG of top players can be expected to change from one season to the next (or from the season prior), but what about seasons that are further apart? This requires further calculation.
First, we need to convert 14.6% into a more useful number, first by using its decimal form of .146, then by adding 1.000 to yield 1.146.
So how do we compare seasons that aren't consecutive? Let's start from the beginning to make it simpler. Our first season in the study is 1946. Since there is no season before this in the study, for now it's our baseline season and is assigned an index number 1.000.
To get an index number for 1947:
1947's % change number is -3.7% (or -.037 in decimal)
add 1.000 to get 0.963
multiply the 1946 index number (1.000) by .963 which is .963
Now we have index numbers of 1.000 for '46 and .963 for '47
To get an index number for 1948:
1948's % change number is -8.4% (-.084)
add 1.00 to get .916
multiply the 1947 index number of .963 by .916 = .882
So what does this suggest? It suggests a top line player with an adjusted PPG of 1.00 in 1946 could be expected to score at approximately a 0.88 PPG pace in 1948.
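The chaining steps above can be sketched in a few lines. This is only an illustration: `chain_index` is a name introduced here, and it assumes every year in the range has a % change (it does not handle a skipped season).

```python
def chain_index(first_year, pct_changes, base=1.0):
    """Map each year to an index, starting from `base` in `first_year`.

    pct_changes[i] is the fractional change going into year
    first_year + 1 + i (for example -0.037 for -3.7%).
    Assumption: consecutive years with no gaps.
    """
    index = {first_year: base}
    year = first_year
    for change in pct_changes:
        year += 1
        index[year] = index[year - 1] * (1.0 + change)
    return index


# The two changes quoted in the post: -3.7% into 1947, -8.4% into 1948.
idx = chain_index(1946, [-0.037, -0.084])

# Any two seasons can then be compared through the ratio of their indexes.
ratio_48_vs_46 = idx[1948] / idx[1946]
```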
This process is repeated until an index number for each year is calculated. These numbers are as follows:
The index numbers from the previous post can be used in the form presented, but I wanted to offer an alternative format. I summed and averaged the 61 index numbers, then normalized the average to 1.00, meaning that the average index number for the period studied is 1.00. Therefore, a revised index number above 1.00 means it was easier than an average year in the study to produce adjusted points, while a revised index number below 1.00 means it was tougher than average to produce adjusted points.
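As a rough sketch of that renormalization (the function name and the three index values are invented for illustration, not the study's numbers):

```python
def normalize_to_mean_one(index):
    """Divide every index number by the average so the mean becomes 1.00."""
    mean = sum(index.values()) / len(index)
    return {year: value / mean for year, value in index.items()}


# Hypothetical index numbers for three seasons.
raw_index = {1: 1.00, 2: 0.90, 3: 0.80}
revised = normalize_to_mean_one(raw_index)
revised_mean = sum(revised.values()) / len(revised)
```

Ratios between seasons are unchanged by this step; only the reference point moves from the first season to the average season.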
Here are the revised index numbers for the period '46 to '07:
Awesome, awesome stuff here.
If you multiply the index numbers averaged to 1.00 by the league gpg, this is how it looks:
One could use these numbers to compare and adjust raw points from season to season.
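One possible way to do that, sketched under assumptions: treat (revised index x league gpg) as the "effective" gpg for top-tier players, and scale raw points by the ratio of the two seasons' effective gpg. The function names and all numbers below are hypothetical, and schedule length is assumed equal in both seasons.

```python
def effective_gpg(revised_index, league_gpg):
    """Effective league gpg for top-tier players: index times league gpg."""
    return revised_index * league_gpg


def convert_points(points, eff_gpg_from, eff_gpg_to):
    """Translate raw points from one season's environment to another's.

    Assumption: points scale linearly with effective gpg and both
    seasons have the same number of games.
    """
    return points * eff_gpg_to / eff_gpg_from


# Made-up seasons: A is high scoring and "easy", B is the opposite.
eff_a = effective_gpg(1.10, 7.00)
eff_b = effective_gpg(0.95, 6.00)
points_in_b_terms = convert_points(100, eff_a, eff_b)
```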
With a fluctuating assist per goal ratio by season, how can there just be a factor that one can multiply points by? It sounds like there needs to be two components to this.
AFAIK, for the past 55 years, the apg has been between 1.62 and 1.75. I use 1.667 (5/3) apg ratio as a rough average and to be consistent (it's easy to remember).
It's another reason why adjusted numbers of any kind are not nearly exact or some sort of gospel... only the best approximation available.
I like it and I think it provides a better base to start from.
The thing that I agree with the most and why I like what you're doing here is that after you have done your calculations, you are using common sense and real player comparisons to "back check" your results.
The absence of common sense is by far the biggest issue with any adjusted stats system, not the systems themselves.
I don't understand why some (correctly) point out all these flaws in adjusted stats while not noting that many (most?) of the criticisms apply to stats in general...
I'm confident enough in the methodology to believe that's a definite improvement over simple adjusted stats.
Thanks though, I will post another alternative or two, and anyone is free to back-test, comment, question or attempt to replicate similar results.
If someone wishes to do a study with similar goals and methodology, then I would suggest the following possible improvements:
A) Using a strict definition for including and excluding players in the study. This could include some combination of various standards, such as seasonal rankings (ranking at least N times in the top X players for a season or for a period of Y years or during a career) and absolute performance (scoring at least Z points in at least N years, or at least Q points over a career). One could use various standards such as points, PPG, games, etc. It's not an easy task, given the changes in league size, league talent pool, league scoring, and other dynamics, all of which may affect the size and composition of the group being studied. Also, as can be seen in the thread for this study, there is the question as to whether the number of players should be held roughly constant, or increase in rough proportion to the size of the league. At the point when I stopped the study, it was more of a compromise between the two. I can see why, as was later suggested, one would want to keep it roughly proportional to league size (and therefore to opportunity), but this also likely significantly changes the composition of the group in terms of absolute (average/median) quality. It seems it might be best to use criteria that would increase the number of players as league size increases (but not necessarily in exact proportion), and that would also keep the (median) quality of players relatively static. Considering that either or both opportunity for and absolute quality of players are affected by such factors as expansions, mergers, competing leagues, newly available or expanding non-Canadian talent pools, population growth, etc., it's not exactly a simple task to do so. Perhaps more than one study is needed, such as one keeping opportunity constant, and one attempting to keep median quality of players constant.
B) Using a strict definition for including or excluding each player's individual seasons in the study. One could use every season, and I frequently did this, but also often eliminated seasons from the study without an exact definition. A minimum number of games might be best, possibly a minimum ranking or performance level, or some combination. One must realize that to include every season means including seasons with minimal games played, seasons when opportunity was obviously limited, seasons when a player is extremely young/old by hockey standards, seasons when the player was obviously injured, etc. OTOH, one should realize that to exclude player seasons may also unintentionally bias the sample in some way and so should not be done hastily (I generally included seasons when in doubt). As previously stated, one advantage of using the median half for the relevant calculations was that larger fluctuations in performance (evenly divided by direction) were filtered out as outliers.
C) I used adjusted PPG as my metric (and then the % changes in such), which was already adjusted for schedule length, league GPG and the assist/goal ratio. However, it would probably be better to use actual PPG, although the effect should be minimal (it all comes out in the wash basically, since there is a large group of players being studied).
D) I used median half of players in terms of % change, after initially favoring the (probably too narrow) median third. One could use a different arbitrary median (such as 2/3, 3/4, 60% or whatever), but I don't really know what the most reliable and proper number would be. I do think using some such median is crucial in eliminating outliers arising from mostly irrelevant factors, as well as simply random error.
Finally, while an improved "duplicate" study of sorts is certainly encouraged, I also proposed an alternative way to study this using multivariable (linear?) regression analysis. One could use PPG as the dependent variable and independent binary variables such as Player A, Player B, Player C,..., Year X, Year X+1, Year X+2,..., Age N, Age N+1, Age N+2,...., etc. The 1.00 value for age could also be split amongst two consecutive variables (i.e. if player is age 25 years 6 months 0 days on the standard date used, use 0.5 for age 25 and 0.5 for age 26).
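To make the variable setup concrete, here is a minimal sketch of how one row of such a regression's design matrix might be built, including the split of the age weight across two consecutive ages. All names are hypothetical, and the fitting step itself is left out.

```python
def age_weights(age_years):
    """Split a weight of 1.00 between the two nearest whole ages.

    Example: 25.5 years -> 0.5 on age 25 and 0.5 on age 26.
    """
    lower = int(age_years)
    frac = age_years - lower
    if frac == 0:
        return {lower: 1.0}
    return {lower: 1.0 - frac, lower + 1: frac}


def design_row(player, season, age_years, players, seasons, ages):
    """Build one row: player dummies, then season dummies, then age weights."""
    weights = age_weights(age_years)
    row = [1.0 if p == player else 0.0 for p in players]
    row += [1.0 if s == season else 0.0 for s in seasons]
    row += [weights.get(a, 0.0) for a in ages]
    return row


# Hypothetical player-season: player "A" in 1968 at age 25 years 6 months.
row = design_row("A", 1968, 25.5, ["A", "B"], [1967, 1968], [25, 26])
# row == [1.0, 0.0, 0.0, 1.0, 0.5, 0.5]
```

In practice one level from each group (one player, one season, one age) would have to be dropped as a reference category before fitting, since the full sets of dummies are collinear.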
Very much to digest, I may give it a try later tonight.
Always been interested about Mario Lemieux' comeback season of 00-01.
He recorded 35 goals in just 43 games with 76 points.
Bure led the league with 59 goals, and Sakic had 54.
Meanwhile the Ross winner was Jagr with 121.
Considering Lemieux was 35 years old.
You're not the only one who thinks there was something funny about the 2001 season though. This study suggests that it was about 5.5% easier to score adjusted points in 2001 than in 2000, so it really seems to capture the effect.
Here are the biggest increases in expected adjusted PPG from the previous season:
No surprise there. The three most recent increases are due at least in part to increased power plays. It may be a bit surprising that 2006 isn't on there, but the league gpg increased substantially, so it was already reflected in that statistic. Also, the first couple and last couple of seasons in the study are probably less reliable due to the number of players being slightly less.
Here are the largest decreases in expected adjusted PPG from the previous season since expansion:
There's a hangover effect in 1997 and 2002, as there were large increases the previous seasons, but power plays declined. There was a contraction by one team in 1979 and then the subsequent WHA merger in 1980.
The discussion between you and Big Phil about Turgeon and Savard mainly comes down to one simple thing:
Phil sides with his senses, the opinions of others, and using the "curve" (ranking amongst ones peers) when making an evaluation.
You seemed to side with the scientific method (logic, math, statistics, data analysis)
I think you know which approach I favor. That doesn't mean they both don't have some validity and that they each have room for error.
The main problem is that the senses and memory can lie. An unbiased eyewitness can be certain that he/she witnessed something, yet be dead wrong. It happens all the time. Memory only gets worse with time and people often hold on to their opinions more staunchly than ever in face of the facts.
I did this study because it was the best I could manage to attempt to remove all the extraneous factors that affect league scoring and ranking amongst differing peer groups. It takes actual production of a fixed group of top tier players across time and quantifies it. That is why it's an improvement upon existing adjusted data.
There are things the data can never capture, but the goal is to approach the limit of what the data can tell us and use it as an objective starting point for further discussion.
I know there's at least a handful of posters on this forum who actually can understand both the methodology and the implications of this study, because they have done some fantastic quantitative studies themselves. I hope some of them take the time to read the study and give some feedback as to the methodology and results. Better yet, someone with the proper database and code-writing skills could replicate a similar study in a relatively short time to either affirm or disprove my results. In the absence of such, I stand by the results as a significant improvement over existing adjusted data.
I think this work and others similar is very important for fans with interest in using math to study hockey. I am very happy that the OP outlined his methods.
But we still need more context to truly capture this. I feel we are in a Ptolemaic age just using numbers and a flawed belief (that past seasons must be adjusted downward). What would these calculations look like if the belief was that the 90's were a golden age and everything revolved around them? Of course this would be unacceptable to most fans of the day to see Ovie and Crosby adjusted downwards.
What's missing is true mathematical modelling. Like what's done for the weather. Maybe we need to take a step similar to the Drake equation. Try to state or isolate what goes into a player's season and whether it's directly or indirectly proportional.
Talent * injury * team talent * coach * competition * many factors concerning talent pool (players per pop, drafting). So many factors, and I'm sure more could be added or removed to achieve simplification with high accuracy.
I think something similar to a Drake equation and then people like the OP filling in the details with their analysis.
Until then this type of data manipulation just serves the fan base that thinks the heroes of today are better than the heroes of yore. Giroux is better than Gretzky type arguments. This stuff needs an asterisk and a safe place to keep it until the modelling reflects that the Earth goes around the sun. Talent not numbers.
We do not know which era was the most talented, or which era should be the center that all other numbers revolve around. Today it's just a moving target: the belief that talent is more prevalent per player than it was in any other era except next year's. Eventually Gretzky reduces to a 30 goal season.
Maybe we just need a talent quotient. A numerical estimation based simply on how much better a player was than his peers. A player or profile that is 100. Maybe we need a set of numbers according to different talents.
Coffey was a great skater was he the Newton of skaters? Was he a 196 Skating IQ? Has anyone been better compared to their peers? Just an example. I can't say Coffey was the best skater compared to his peers than any other qualified player in history. Orr was probably higher.
Stats are just a reflection of the player's talent in a context not analyzed with similar methods. Many of these adjustments just don't pass the eyeball test.
Just MHO. I'm sure others could state it much better. There's a lot of IQ on these boards.
Very interesting. I think the year-to-year results are particularly good. You can see the short-term changes in scoring conditions, especially in seasons where more power plays were awarded.
I'm not sold on the usefulness of this metric over longer time periods. If there's anything it's missing, at all, the error will build up over time.
First, this analysis only considers offensive production. Suppose that NHL players peak offensively earlier in their career than they peak defensively (where defensive play includes non-scoring factors such as strength of opposition and zone starts as well as actual defensive play.) Suppose also that NHL players receive playing time based on the sum of their offensive and defensive contributions. If these suppositions were fact, you would see NHL scorers generally tending to score fewer points than the previous season, but it would be a result of the natural aging curve of NHL scoring talent. It would not necessarily mean that it was becoming more difficult to score over time.
I also wonder how much your subjective choices affected the results. Ideally I would rather use an objective metric like estimated ice time (post-expansion only) to choose whether to include seasons or not. I realize that could be more difficult, depending on the data you have available. But generally speaking I think looking at usage rather than results is a good way to avoid "cherry-picking" successful results.
It's extremely difficult to separate the aging curve from the change in league talent level in this type of study. I don't think your study is flawed so much as I doubt whether one can ever put a lot of confidence in the results of a study that chains year-to-year scoring changes over decades, due to the difficulty of removing the aging curve. It might be worth running some numbers to test whether NHL players actually do have their offensive and defensive peaks at different ages on average. Something like an aging curve for PP time vs SH time, or for points vs qualcomp + zone start. While I haven't run the numbers on this, I believe goal scoring tends to peak earlier than playmaking among NHL players, so I think it's very possible that defensive contribution peaks at a different age as well.
I do think it has much usefulness over longer periods of time, because the effects are generally much larger over longer periods. If the effects were very minimal, then the uncertainty error inherent in any such study would overwhelm the possible usefulness of the data. I don't believe this is the case here, although I am unable to quantify this myself.
Otherwise, I think the age effect should be minimal, since each pair of seasons are examined separately.
I don't believe I "cherry-picked" results, although I may be misunderstanding your use of this phrase. I used players' entire careers, but eliminated some seasons for some players, when it seemed clear that:
- the player was not yet receiving close to full ice time or perhaps not even close to his prime level (e.g. ppg's from start of career are .50 .60 .90 1.00 .95.... eliminated first two as not reliable)
- the player seemed to have fallen off dramatically due to injury and/or age and was far past his prime (e.g. ppg's at end of career are 1.10 1.05 .95 .55 .60 .40... eliminated last three as not reliable)
- the player appeared to have a major injury in the middle of his career (e.g. ppg's are ... 1.10 1.15 1.05 .60 1.00 1.10 1.05... eliminated .60 as unreliable)
When in doubt, I left the data in, since any outliers would be filtered out by only using the median half or third of players, so large % changes in PPG would not affect the results much.
Perhaps I was not rigorous enough in balancing the age component, but believe me that this issue was given heavy consideration while doing this study.
Also, I don't see how anyone can deny that there is ever-increasing talent per player over longer periods of time. However, this was a study of top tier players, not just the average.
I actually thought there may be an even fairer way to do this, but I couldn't get it to work properly in Excel. This also relates to Overpass's concerns about age influencing the results.
If you use multi-variable linear regression and use independent variables such as player, age, season, etc., with points being the dependent variable, it seems to me like the results might be even more reliable, although I'm not certain of this.
It seems that it becomes tougher for top tier players to score adjusted points when there is a compression of talent in the league, particularly toward the top.
It's easier immediately after WWII, then becomes a bit tougher as the depleted talent is replaced or returns. As population increases (and perhaps hockey becomes more popular?) it becomes increasingly difficult in the mid-60's. Then expansion dilutes talent (esp. due to lack of parity) and it becomes much easier. The WHA siphons off talent and, along with continued expansion, makes the '70s a much easier era in this regard. The WHA merger and lack of expansion for many years makes the '80s a very tough era for adjusted points. Then the re-emergence of expansion makes it not quite as tough, but this effect is mitigated by the addition of top tier talent from overseas during the '90s. Since there hasn't been any expansion for many years, it's become tougher again.
I noted before how it became easier when there was a dramatic increase in PP opportunities. The exception seems to be 2006. One reason for this may be the lockout. Many players near the end of their careers may have chosen retirement at this time (Messier, Hull, Francis type players), while younger players were deprived of a season which might have helped them develop further. This may have negated the expected increase in adjusted PPG for the first line players.
For each age, I looked up the number of player-seasons by forwards from 1998-99 through 2011-12 with 1400+ minutes played. Then I repeated this using 4.5 Offensive Point Shares as the benchmark. Here are the results.
The aging curves are similar, but offensive production trends ahead of ice time at ages 20-24, and ice time trends ahead of offensive production starting in the late 20s. So either coaches should be giving young players more playing time (probably what a lot of people on this site would say;)), or time on ice is a good proxy for value and offensive production peaks earlier in a player's career than overall value.
Anyway, it strikes me that this may not be the main cause for bias. When looking at the long-term changes, the net effect of each player is simply the difference between his first season and his final season, correct? So it is actually very important which seasons you choose to start and end with for each player when looking at the long term trend, although not so much for individual seasons. Maybe you are satisfied that you have made good choices for each player about which seasons to select, but it's hard to evaluate the result when it's entirely built on subjective choices.
I don't think this significantly harmed the results, although it would always be better to have an objective standard.
The bolded part about the net effect being the difference between the first and last seasons for each player included in the study is not correct. All that matters is the % change from one season to the next. For almost all players, not all of their included seasons directly influence the calculations. That is why I used metrics such as the middle third and middle half in terms of % change. If a player's season N is excluded, it only affects the results for N-1 vs. N and N vs. N+1, and that's only if the player played the seasons before and after season N. If the change for that player from seasons N-1 to N or seasons N to N+1 were atypical compared to other players, it would have been filtered out as an outlier by taking the median third/half of players. If it wasn't atypical, then the results wouldn't have changed much with its inclusion.
Perhaps a very simplified example will illustrate this:
Bob 1.00, 1.10, .99, .99, 1.19
Dave 1.00, .90, .99, 1.18, 1.18
Joe 1.00, .80, .80, 1.00, 1.10
Steve 1.00, .90, .81, .81, .89
Tom 1.00, 1.00, 1.10, .99, .99
If you look at their changes from first to last:
So an average change from year1 to year5 is +7%
That is not how the results were calculated however. An analog to the methodology is to take the median 60% in this case (I used median 1/3 and median 1/2), IOW the middle 3 players in terms of % change in PPG.
Their % PPG changes were:
Bob +10, -10, 0, +20
Dave -10, +10, +20, 0
Joe -20, 0, +25, +10
Steve -10, -10, 0, +10
Tom 0, +10, -10, 0
So for seasons 1-2 the % changes are: +10, 0, -10, -10, -20
The middle 3 are 0, -10, -10. An average of these is -6.67%
For seasons 2-3 the % changes are: +10, +10, 0, -10, -10
The middle 3 are +10, 0, -10 for an average of 0.
For seasons 3-4 the % changes are: +25, +20, 0, 0, -10
The middle 3 are +20, 0, 0 for an average of +6.67%
For seasons 4-5 the % changes are: +20, +10, +10, 0, 0
The middle 3 are +10, +10, 0 for an average of 6.67%
So season 1 is our baseline of 100
Season 2 is (1-.0667)*100 = 93.33
Season 3 is (1-0)*93.33 = 93.33
Season 4 is (1+.0667)*93.33 = 99.56
Season 5 is (1+.0667)*99.56 = 106.20
So while the +7% average change from season 1 to 5 is highly correlated to the result, it is not the same as the +6.2% figure that was calculated using a method analogous to mine.
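For anyone who wants to check or extend the simplified example, it can be reproduced with a short script. This is only a sketch: it takes the rounded % changes exactly as listed for the five players, and `middle_mean` and `drop_each_side` are names introduced here for illustration.

```python
def middle_mean(values, drop_each_side):
    """Average what is left after dropping the extremes on each side."""
    ordered = sorted(values)
    kept = ordered[drop_each_side:len(ordered) - drop_each_side]
    return sum(kept) / len(kept)


# Rounded % changes as listed above, one row per player,
# one column per pair of consecutive seasons (1-2, 2-3, 3-4, 4-5).
changes = {
    "Bob":   [10, -10, 0, 20],
    "Dave":  [-10, 10, 20, 0],
    "Joe":   [-20, 0, 25, 10],
    "Steve": [-10, -10, 0, 10],
    "Tom":   [0, 10, -10, 0],
}

index = [100.0]   # season 1 is the baseline of 100
pair_means = []   # middle-3 average % change for each season pair
for pair in zip(*changes.values()):
    mean_pct = middle_mean(pair, drop_each_side=1)  # middle 3 of 5
    pair_means.append(mean_pct)
    index.append(index[-1] * (1 + mean_pct / 100))
```

Printing `pair_means` and `index` gives the per-pair averages and the chained index to compare against the hand calculation; changing `drop_each_side` shows how sensitive the result is to the choice of median band.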
When using the middle third or half in terms of % change, among the top 30-75 players which played in both seasons, most or all of the outliers are removed, but there is still a substantial sub-group left in the middle, whose change in production is used as the best approximation for that pair of seasons. There won't be such a high correlation between the result and a simple average of all the participating players' % changes from season X to season X+?.
I'm certain improvements could be made in different parts of the methodology, such as the means of selecting which players and which of their seasons to include. Realize also that any assumptions are going to influence the results. If you use a standard such as "any player who finished in the top X in at least Y seasons", then you'll probably end up with more weaker players from eras where the competition was less intense, while excluding some stronger players who didn't fit the criteria due to competition. Basically, it's a lot more difficult than it sounds to come up with strictly objective criteria free of bias.