Using Regression to Adjust "Adjusted Points" for Top Tier Players '68-12
11-16-2012, 02:13 AM
Czech Your Math
I ran another regression from 1976-2012 with a y-intercept (B) and the same coefficients (Mn... Mp for Xn... Xp). Values in parentheses are negative:
Mn = (0.75)
Me = 7.15
Mg = (1.65)
Mp = 61.8
R^2 = 0.32
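For anyone who wants to plug numbers in, here is a minimal sketch of the fitted line, Y = B + Mn*Xn + Me*Xe + Mg*Xg + Mp*Xp. It assumes the parenthesized coefficients are negative. The intercept B isn't listed above, so it is left as an argument, and the function name `predict_y` is just made up for illustration. The X values have to be in the same units that were used in the regression.

```python
# Coefficients as listed above; parentheses are read as negative signs (assumption).
COEFFS = {"Xn": -0.75, "Xe": 7.15, "Xg": -1.65, "Xp": 61.8}

def predict_y(intercept, xn, xe, xg, xp):
    """Return the model's predicted Y for one season.

    `intercept` is the y-intercept B, which is not given in the post,
    so the caller has to supply it.
    """
    return (intercept
            + COEFFS["Xn"] * xn
            + COEFFS["Xe"] * xe
            + COEFFS["Xg"] * xg
            + COEFFS["Xp"] * xp)
```

With R^2 = 0.32, the output should be treated as a rough trend line, not a season-by-season forecast.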
All variables appear significant; the lowest t-stat is the one for Xg at ~9 (N=36, Mg/SEg ~1.5).
There is a large cross-correlation between many of the variables:
Xn & Xe = 88% ... did NHL expand in response to Euro influx? I think this is largely coincidental.
Xn & Xg = (83%) ... did expansion make goal scoring decrease? This wouldn't be the expected observation IMO.
Xn & Xp = 45% ... I don't see a logical relationship between expansion and increased power plays, but it's possible.
Xe & Xg = (79%) ... did Euro influx cause goal scoring to decrease? Considering Euros were disproportionately scoring forwards, this seems odd, although talent compression tends to decrease scoring IMO.
Xe & Xp = 50% ... did Euro influx cause an increase in power plays? I don't see why, esp. as it contradicts other correlation(s).
Xg & Xp = (18%) ... not much of a correlation, but why would power plays and goal scoring be negatively correlated?
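The pairwise figures above are plain correlation coefficients between the predictor columns. A minimal sketch of how such a table can be computed is below, using only the standard library. The helper names `pearson` and `correlation_table` are hypothetical, and the series in the usage note are abstract placeholders, not the actual league data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)

def correlation_table(columns):
    """Map each unordered pair of column names to its correlation."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

For example, `correlation_table({"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [8, 6, 4, 2]})` pairs A with B (perfectly positive) and A with C (perfectly negative). When predictors correlate as strongly as Xn and Xe do here, the individual coefficients become hard to interpret even if the overall fit is unaffected.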
BTW, in case I wasn't clear before, Y = avg. of top N players' points. This means Y is per 82 games, adjusted to a 6.00 gpg league avg. and an assist/goal ratio of 5/3.
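As a rough illustration of that normalization, here is one plausible reading of it, not necessarily the exact formula used for Y. It assumes goals and assists are both scaled to an 82-game schedule and a 6.00 gpg league, and that assists are additionally scaled so the league assist/goal ratio becomes 5/3. The function names and parameters are made up for the sketch.

```python
TARGET_GAMES = 82
TARGET_GPG = 6.00
TARGET_ASSISTS_PER_GOAL = 5 / 3

def adjusted_points(goals, assists, schedule_games, league_gpg,
                    league_assists_per_goal):
    """Scale one player's season to the 82-game, 6.00 gpg, 5/3 A/G baseline.

    This is an assumed formula, shown only to make the idea concrete.
    """
    scale = (TARGET_GAMES / schedule_games) * (TARGET_GPG / league_gpg)
    adj_goals = goals * scale
    adj_assists = assists * scale * (TARGET_ASSISTS_PER_GOAL
                                     / league_assists_per_goal)
    return adj_goals + adj_assists

def top_n_average(adjusted_totals, n):
    """Average of the n highest adjusted point totals (the Y variable)."""
    if n < 1 or n > len(adjusted_totals):
        raise ValueError("n must be between 1 and the number of players")
    top = sorted(adjusted_totals, reverse=True)[:n]
    return sum(top) / n
```

A season that already matches the baseline (82 games, 6.00 gpg, 5/3 assists per goal) comes through unchanged, which is a quick sanity check on the scaling.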
My main concerns with this model are that important variables may be missing (given the low R^2) and that most of the variables are cross-correlated. I don't think roster size is an issue during this period. What other variables may be missing? I think the cross-correlation of variables is largely coincidental, but can the variables be better defined to prevent this?
I think one factor that's not fully captured by the variables is the quality of talent in the league. This is captured somewhat by % of non-Canadian forwards in top 1N (Xe), but that doesn't tell you if a lot of great talent is in its peak/prime. For instance, the largest error between the predicted value and actual value is in 1996, when the predicted value is lower than the actual one. One only has to look at the top 20+ players in scoring to see that season was full of quality forwards (although power plays increased substantially as well). Another possible variable is something for league parity (standard deviation of GF/GA ratio or GF only or GA only?).
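To make the parity idea concrete, here is a minimal sketch of one of the candidates mentioned, the standard deviation of team GF/GA ratios within a season. The function name `parity_spread` is hypothetical, and using the population standard deviation is a choice made for the sketch. Whether this measure would actually improve the model is untested.

```python
from statistics import pstdev

def parity_spread(goals_for, goals_against):
    """Population std. dev. of team GF/GA ratios for one season.

    A smaller value means teams are closer together (more parity).
    """
    if len(goals_for) != len(goals_against) or not goals_for:
        raise ValueError("need matching, non-empty GF and GA lists")
    ratios = [gf / ga for gf, ga in zip(goals_for, goals_against)]
    return pstdev(ratios)
```

A league where every team has GF equal to GA gives 0, and the value grows as the gap between strong and weak teams widens. Swapping the ratio for GF alone or GA alone only changes the line that builds `ratios`.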