11-05-2012, 10:27 PM
#214
Quote:
Originally Posted by Czech Your Math
Using all 4 variables, the R-squared was 99.8% and the values for each X were as follows:
Xn= 1.05
Xg= 6.77
Xe= 16.4
Xp= 49.4
Using 3 variables (Xn excluded), the R-squared was 99.7% and the values of each X were as follows:
Xg= 7.83
Xe= 39.5
Xp= 92.8

Let me just explain this process for those unfamiliar with it. Each model calculates a coefficient for each variable, and together those coefficients are the values that produce the least total error (strictly, the smallest sum of the squared errors). The equation for the first (4 variable) model is:

Y = 1.05*Xn + 6.77*Xg + 16.4*Xe + 49.4*Xp

or

Avg. Adj. Pts. of Top N Players = (1.05 * # Teams) + (6.77 * League Avg. GPG) + (16.4 * Ratio of non-Canadian Top N to Total Top N) + (49.4 * Ratio of PP & SH Goals to Total Goals)
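To make the equation concrete, here is a minimal sketch that plugs numbers into the 4-variable model. The coefficients are the ones quoted above; the season inputs are invented purely for illustration (including the scale used for league GPG), and the function name `predict` is my own.

```python
# Coefficients quoted in the post for the 4-variable model (no intercept term).
COEFFS_4VAR = {"Xn": 1.05, "Xg": 6.77, "Xe": 16.4, "Xp": 49.4}


def predict(coeffs, inputs):
    """Return Y as the sum of coefficient * input over every variable."""
    return sum(coeffs[name] * inputs[name] for name in coeffs)


# Hypothetical season, made up for illustration only (not real league data).
example_season = {
    "Xn": 30,    # number of teams
    "Xg": 3.0,   # league average goals per game (scale is an assumption)
    "Xe": 0.50,  # ratio of non-Canadian top N to total top N
    "Xp": 0.25,  # ratio of PP & SH goals to total goals
}

predicted_points = predict(COEFFS_4VAR, example_season)
print(round(predicted_points, 2))
```

The same function works for the 3-variable model if you pass a coefficient table without Xn.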

In the second (3 variable) model, the coefficient values of Xe and Xp increase dramatically as a result of Xn being excluded. This is because in the first model, Xn captured a lot of the effect present in Xe and Xp. In other words, in most of the same seasons where there was an increase in teams (due to expansion), there was also increased representation by Euro/US players among the top N scorers and an increased number of PP opportunities. I would think that a lot of the effect caused by more non-Canadian players and more PP opportunities was mistakenly attributed to the increase in the number of teams, because all three variables rose in most of the same seasons.
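Here is a small sketch of that effect using made-up numbers, not the hockey data. Two variables that tend to rise together both feed into y; when one is dropped from the fit, the coefficient of the other grows to absorb its share. The helper names `fit_two` and `fit_one` are my own, and both fits are done without an intercept, matching the form of the equations above.

```python
def fit_two(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 (no intercept) via the normal equations."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def fit_one(x, y):
    """Least-squares fit of y = b*x (no intercept)."""
    return sum(a * c for a, c in zip(x, y)) / sum(a * a for a in x)


# Synthetic data: x1 and x2 rise together, and y is exactly 1*x1 + 2*x2.
x1 = [1, 2, 3, 4]
x2 = [1, 2, 2, 4]
y = [a + 2 * b for a, b in zip(x1, x2)]

b1_full, b2_full = fit_two(x1, x2, y)  # both variables included
b2_alone = fit_one(x2, y)              # x1 left out of the model

print(b1_full, b2_full, b2_alone)
```

With both variables in the fit, the coefficients come back as 1 and 2. With x1 left out, the coefficient on x2 rises to about 3.08, even though nothing about x2's real effect changed.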

Here's what the second model would predict:

A) For each .10 increase in league gpg, a .78 increase in avg. adj. pts. of top N scorers

B) For each 10 percentage point increase in top N forwards which were non-Canadian (e.g. from 20% to 30%), a 3.95 increase in avg. adj. pts. of top N scorers

C) For each 1.0 percentage point increase (e.g. from 22% to 23%) in PP/SH goals as a % of total goals, a .93 increase in the avg. adj. pts. of top N scorers
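As a rough check on the arithmetic, each prediction is just the second model's coefficient multiplied by the change in that variable. The sketch below assumes the two ratio variables are expressed as fractions (so 10 percentage points is 0.10 and 1 percentage point is 0.01); the function name is my own.

```python
# Coefficients quoted in the post for the 3-variable model (Xn excluded).
COEFFS_3VAR = {"Xg": 7.83, "Xe": 39.5, "Xp": 92.8}


def predicted_change(coefficient, delta_x):
    """Change in Y for a change of delta_x in one variable, others held fixed."""
    return coefficient * delta_x


change_a = predicted_change(COEFFS_3VAR["Xg"], 0.10)  # +0.10 league gpg
change_b = predicted_change(COEFFS_3VAR["Xe"], 0.10)  # +10 percentage points non-Canadian
change_c = predicted_change(COEFFS_3VAR["Xp"], 0.01)  # +1 percentage point PP/SH goals

print(round(change_a, 2), round(change_b, 2), round(change_c, 2))
```

With the quoted coefficients this works out to about 0.78, 3.95 and 0.93 respectively.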

There may be some small rounding or other errors present in each of the variables, but these shouldn't significantly affect the results. There are some alternative models that could be studied, but I would guess the most interesting modifications would be to the Y variable (using tiers of different quality or size), rather than to the X variables (I can't think of many other important X variables, except maybe one that measures parity between teams and/or players).