Using Regression to Adjust "Adjusted Points" for Top Tier Players '68-12
11-19-2012, 04:58 PM
Originally Posted by Czech Your Math
Thanks, this is very insightful. You are right, the variables within each category are not independent, since exactly one dummy variable in each category has value = 1 and the rest are 0, so they always sum to 1.
If I understand you correctly, you suggest that eliminating one dummy variable in each category will solve that problem, but I don't see how that would change the fact that the variables within each category are not independent. BTW, in the simplified (small scale) model, I still had 3+ observations for each dummy variable. IOW, there were at least 3 observations at each age, at least 3 for each season, and at least 3 for each player.
It sounds to me like this type of model just isn't possible.
It's definitely possible -- the only reason you aren't getting a solution is that you fell into the dummy variable trap. If you remove the 2011 dummy, none of the remaining dummies can be expressed as a linear combination of the others because it's no longer true that the sum of all dummies is equal to 1 for each observation (as the sum will be 0 for the 2011 observations).
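To make the trap concrete, here is a minimal sketch in Python with NumPy. The six observations and three seasons are made-up toy data, not the actual dataset from this thread, and the sketch assumes the regression includes an intercept (a column of ones), which is what the season dummies add up to.

```python
import numpy as np

# Hypothetical toy data: six observations spread over three seasons.
seasons = np.array([2009, 2009, 2010, 2010, 2011, 2011])
levels = [2009, 2010, 2011]

# One 0/1 column per season; each row has exactly one 1.
dummies = np.column_stack([(seasons == s).astype(float) for s in levels])

# Assumption: the model has an intercept column of ones.
intercept = np.ones((len(seasons), 1))

# Intercept plus all three season dummies: the dummies sum to the intercept.
X_full = np.hstack([intercept, dummies])

# Intercept plus dummies with 2011 dropped, so 2011 becomes the baseline.
X_drop = np.hstack([intercept, dummies[:, :2]])

rank_full = np.linalg.matrix_rank(X_full)
rank_drop = np.linalg.matrix_rank(X_drop)

print("full:   ", X_full.shape[1], "columns, rank", rank_full)
print("dropped:", X_drop.shape[1], "columns, rank", rank_drop)
```

With all three dummies the design matrix has 4 columns but only rank 3, so the least-squares coefficients are not uniquely determined. With the 2011 dummy dropped it has 3 columns and rank 3, and the coefficients on 2009 and 2010 are read as differences from 2011.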
"Linear independence" is quite a bit more permissive than "independence" as usually defined in probability theory. You can have heavily correlated variables -- this creates other problems, but they will still be considered linearly independent as long as one of those variables isn't completely determined by a linear combination of the others.
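A quick way to see the difference, again as a hedged sketch with simulated numbers (the variables `x1`, `x2`, `x3` and the noise scale are arbitrary choices for illustration): two heavily correlated columns still have full rank, while a column that is an exact multiple of another does not.

```python
import numpy as np

rng = np.random.default_rng(0)

x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)  # heavily correlated with x1, not identical
x3 = 2.0 * x1                          # exactly determined by x1

# Correlation between x1 and x2 is very close to 1.
corr = np.corrcoef(x1, x2)[0, 1]

# Rank 2: x1 and x2 are still linearly independent.
rank_corr = np.linalg.matrix_rank(np.column_stack([x1, x2]))

# Rank 1: x3 is a linear combination of x1, so the pair is linearly dependent.
rank_exact = np.linalg.matrix_rank(np.column_stack([x1, x3]))

print("corr(x1, x2):", corr)
print("rank [x1, x2]:", rank_corr)
print("rank [x1, x3]:", rank_exact)
```

The correlated pair can still be fit, although the estimates will tend to be imprecise (the "other problems" mentioned above). Only the exactly dependent pair prevents a unique solution.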