Using Regression to Adjust "Adjusted Points" for Top Tier Players '68-12
11-19-2012, 06:51 AM
Originally Posted by Czech Your Math
I cannot get coefficients to generate for a particular model using LINEST. I have successfully used LINEST for other models, so it's likely a problem with the model (and its variables). The model is as follows:
one group of X variables are discrete variables for season (let's say there are 32 seasons, so it's either 0 or 1 for each possible season... it will only have a value of 1 for 1/32 of those for each player-season)
the next group of X variables are discrete variables for the player's age (if we use age ranges of 18-40, then the value is either 0 or 1 for each of 23 possible ages... and again it will only have a value of 1 for 1/23 of those for each player-season)
the next group of X variables are discrete and are for the player himself (the value is either 0 or 1 for each of the Q players in the study... and again will only have a value of 1 for 1/Q of those for each player-season)
there are other possible variables that I would like to add, but if I do, I will wait until I am able to successfully generate coefficients for the model as it already stands
I stopped at ~ a dozen players with a total of ~170 player-seasons. I thought since the degrees of freedom are df = N - k - 1 = 170 - (32 + 23 + 12) - 1 = 170 - 67 - 1 = 102, that coefficients should generate, but I'm obviously missing something and my linear regression knowledge is relatively basic and quite rusty.
Can anyone tell me why the coefficients won't generate? I don't want to put substantial time into this if it's not going to work. Any help would be appreciated.
If I understand correctly, your model is
Y = b1*X(s1980) + ... + b32*X(s2011) + b33*X(a18) + ... + b55*X(a40) + b56*X(p1) + ... + b67*X(p12)
where X(s1980)...X(s2011) are season dummy variables (also called indicator variables), X(a18)...X(a40) are player age dummies, and X(p1)...X(p12) are player (name) dummies. I'm not sure why you want those player dummies in there (p1..p12).
But to get back to your question, it's not a question of degrees of freedom. The regressors in your model must be linearly independent, and they aren't. For example, right now for every player-season you have X(s2011) = 1 - X(s1980) - X(s1981) - ... - X(s2010), i.e. the sum of those 32 dummy variables is 1. Same thing for the other 2 types: the sum of all variables of the same type for a given player-season is 1. Each group therefore adds up to the same column of 1s, which is also what the regression constant is, so the groups are redundant with each other and with the constant.
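To make the dependency concrete, here is a minimal sketch in plain Python. The sample is made up (4 hypothetical players over 4 seasons, with age taken as season minus an assumed birth year), and the `rank` helper is just ordinary Gaussian elimination written for illustration, not anything LINEST exposes. It shows a design with more rows than columns, so positive degrees of freedom, whose rank is still below the column count.

```python
from fractions import Fraction

# Hypothetical toy sample, not real data: 4 players x 4 seasons.
births = {"A": 1960, "B": 1959, "C": 1960, "D": 1959}  # assumed birth years
seasons = [1980, 1981, 1982, 1983]
rows = [(s, s - b, p) for p, b in births.items() for s in seasons]

# Levels of each dummy group: seasons, ages, players.
groups = [sorted({r[i] for r in rows}) for i in range(3)]

def dummies(r):
    """0/1 indicators for one player-season, grouped as [season, age, player]."""
    return [[1 if r[i] == lv else 0 for lv in g] for i, g in enumerate(groups)]

# Each group sums to exactly 1 on every row.
group_sums = {tuple(sum(g) for g in dummies(r)) for r in rows}

# Design matrix with a leading column of 1s for the constant
# (LINEST includes a constant unless told otherwise).
X = [[1] + [v for g in dummies(r) for v in g] for r in rows]

def rank(matrix):
    """Matrix rank via exact Gaussian elimination (Fractions avoid rounding)."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rk = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it depends on earlier columns
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

n_rows, n_cols = len(X), len(X[0])
print("group sums per row:", group_sums)
print("rows:", n_rows, "columns:", n_cols, "rank:", rank(X))
```

The same `rank` check can be rerun on any reduced set of columns before handing them to LINEST. If it still comes up short, some other dependency remains. One candidate, assuming age is always season minus a fixed birth year as in this toy sample, is that the season, age, and player dummies are tied together by that identity.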
A simple solution is to drop one of the dummies for each type, i.e. drop X(s1980), X(a18), and X(p1). You will still get an error for some age coefficients if your sample doesn't have anyone playing up to age 40 or as early as 18 (those dummies would be 0 on every row), but the rest of the model should work.
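As a rough sketch of that clean-up step for the age group, the snippet below finds ages in the 18-40 range that never occur in a sample and then picks the reference (dropped) level from the ages that do occur. The `observed_ages` list is invented for illustration. Choosing the reference among observed ages is my own suggestion: if the dropped dummy is one that is 0 on every row anyway, the remaining age dummies would still sum to 1.

```python
# Hypothetical observed ages, one entry per player-season (not real data).
observed_ages = [19, 20, 21, 21, 22, 23, 25, 25, 26]
candidate_ages = range(18, 41)  # the 18-40 range from the question

# Count how many player-seasons fall on each candidate age.
counts = {a: observed_ages.count(a) for a in candidate_ages}

# Ages nobody played at: their dummy column would be all zeros.
empty = [a for a in candidate_ages if counts[a] == 0]

# Keep only ages that occur, then drop one of them as the reference level.
kept = [a for a in candidate_ages if counts[a] > 0]
baseline, age_dummies = kept[0], kept[1:]

print("all-zero age columns to leave out:", empty)
print("reference age (no dummy):", baseline)
print("age dummies to keep:", age_dummies)
```

The same pattern would apply to the season and player groups; coefficients on the kept dummies are then read relative to the reference level.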
Find More Posts by barneyg