11-09-2012, 02:45 PM
Czech Your Math
Originally Posted by barneyg
I'll try my best to get this back on track..

You report coefficients on each regressor but it's hard to really make sense of the results without the t-statistics. Given the insignificant drop in R-squared when you drop Xn I would assume that 1.05 coefficient is insignificant but I'd like to see the others.

No, Xn actually appeared significant to me (Bn was almost 4x SEn, i.e. a t-statistic of almost 4). The least significant appeared to be Xp, with Bp ~1.5x SEp. What's strange is that the individual correlations with Y were:

Xn = 7%, Xp = 47%, Xe = 27%, and Xg = (-10%)

I thought Xn was coincidentally capturing a lot of the other variables, so I wanted to see what the coefficients looked like without Xn as one of the variables.
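
As a rough illustration of how a variable can have almost no individual correlation with Y and still come out significant in the full model, here is a minimal Python sketch on synthetic data (not the actual hockey data). The names `x1`, `x2` and `ols_t_stats` are made up for the example; `x1` plays the role of a variable like Xn, and the t-statistic is just the coefficient divided by its standard error (B / SE).

```python
import numpy as np

def ols_t_stats(X, y):
    """Least-squares fit of y = a + X @ b; returns coefficients, standard errors, t-stats."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # prepend an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # OLS coefficients
    resid = y - A @ beta
    dof = n - A.shape[1]                           # observations minus fitted parameters
    sigma2 = resid @ resid / dof                   # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return beta, se, beta / se                     # t = B / SE

# Synthetic data (an assumption for illustration only):
# x1 is negatively related to x2, and both push y upward.
rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x1 = -0.5 * x2 + 0.5 * rng.normal(size=n)
y = x1 + x2 + 0.5 * rng.normal(size=n)

r_x1 = np.corrcoef(x1, y)[0, 1]                    # individual correlation of x1 with y
beta, se, t = ols_t_stats(np.column_stack([x1, x2]), y)

print("corr(x1, y):", round(r_x1, 3))
print("coefficients (const, x1, x2):", np.round(beta, 3))
print("t-statistics (const, x1, x2):", np.round(t, 1))
```

In this setup the individual correlation of `x1` with `y` is close to zero because its own effect is masked by its negative relationship with `x2`, yet its coefficient is clearly significant once `x2` is in the model. Whether something similar is going on with Xn is only a guess without seeing the data.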

Originally Posted by barneyg
I don't think you can make a judgement on how solid those models are for predicting anything based on R-squared, as that Y series is probably fairly stable. If you regressed Y on a constant you'd get a pretty high R-squared too.

What's the best way to judge models with a relatively stable Y? Just look at the significance of each individual coefficient, or is there a better way of judging/comparing models in such cases?

Originally Posted by barneyg
My main takeaway:

Y is adjusted to 6 GPG (HR method), right? If your regression used all the players in the league, by definition you'd get Xg=0, because that's what the adjustment does. You have Xg>0 for the top 5% of players, that means the top guys are further away from the mean in high-scoring seasons than in low-scoring seasons. The adjustment may not bring down the top guys enough in high scoring seasons.

Am I correct?

Yes, you seem to understand the process and model well, and are correct in each case. One caveat is that Xg on its own had a small negative correlation with Y (-10%), but when it was included in the models its coefficient became positive, and it was also the most significant variable in the 4-variable model (Bg ~7x SEg).
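
To make barneyg's point concrete, here is a small Python sketch using made-up numbers. It assumes a simplified adjustment (raw points multiplied by 6 / league GPG), which is only a stand-in for the HR method and not its actual formula, and it assumes the spread between players is wider in high-scoring seasons. `gpg`, `n_players` and `slope` are hypothetical names for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
TARGET_GPG = 6.0                       # scoring level every season is adjusted to
gpg = np.linspace(5.5, 8.0, 6)         # hypothetical league GPG, one value per season
n_players = 400
k = n_players // 20                    # size of the top 5% group each season

x_all, y_all, x_top, y_top = [], [], [], []
for g in gpg:
    # Assumption: player-to-player spread grows with the league scoring level.
    w = rng.lognormal(mean=0.0, sigma=0.4 + 0.1 * (g - TARGET_GPG), size=n_players)
    raw = g * w / w.mean()             # raw points; the league average equals g
    adj = raw * TARGET_GPG / g         # simplified adjustment to a 6 GPG environment
    top = np.sort(adj)[-k:]            # adjusted points of that season's top 5%
    x_all += [g] * n_players
    y_all += list(adj)
    x_top += [g] * k
    y_top += list(top)

def slope(x, y):
    """Slope of a straight-line fit of y on x."""
    return np.polyfit(x, y, 1)[0]

b_all = slope(x_all, y_all)            # all players: adjusted mean is flat across seasons
b_top = slope(x_top, y_top)            # top 5% only

print("slope on GPG, all players:", round(b_all, 6))
print("slope on GPG, top 5%:", round(b_top, 3))
```

With every player included, the adjusted average is the same in each season, so the slope on league GPG is zero by construction. Restricted to the top 5%, the slope is positive in this setup, which is the pattern a positive Bg would suggest: the adjustment does not pull the top scorers down enough in high-scoring seasons.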

Can you help with this question as well:

I know there must be a way to find a more accurate estimate of how difficult each season was for top players to score points, and I believe regression will likely yield the most accurate estimate possible. Unfortunately, my skills with it are obviously limited, and no one else seems interested in pursuing this avenue, despite my previous suggestions to do so.
