A new document on what changes and what remains the same in regressions when you change the inputs
Draft, Feb 19, 2010
Given a model:
Y = Const + B_{1}X_{1} + B_{2}X_{2} + ... + B_{n}X_{n} + Residuals
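As a rough illustration of the quantities this document tracks (the Bs, their T-statistics, the sample size N, and the R-squared), here is a minimal Python sketch of fitting the model above by ordinary least squares. The function name `fit_ols` and the simulated data are assumptions made for illustration only; statistical packages report the same quantities with more safeguards.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit of y = Const + X @ B + residuals.

    Returns the coefficients (constant first), their T-statistics,
    the sample size N, and the R-squared.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])      # prepend the constant
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    rss = resid @ resid                            # residual sum of squares
    dof = n - design.shape[1]                      # N minus number of estimated terms
    cov = (rss / dof) * np.linalg.inv(design.T @ design)
    t_stats = coef / np.sqrt(np.diag(cov))         # coefficient / standard error
    r2 = 1.0 - rss / np.sum((y - y.mean()) ** 2)
    return coef, t_stats, n, r2

# Hypothetical data: two predictors, 200 cases.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=200)

coef, t_stats, n, r2 = fit_ols(X, y)
print("Bs:", coef)
print("T-statistics:", t_stats)
print("N:", n, "R-squared:", r2)
```

Each row of the table that follows describes how one or more of these four outputs respond to a change in the inputs.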
Type of Change 
Effect on Coefficients (Bs) 
Effect on T-statistic of that coefficient
Effect on sample size of the model 
Effect on goodness of fit of the model 
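1) Change of units of one variable, X_{1}
Coefficients (Bs): B_{1} is rescaled by the inverse of the conversion factor, so its units change. For example, if X_{1} is converted from hours to minutes, B_{1} is divided by 60. The other slope coefficients are unchanged.
T-statistic of that coefficient: No change; T-statistics are unit-free.
Sample size of the model: None.
Goodness of fit of the model: None.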
2) Inclusion of a new, formerly excluded category of variable X_{1}
Coefficients (Bs): May or may not change B_{1}. If B_{1} was a comparison between nurses and lawyers, and the newly added group is sociologists, B_{1} won’t change as long as there are no other predictor variables. If there are other predictor variables, all coefficients will generally change.
T-statistic of that coefficient: The T-statistic will change, if for no other reason than that the variance of the dependent variable Y, and with it the pooled residual variance, is now different.
Sample size of the model: Yes. Including new cases changes the N of the model.
Goodness of fit of the model: Yes.
3) Inclusion of a new predictor variable, X_{m}
Coefficients (Bs): All the coefficients are jointly estimated, so a new variable changes the other slope coefficients already in the model, unless it is exactly uncorrelated with the other predictors in the sample. This is one reason we do multiple regression: to estimate coefficient B_{1} net of the effect of variable X_{m}.
T-statistic of that coefficient: Yes.
Sample size of the model: Usually no change. The inclusion of a new predictor variable changes the sample size of the model only if the new predictor variable has missing values. Any cases with missing values on any predictor variable are dropped automatically.
Goodness of fit of the model: Yes. R-squared cannot fall when a term is added, and any new term with a nonzero coefficient improves the fit. Adjusted R-squared rises only if the new term improves the fit enough to offset the lost degree of freedom (for a single added variable, when the absolute value of its T-statistic exceeds 1), and falls if the new term makes little or no difference.
4) Changing the excluded category of some variable already entered
Coefficients (Bs): No. The initial output reported by the software will be different, but all of the same comparisons as before can be recovered by combining the reported Bs, and when recovered they are the same.
T-statistic of that coefficient: No changes, when looking at the same comparisons.
Sample size of the model: No change.
Goodness of fit of the model: No change.
5) Weighting with analytic weights
Coefficients (Bs): Unless the weights are uniform, the weights will change the coefficients.
T-statistic of that coefficient: Yes.
Sample size of the model: No change. Analytic weights are rescaled so that the sample size stays the same.
Goodness of fit of the model: Yes.
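6) Weighting with frequency weights
Coefficients (Bs): Coefficients will behave the same as with analytic weights.
T-statistic of that coefficient: Dramatic changes here, because the changed N changes the standard errors, and therefore also the T-statistics.
Sample size of the model: Dramatic changes. N becomes the sum of the frequency weights.
Goodness of fit of the model: Yes.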
7) Changing the sample size, N, of the dataset
Coefficients (Bs): In theory, the expected value of B is not affected by changes in sample size. In practice, a different sample (larger or smaller) gives a different B because of sampling variation. You can easily take a random subset of any dataset and find that the Bs in the subset differ from the Bs in the full dataset.
T-statistic of that coefficient: The expected values of T-statistics are approximately proportional to the square root of N. If you quadruple the sample size, you would expect T-statistics to double, giving you greater power to reject null hypotheses. Of course, in a different sample the actual T-statistics will not change by exactly the square root of N, because sampling variation comes into play.
Sample size of the model: Yes (duh).
Goodness of fit of the model: Yes, largely because of the sampling variation.
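Several of the entries above are exact algebraic facts and can be checked numerically. The Python sketch below does this for rows 1, 3, and 4, and illustrates the square-root-of-N arithmetic behind rows 6 and 7 by entering every case four times (equivalent to a frequency weight of 4 on each case). All variable names and the simulated data are hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Return (coefficients with constant first, T-statistics, R-squared)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    rss = resid @ resid
    sigma2 = rss / (len(y) - design.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(design.T @ design)))
    r2 = 1.0 - rss / np.sum((y - y.mean()) ** 2)
    return coef, coef / se, r2

# Hypothetical data: hours worked, a correlated second predictor,
# and a three-category occupation variable coded 0, 1, 2.
rng = np.random.default_rng(42)
n = 300
hours = rng.normal(40, 8, size=n)
x2 = 0.1 * (hours - 40) + rng.normal(size=n)
group = rng.integers(0, 3, size=n)
y = (3.0 + 0.05 * hours + 0.8 * x2
     + np.array([0.0, 1.0, -0.5])[group] + rng.normal(size=n))

# Row 1: change of units (hours -> minutes).
X_hr = np.column_stack([hours, x2])
X_min = np.column_stack([hours * 60, x2])
b_hr, t_hr, r2_hr = fit_ols(X_hr, y)
b_min, t_min, r2_min = fit_ols(X_min, y)
units_ok = (np.isclose(b_min[1] * 60, b_hr[1])   # B_1 divided by 60
            and np.allclose(t_min, t_hr)         # T-statistics unchanged
            and np.isclose(r2_min, r2_hr))       # fit unchanged

# Row 3: adding a correlated predictor changes B_1; R-squared cannot fall.
b_small, _, r2_small = fit_ols(hours, y)
b_big, _, r2_big = fit_ols(X_hr, y)

# Row 4: changing the excluded category of the occupation dummies.
d = np.eye(3)[group]                                  # one dummy per category
b_ref0, t_ref0, r2_ref0 = fit_ols(d[:, [1, 2]], y)    # category 0 excluded
b_ref2, t_ref2, r2_ref2 = fit_ols(d[:, [0, 1]], y)    # category 2 excluded
gap_direct = b_ref0[1]                    # category 1 vs 0, reported directly
gap_recovered = b_ref2[2] - b_ref2[1]     # same comparison, recombined
recode_ok = (np.isclose(gap_direct, gap_recovered)
             and np.isclose(t_ref0[2], -t_ref2[1])    # 2 vs 0 is minus (0 vs 2)
             and np.isclose(r2_ref0, r2_ref2))

# Rows 6 and 7: enter every case four times. The Bs stay put, while the
# T-statistics grow by sqrt((4n - k) / (n - k)), close to sqrt(4) = 2.
b1, t1, _ = fit_ols(X_hr, y)
b4, t4, _ = fit_ols(np.tile(X_hr, (4, 1)), np.tile(y, 4))
t_ratio = t4 / t1

print("units check:", units_ok, "| recode check:", recode_ok)
print("R-squared without and with x2:", r2_small, r2_big)
print("T-statistic ratio after quadrupling N:", t_ratio)
```

Duplicating cases adds no new information, so the larger T-statistics here show only the arithmetic of N. A genuinely new sample four times as large would also change the Bs through sampling variation, as row 7 notes.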