Wednesday, October 24, 2007

Underspecification of a model

To clarify what I was talking about today when I mentioned the problems with underspecifying one's model (that is, leaving out some relevant independent variables), see the discussion by economist Razvan Vlaicu, in his econometrics class notes, at . . .

"What Happens if we Underspecify or Overspecify a Linear Model?"

Now in fact, there are situations where underspecification of the model may be, if not desirable, at least understandable. One such situation is multicollinearity (i.e., the independent variables in the model are correlated with each other). When two independent variables are highly correlated, including both makes it hard to separate their effects and inflates the standard errors of their coefficients. That is why I raised the question of whether ethnicity is correlated with the family structure variables. (That could be why they didn't include those variables, but it would have been desirable for them to let the reader know that.)
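To make the multicollinearity point concrete, here is a minimal sketch in Python. It is my own illustration with simulated data, not anything taken from the article or from Vlaicu's notes, and the variables x1, x2_related and x2_unrelated are hypothetical. With two independent variables, the variance inflation factor is 1 / (1 - r^2), where r is the correlation between them; it says how much the variance of a coefficient estimate is inflated compared with the case where the two variables are uncorrelated.

```python
import random


def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / (s_aa * s_bb) ** 0.5


def vif_two_regressors(a, b):
    """Variance inflation factor when a and b are the only two regressors."""
    r = correlation(a, b)
    return 1.0 / (1.0 - r * r)


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical simulated regressors (assumptions for illustration only).
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2_unrelated = [rng.gauss(0, 1) for _ in range(n)]        # independent of x1
x2_related = [0.9 * v + rng.gauss(0, 0.3) for v in x1]    # strongly tied to x1

vif_unrelated = vif_two_regressors(x1, x2_unrelated)
vif_related = vif_two_regressors(x1, x2_related)

print("VIF with an unrelated second regressor:", round(vif_unrelated, 2))
print("VIF with a strongly related second regressor:", round(vif_related, 2))
```

The closer the second variable tracks the first, the larger the factor, which is the sense in which a researcher might be tempted to drop one of the two.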

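Also, in other types of regression, underspecifying a model can sometimes be desirable for the sake of precision: dropping a variable can shrink the standard errors of the remaining estimates, at the cost of some bias.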

However, I just didn't see anything in this article to suggest it was an explicit decision to leave out those family structure variables.

None of this is to say that the article and its research are invalid; there is still good reason to have faith in their conclusions. But if the model is underspecified, some of the coefficients they found insignificant may actually prove significant in a more complete model, and the overall significance of the findings may be distorted.
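A small simulation can show how a real effect can look like nothing when a related variable is left out. This is only a sketch under assumptions I made up: x1 and x2 are hypothetical variables, the true coefficients (1.0 and -1.25) are chosen so that the two effects roughly cancel, and the data are simulated rather than drawn from the article.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def slope_short(x, y):
    """Slope from regressing y on x alone (intercept included)."""
    return cov(x, y) / cov(x, x)


def slopes_full(x1, x2, y):
    """Slopes from regressing y on x1 and x2 together (intercept included)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical data: x2 is correlated with x1, and both affect y.
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + rng.gauss(0, 0.6) for a in x1]
y = [1.0 * a - 1.25 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

b1_full, b2_full = slopes_full(x1, x2, y)   # complete model
b1_short = slope_short(x1, y)               # underspecified model, x2 omitted

# In-sample, the short slope equals the full slope plus the omitted
# variable's slope times the slope of x2 on x1.
delta = cov(x1, x2) / cov(x1, x1)

print("slope on x1, full model: ", round(b1_full, 3))
print("slope on x1, x2 omitted: ", round(b1_short, 3))
print("full slope + b2 * delta: ", round(b1_full + b2_full * delta, 3))
```

With these made-up numbers, the slope on x1 in the underspecified model should land near zero even though its true value is 1.0, because the omitted x2 pulls in the opposite direction. The gap between the two slopes is what is usually called omitted variable bias.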
