Why we need the lower-order-terms ...

This is the second article in our series about non-hierarchical models. Here I want to point out the problems of non-hierarchical regression models, that is, models that include higher-order terms without the corresponding lower-order terms.


Why we need the intercept ...

This is the first article in the series on non-hierarchical regression models. In this article we discuss the pitfalls of fitting a regression without an intercept term.

A common question is whether the intercept term may be removed from a regression analysis if it is not significant. Most of the time the answer to this question should be "No!", and I fully agree with that answer. But of course we would like to understand why!

A Discussion on Non-Hierarchical Regression Models

In many of our trainings there comes a point where we advise respecting "model hierarchy" in multiple linear regression models. Model hierarchy means: if a higher-order term is included in the model, all corresponding lower-order terms should be in the model, too.

If you have estimated the model $Y = b_0 + b_1 X_1 + b_2 X_2 + b_{12} X_1 \cdot X_2 + \epsilon$ and see that $b_{12}$ is significantly different from 0, the main effects $X_1$ and $X_2$ should stay in the model in any case, no matter whether they are significant themselves or not.
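One way to see why the main effects matter is to shift the origin of one predictor. The constant $c$ and the shifted variable $X_1'$ below are assumptions introduced only for this illustration. Substituting $X_1 = X_1' + c$ into the model gives

$$ Y = (b_0 + b_1 c) + b_1 X_1' + (b_2 + b_{12} c) X_2 + b_{12} X_1' \cdot X_2 + \epsilon $$

The coefficient of $X_2$ becomes $b_2 + b_{12} c$. So even if $b_2 = 0$ on the original scale, a main effect of size $b_{12} c$ appears as soon as $X_1$ is measured from a different origin (think of Celsius versus Kelvin). Whether a main effect looks negligible therefore depends on an arbitrary choice of scale, while the interaction coefficient $b_{12}$ stays the same.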

In the following series of articles I want to explain why we give this recommendation.

In the first article I will present the problems that occur if the intercept term is dropped from the model. This is the simplest way of violating the model hierarchy. I will motivate why we should prefer:

$$ y = b + b_1 x_1 + \epsilon $$

in nearly all cases over:

$$ y = b_1 x_1 + \epsilon $$
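To make the difference between the two models concrete, here is a minimal sketch in plain Python. The data are made up for illustration (a noise-free line with intercept 10 and slope 2), and the helper names `fit_with_intercept` and `fit_through_origin` are hypothetical; they simply implement the textbook least-squares formulas for the two models.

```python
def fit_with_intercept(x, y):
    """Least-squares fit of y = b + b1 * x; returns (b, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_through_origin(x, y):
    """Least-squares fit of y = b1 * x (no intercept); returns b1."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)


# Illustrative, made-up data: an exact line with intercept 10 and slope 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0 + 2.0 * xi for xi in x]

b, b1 = fit_with_intercept(x, y)
b1_origin = fit_through_origin(x, y)

print("with intercept:  b =", round(b, 3), " b1 =", round(b1, 3))
print("through origin:  b1 =", round(b1_origin, 3))
```

With the intercept, the fit recovers the slope of 2 exactly. Forced through the origin, the slope estimate becomes 260/55, roughly 4.73, because the slope has to absorb the missing intercept.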

The second article will explain the difficulties that arise whenever you estimate a non-hierarchical model by removing lower-order terms while keeping the corresponding higher-order terms. This includes not only interactions of two or more factors but polynomial terms as well. So our general advice applies to models like

$$ y = b + b_1 x_1 + b_{11} x_1^2 + \epsilon$$

as well.
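A small sketch of what can go wrong in the polynomial case, again in plain Python and under stated assumptions: the data are made up and follow the exact parabola $y = (x_1 - 3)^2$, and the helper `fit_simple` is a hypothetical name for the usual simple least-squares formulas. Dropping the linear term $b_1 x_1$ leaves a model of the form $y = b + b_{11} x_1^2$, which can be fitted by regressing $y$ on $z = x_1^2$.

```python
def fit_simple(z, y):
    """Least-squares fit of y = b + c * z; returns (b, c)."""
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    szz = sum((zi - mean_z) ** 2 for zi in z)
    c = szy / szz
    return mean_y - c * mean_z, c


# Illustrative, made-up data: an exact parabola with its minimum at x1 = 3.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [(xi - 3.0) ** 2 for xi in x1]

# Non-hierarchical model y = b + b11 * x1^2 (linear term removed).
z = [xi ** 2 for xi in x1]
b, b11 = fit_simple(z, y)

fitted = [b + b11 * zi for zi in z]
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

print("b =", round(b, 4), " b11 =", round(b11, 4))
print("residual sum of squares =", round(rss, 4))
```

Without the linear term, the fitted curve is forced to have its vertex at $x_1 = 0$, although the data have their minimum at $x_1 = 3$. The fit returns $b = 3$ and $b_{11} = 1/13$ and leaves a clearly non-zero residual sum of squares, whereas the full hierarchical model would reproduce these data exactly.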

Of course there are exceptions to these general rules. The last article will give some examples where one might use non-hierarchical models.