Perhaps an easy question, but darned if I can find the answer in my reference books.

If you are looking at the interaction between a continuous and a categorical variable, how do you calculate the interaction term? I understand that if you had two continuous variables the term would of course be X12 = X1 * X2, but how do you create an equivalent term if X2 is in fact a series of dummy-coded variables (i.e. 0 or 1)?
I suspect I will feel silly once someone provides an answer.

Robert Butler
Participant

If your categorical variable has only two levels [0,1] then you build the interaction the same way you do for two continuous variables: multiply X1 by the 0/1 variable.  If the categorical variable has more than 2 levels then you will need to create a set of dummy variables (one fewer than the number of levels) and build the interaction terms with these.
For instance, if the categorical variable Cat1 has 3 levels 1,2,3 then you will need to build the dummy variables d1 and d2 such that
If Cat1 = 1 then d1 = 1 and d2 = 0
If Cat1 = 2 then d1 = 0 and d2 = 1
If Cat1 = 3 then d1 = 0 and d2 = 0
Then for the interaction of continuous X1 and Cat1 the terms in the model would be:
X1*d1 and X1*d2
and the full model for the regression would be
Y = fn(X1, d1, d2, X1*d1, X1*d2)
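As a rough illustration of the coding above, here is a minimal Python sketch that builds the model columns for one observation. The function names (dummy_code, design_row, slope_for_level), the leading intercept column, and the sample values are assumptions made for the example; they are not from any particular statistics package.

```python
def dummy_code(cat1):
    """Return (d1, d2) for a categorical coded 1, 2, 3, with level 3 as the baseline."""
    d1 = 1 if cat1 == 1 else 0
    d2 = 1 if cat1 == 2 else 0
    return d1, d2


def design_row(x1, cat1):
    """Build the columns [intercept, X1, d1, d2, X1*d1, X1*d2] for one observation."""
    d1, d2 = dummy_code(cat1)
    return [1.0, x1, d1, d2, x1 * d1, x1 * d2]


def slope_for_level(coefs, cat1):
    """Slope of Y against X1 at a given level, for coefficients in design_row order."""
    d1, d2 = dummy_code(cat1)
    return coefs[1] + coefs[4] * d1 + coefs[5] * d2


# Hypothetical observations: (X1, Cat1)
observations = [(2.0, 1), (5.0, 2), (3.5, 3)]
for x1, cat1 in observations:
    print(cat1, design_row(x1, cat1))
```

Each interaction column is just X1 where the matching dummy is 1 and zero elsewhere, so its coefficient measures how much the X1 slope at that level differs from the slope at the baseline level (Cat1 = 3 here). Fitting the coefficients themselves is left to whatever regression routine you already use.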
The book Applied Regression Analysis, 2nd Edition, by Draper and Smith has a whole section on the use of dummy variables in regression.

