BACKGROUND: When regression is used to control for confounding in estimating the effect of a treatment, adjusting for a variable that affects the probability of receiving treatment but not the outcome (an instrumental variable) reduces precision without reducing bias. Propensity score methods, however, require that the functional form of the association between the confounders and the treatment be correctly specified.
OBJECTIVES: To investigate how controlling for an instrumental variable consisting of the interaction between two confounders affects the bias in the estimated treatment effect.
METHODS: Two uncorrelated, normally distributed variables, X1 and X2, were simulated with mean 0 and variance 1. A binary treatment variable, T, was then simulated with log odds(T = 1 | X1, X2) = X1 + X2 + X1*X2. A normally distributed outcome was simulated as Y = X1 + X2 + T + ε, where ε had a normal distribution with mean 0 and variance 1. A total of 1,000 samples, each of size 2,000, were simulated. The effect of T was estimated using linear regression models both including and excluding the interaction term X1*X2. Propensity scores were also calculated both including and excluding the interaction term, and inverse probability of treatment weighting (IPTW) was used to estimate the effect of T.
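The simulation design described in the methods can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' code: the function names, the random seed, the Newton-Raphson logistic fit, and the choice of a normalized (Hajek-type) weighted difference in means as the IPTW estimator are all assumptions, since the abstract does not specify them. Results from this sketch need not match the reported figures exactly.

```python
import numpy as np

TRUE_EFFECT = 1.0  # coefficient on T in Y = X1 + X2 + T + error


def simulate(n, rng):
    """Draw one sample from the data-generating model in the methods."""
    x1 = rng.standard_normal(n)
    x2 = rng.standard_normal(n)
    log_odds = x1 + x2 + x1 * x2
    t = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))
    y = x1 + x2 + TRUE_EFFECT * t + rng.standard_normal(n)
    return x1, x2, t, y


def ols_effect(y, t, covariates):
    """Coefficient on T from a linear regression of Y on T and covariates."""
    X = np.column_stack([np.ones_like(y), t] + covariates)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def fit_logistic(X, t, n_iter=50, tol=1e-10):
    """Logistic regression by Newton-Raphson (assumed fitting method)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (t - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def iptw_effect(y, t, covariates):
    """Weighted difference in mean outcome, weights 1/ps and 1/(1 - ps)."""
    X = np.column_stack([np.ones_like(y)] + covariates)
    ps = 1.0 / (1.0 + np.exp(-(X @ fit_logistic(X, t))))
    w = t / ps + (1 - t) / (1.0 - ps)
    treated_mean = np.sum(w * t * y) / np.sum(w * t)
    control_mean = np.sum(w * (1 - t) * y) / np.sum(w * (1 - t))
    return treated_mean - control_mean


def run_simulation(n_reps=1000, n=2000, seed=0):
    """Return the bias of each estimator in each replicate."""
    rng = np.random.default_rng(seed)
    names = ["ols_without", "ols_with", "iptw_without", "iptw_with"]
    bias = {name: np.empty(n_reps) for name in names}
    for r in range(n_reps):
        x1, x2, t, y = simulate(n, rng)
        main, full = [x1, x2], [x1, x2, x1 * x2]
        bias["ols_without"][r] = ols_effect(y, t, main) - TRUE_EFFECT
        bias["ols_with"][r] = ols_effect(y, t, full) - TRUE_EFFECT
        bias["iptw_without"][r] = iptw_effect(y, t, main) - TRUE_EFFECT
        bias["iptw_with"][r] = iptw_effect(y, t, full) - TRUE_EFFECT
    return bias


if __name__ == "__main__":
    for name, values in run_simulation().items():
        print(f"{name}: mean bias {values.mean():.3f}, SD {values.std(ddof=1):.3f}")
```

Because the log odds of treatment include the product X1*X2, some estimated propensity scores fall very close to 0 or 1, so the IPTW estimates in this sketch can be sensitive to a few heavily weighted observations.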
RESULTS: When regression was used to control for confounding, the mean bias in the estimated treatment effect was 0.00 whether or not the interaction term was included in the model, but the standard deviation of the estimates was lower when the interaction term was omitted (0.048 vs. 0.050). When IPTW was used to control for confounding, omitting the instrumental interaction term from the propensity score model led to biased estimates of the treatment effect (mean bias -0.36, SD 0.18). Including the interaction term reduced the mean bias to 0.02 (SD 0.19).
CONCLUSIONS: When an interaction between two confounders affects the probability of receiving treatment but not the outcome, and regression is used to control for confounding, the interaction term should not be controlled for. On the other hand, there are situations in which omitting an instrumental variable from a propensity score model will lead to the introduction of bias.