
Seminars

Missing Data and Multiple Imputation

  • 2002-12-06 (Fri.), 10:30 AM
  • Recreation Hall, 2F, Institute of Statistical Science
  • Dept. of Psychiatry
  • Univ. of California, Los Angeles, USA

Abstract

Multiple imputation provides a useful strategy for dealing with data sets that contain missing values. Instead of filling in a single value for each missing value, Rubin's (1987) multiple imputation strategy replaces each missing value with two or more plausible values that represent the uncertainty about the right value to impute. Each of the resulting complete data sets is then analyzed using standard complete-data methods, and the analyses are combined to reflect both within-imputation and between-imputation variability. The talk will include a brief introduction to multiple imputation, a discussion of two primary imputation approaches (all-at-once vs. one-variable-at-a-time), and applications to the IMPACT study, a multi-center randomized controlled trial of a disease management program for late-life depression.

We compare two approaches to handling incomplete data in the IMPACT study. The first is based on hot-deck multiple imputation of missing responses, using a modified predictive mean matching method for item nonresponse (Bell, 1999) and the approximate Bayesian bootstrap for unit nonresponse (Lavori, Dawson and Shera, 1995). The second applies multiple imputation based on the multivariate normal model using SAS PROC MI. The two methods, as well as complete-case analysis, are compared in a simulation study. Overall, both hot-deck multiple imputation procedures performed well, with good coverage rates for Monte Carlo means and for tests of intervention effects. For dichotomous variables, multiple imputation under the multivariate normal model had lower coverage for three variables that were derived from a highly skewed variable. Complete-case analysis, on the other hand, had low coverage of Monte Carlo means for 47% of the variables, but good coverage of the intervention effects, because the biases in the intervention and control groups were in the same direction and cancelled out.
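
The combining step mentioned in the abstract is commonly carried out with Rubin's rules. The short Python sketch below is only an illustration, not code from the talk or the IMPACT analysis. The function name `pool_rubin` and the input numbers are hypothetical. It assumes each of the m imputed data sets has already produced a point estimate and its squared standard error.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m complete-data results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, within variance, between variance, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need two or more imputations and matching lengths")

    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, within, between, total


# Hypothetical results from m = 3 imputed data sets
q_bar, within, between, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, within, between, total)
```

The total variance adds the between-imputation component, inflated by the factor (1 + 1/m), to the average within-imputation variance, so intervals built from it reflect the uncertainty about the imputed values.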
