
Seminars

A weighted estimated likelihood approach to incomplete covariate data

  • 2000-02-14 (Mon.), 10:30 AM
  • Recreation Hall, 2F, Institute of Statistical Science
  • Prof. Yi-Hau Chen
  • Department of Mathematics, National Taiwan Normal University

Abstract

Suppose that, in the whole study sample, part of the covariate data are missing or measured with error for some study subjects, either by happenstance or by study design. These subjects form the "nonvalidation subsample", while the remaining subjects, whose data are completely observed, form the "validation subsample". Further assume that the completely observed covariates Z are categorical and that the sampling of the validation subsample depends only on Z and a stratified version S of the outcome variable Y. Under this "missing at random" (MAR) assumption, this work proposes a weighted estimated likelihood approach to estimating the regression parameters of general regression models with such data. Without requiring a correct parametric model relating the incompletely observed covariates X to Z, the proposed approach uses a weighted estimate of the likelihood function based on the nonvalidation data, with the inverse of the sampling probabilities given X and Z as the weights. Unlike existing methodologies under the same setting, the proposed method does not require validation members in each category defined by (S, Z); hence it can be applied to more general two-stage sampling designs, e.g., the restrictive sampling design, as well as to more complex missing data mechanisms, e.g., truncated data. Large-sample theory is derived for the proposed estimator, and its small-sample performance and efficiency are investigated in simulation studies. A real data set from an epidemiological study on leprosy is re-analyzed to illustrate the applicability of the proposed method to various study designs.
