Harnessing CART for Causal Modeling
- 2014-01-20 (Mon.), 14:00
- Recreation Hall, 2F, Institute of Statistical Science
- Prof. Galit Shmueli
- Statistics and Information Systems, Indian School of Business
Abstract
Classification and regression trees, and data mining algorithms in general, are used mostly for predicting individual observations and for variable selection. Yet these goals are rarely the focus in social science research, where the main objective is causal explanation of overall population effects. While ideal causal modeling is based on randomized experiments, experiments are often impossible, unethical, or expensive to perform. Hence, social scientists often rely on observational data for studying causality. Very large and rich observational datasets are now available in many fields, and a major challenge is to infer causality from such data. In this talk, I'll introduce a novel approach that uses classification and regression trees for causal modeling. I will present a tree-based approach for impact assessment in self-selected intervention studies, and another for detecting confounding variables that lead to Simpson's Paradox.