Institute of Statistical Science, Academia Sinica

Seminars

A Cross-Validation Study for Reproducibility

2016-06-27 (Mon.), 10:30 AM
Recreation Hall, 2F, Institute of Statistical Science
Prof. Lo-Bin Chang
The Ohio State University, USA

Abstract

In recent years, “reproducibility” has emerged as a key factor in evaluating applications of statistics to the biomedical sciences, for example predictors of disease phenotypes learned from high-throughput “omics” data. Among other factors, validation of such predictors entails comparing the reported error rates, usually estimated by standard cross-validation, to the accuracy observed on additional data collected from new studies. Unfortunately, the originally published error rates are frequently lower than those observed on new data, and this discrepancy is then seen as a barrier to translational research. In this talk, I will provide a statistical formulation in the large-sample limit to study this inconsistency, based on the gap between the error rate in cross-study validation (CSV) and that in ordinary randomized cross-validation (RCV). The theoretical results cohere with the trends observed in practice: for any number m of studies, the cross-study error rate exceeds that of ordinary randomized cross-validation, the latter (averaged) increases with m, and both converge to the optimal rate.

Updated: 2025-10-31 19:32