Machine Learning Bias Correction for Minimal-error Classifier and a Meta-analysis Framework for Sparse K-means in Genomic Applications
- 2013-06-10 (Mon.), 15:00
- Lounge, 2F, Institute of Statistical Science, Academia Sinica
- Tea reception: 14:40, 2F Lounge, Institute of Statistical Science
- Prof. George C. Tseng (曾建城)
- Dept. of Biostatistics, Univ. of Pittsburgh, USA
Abstract
In this talk, I will cover two topics we recently developed for genomic applications. In the first part, we investigated the machine learning bias that arises when one applies many classifiers and reports only the best-performing one. We studied how this bias depends on the sample size and on the classifiers used. We proposed an inverse power law method to correct the bias and compared it to conventional nested cross-validation. Finally, we used large-scale empirical gene expression data to derive a practical guideline for practitioners. In the second part, we extended the sparse K-means algorithm to a meta-analysis framework that combines multiple gene expression profiles for improved disease subtype discovery. The results showed more stable and accurate sample clustering for identifying meaningful disease subtypes.
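The bias described in the first part can be illustrated with a small simulation. The sketch below is not the speaker's method and does not implement the inverse power law correction. It only shows, under simplifying assumptions, why reporting the minimum error among several classifiers is optimistic. The function name `min_error_bias` and all of its parameters are hypothetical, and the classifiers are assumed to be independent with the same true error rate, which real classifiers trained on the same data are not.

```python
import random


def min_error_bias(n_samples, n_classifiers, n_trials=2000, true_error=0.5, seed=0):
    """Average gap between the true error and the best observed error.

    Assumption: every classifier has the same true error rate and its
    mistakes are independent of the other classifiers' mistakes.
    """
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(n_trials):
        observed = []
        for _ in range(n_classifiers):
            # Observed error rate of one classifier on n_samples test cases.
            mistakes = sum(rng.random() < true_error for _ in range(n_samples))
            observed.append(mistakes / n_samples)
        # Reporting only the best classifier understates the true error.
        total_gap += true_error - min(observed)
    return total_gap / n_trials


if __name__ == "__main__":
    for n in (20, 40, 80):
        print(n, round(min_error_bias(n, n_classifiers=10), 3))
```

In this toy setting the gap is close to zero for a single classifier, grows when more classifiers are compared, and shrinks as the sample size increases, which is consistent with the dependence on sample size and classifier set mentioned in the abstract.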