
Postdoc Seminars

Bayesian Nonparametric Clustering and Association Studies for Large-scale SNP Observations

  • 2016-01-27 (Wed.), 11:00 AM
  • Recreation Hall, 2F, Institute of Statistical Science
  • The reception will be held at 10:40 at the lounge on the second floor of the Institute of Statistical Science Building
  • Dr. Charlotte Wang
  • Institute of Statistical Science, Academia Sinica

Abstract

When analyzing a very large number of SNPs without prior biological information to define SNP sets, cluster analysis is often the first step, and a genetic association analysis is then performed on the resulting clusters. Even when clustering has been carried out in advance, the impact of its uncertainty on the subsequent association analysis is rarely assessed. In this talk, I will introduce our proposed Bayesian nonparametric clustering method, which uses a Dirichlet process mixture model. The method has the advantage that the number of clusters does not need to be known or fixed before clustering begins, and the uncertainty in the number of clusters is also accounted for. In addition, with an individualized genetic score designed for each SNP cluster and every subject, the subsequent regression model for association analysis can incorporate the information from large-scale SNP data while using a much smaller number of explanatory variables. Inference on the cluster allocation, the strength of association of each SNP cluster, and the susceptibility of each SNP is based on posterior samples from Markov chain Monte Carlo methods and on the Binder loss information. We illustrate this Bayesian nonparametric strategy with a genome-wide association study of Crohn’s disease from the WTCCC database.
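As a rough illustration of how a single cluster allocation can be summarized from MCMC output using Binder loss, the sketch below picks, among the sampled partitions, the one with the smallest expected Binder loss under equal misclassification costs. This is a generic textbook construction and not necessarily the speaker's procedure; the function names and the toy allocation draws are hypothetical.

```python
import numpy as np

def coclustering_matrix(samples):
    """Fraction of MCMC draws in which items i and j share a cluster."""
    samples = np.asarray(samples)
    # same[s, i, j] is True when draw s puts items i and j in one cluster
    same = samples[:, :, None] == samples[:, None, :]
    return same.mean(axis=0)

def binder_point_estimate(samples):
    """Sampled partition minimizing expected Binder loss (equal costs assumed)."""
    samples = np.asarray(samples)
    psm = coclustering_matrix(samples)
    same = (samples[:, :, None] == samples[:, None, :]).astype(float)
    # With equal costs, the expected loss of a candidate partition is
    # proportional to the sum over pairs of |1(c_i = c_j) - psm_ij|.
    losses = np.abs(same - psm).sum(axis=(1, 2))
    best = int(np.argmin(losses))
    return samples[best], psm

# Toy input (hypothetical): 4 allocation draws over 5 SNPs.
# Cluster labels are arbitrary, so the last draw is the same
# partition as the first with the labels swapped.
draws = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0],
])
estimate, psm = binder_point_estimate(draws)
print(estimate)
print(psm)
```

Working with the pairwise co-clustering matrix rather than the raw labels sidesteps label switching, since only "same cluster or not" enters the loss.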
