Postdoc Seminars

Diversity Estimation in the Presence of Spurious Singletons or Super-Doubletons

  • 2016-08-10 (Wed.), 11:00 AM
  • Recreation Hall, 2F, Institute of Statistical Science
  • A reception will be held at 10:40 in the lounge on the second floor of the Institute of Statistical Science Building
  • Dr. Chun-Huo Chiu
  • Institute of Statistics, National Tsing Hua University

Abstract

In the last decade, advances in high-throughput DNA sequencing have opened a novel way to assess hyper-diverse microbial communities. However, estimating and comparing microbial diversity is statistically challenging because sequencing errors in low-frequency counts produce spurious singletons.

In traditional field surveys, limited time and effort often mean that only species presence/absence records in multiple sampling units are available, and many data sets consist only of singletons and super-doubletons (species observed in more than one sampling unit). However, most previous nonparametric estimators of the number of undetected species are based on singletons (species detected in only one sampling unit) and doubletons (species detected in exactly two sampling units).

Using the Good-Turing frequency formula, we present for the first time a nonparametric estimator of the true singleton or doubleton count to estimate unseen richness. For microbial data, based on the estimated singleton count and the original non-singleton frequency counts, we propose two statistical approaches to make fair comparisons of microbial diversity across multiple communities: (1) a non-asymptotic approach based on standardizing sample size or sample completeness, and (2) an asymptotic approach that depicts the estimated asymptotic diversity.
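To make the role of singletons and doubletons concrete, the following minimal Python sketch computes two classical quantities of the kind the abstract refers to: the Chao1 lower bound for species richness and the Good-Turing estimate of sample coverage. It is not the speaker's new estimator, which the abstract does not specify. The function names and the toy abundance vectors are assumptions made purely for illustration.

```python
from collections import Counter


def frequency_counts(abundances):
    """Return a map k -> number of species observed exactly k times (k >= 1)."""
    return Counter(x for x in abundances if x > 0)


def chao1(abundances):
    """Classical Chao1 lower bound for richness (not the talk's estimator).

    Uses the singleton count f1 and doubleton count f2; falls back to the
    bias-corrected form when no doubletons are observed.
    """
    f = frequency_counts(abundances)
    s_obs = sum(f.values())                   # observed species
    n = sum(k * c for k, c in f.items())      # total individuals (or reads)
    if n == 0:
        return 0.0
    f1, f2 = f.get(1, 0), f.get(2, 0)
    scale = (n - 1) / n
    if f2 > 0:
        unseen = scale * f1 * f1 / (2 * f2)
    else:
        unseen = scale * f1 * (f1 - 1) / 2
    return s_obs + unseen


def good_turing_coverage(abundances):
    """Good-Turing sample coverage estimate: 1 - f1 / n."""
    f = frequency_counts(abundances)
    n = sum(k * c for k, c in f.items())
    if n == 0:
        return 0.0
    return 1 - f.get(1, 0) / n


# Hypothetical toy data: per-species counts in one sample.
clean = [1, 1, 1, 1, 2, 2, 5, 10]
# The same sample with four extra singletons, mimicking sequencing errors.
inflated = clean + [1, 1, 1, 1]

print(chao1(clean), good_turing_coverage(clean))
print(chao1(inflated), good_turing_coverage(inflated))
```

In this toy example the four spurious singletons raise the Chao1 estimate and lower the estimated coverage, which illustrates why an inflated singleton count distorts richness estimates that depend on it.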
