
Seminars

Alignment-free Genome Comparison with Trinucleotide Usage Profile (TUP)

  • 2016-12-19 (Mon.), 10:30 AM
  • Recreation Hall, 2F, Institute of Statistical Science
  • Prof. Lih-Yuan Deng
  • Dept. of Mathematical Sciences, The Univ. of Memphis, USA

Abstract

We propose a new method, called Trinucleotide Usage Profile (TUP), for building a genome-wide phylogenetic tree for a large group of species. Each species contains a large number of genes, and each gene has a long nucleotide sequence. The key issue is to find a macroscopic statistic that represents and characterizes the whole-genome DNA information. The most popular method, called feature frequency profile (FFP), computes the frequency distribution of all words of a certain length over the whole genome sequence, using an overlapping sliding window of the same size. Unfortunately, to characterize genome-wide information, the word length often has to be much larger than 3 (the codon length). We propose an essential modification of the FFP method that keeps the typical word length at 3. The main idea is to summarize the sequence in a matrix with three rows, one for each of the three reading frames, where each row is the distribution of the non-overlapping words of length 3 in the corresponding reading frame. An empirical study showed that the proposed TUP method can build phylogenetic trees with strong biological support.
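The three-row summary described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the speaker's implementation: the names `tup_matrix` and `tup_distance` are hypothetical, words containing characters other than A, C, G, T are simply skipped, and the Euclidean distance is an assumed choice, since the abstract does not say which metric is used to build the trees.

```python
from itertools import product

BASES = "ACGT"
# All 64 trinucleotides in a fixed order, mapped to column indices.
TRINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=3)]
INDEX = {word: i for i, word in enumerate(TRINUCLEOTIDES)}


def tup_matrix(seq):
    """Return a 3 x 64 list of lists: one relative-frequency row per reading frame."""
    seq = seq.upper()
    rows = []
    for frame in range(3):
        counts = [0] * 64
        # Step by 3 so that words within one reading frame do not overlap.
        for start in range(frame, len(seq) - 2, 3):
            col = INDEX.get(seq[start:start + 3])
            if col is not None:  # assumption: skip words with ambiguous bases such as N
                counts[col] += 1
        total = sum(counts)
        # Normalize counts to a distribution; an empty frame stays all zeros.
        rows.append([c / total if total else 0.0 for c in counts])
    return rows


def tup_distance(a, b):
    """Euclidean distance between two profiles (assumed metric, not from the abstract)."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) ** 0.5
```

Pairwise distances computed this way for a set of genomes could then be passed to any standard distance-based tree-building method. Unlike FFP, the profile size stays fixed at 3 x 64 entries regardless of genome length.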
