Seminars

Statistical Physics Approach to Information Categorization of Symbolic Sequences

  • 2004-07-09 (Fri.), 10:30 AM
  • Recreation Hall, 2F, Institute of Statistical Science
  • Professor Chung-Kang Peng
  • Harvard Medical School, USA

Abstract

We propose a systematic approach to categorizing the information carried by symbolic sequences, based on their usage of repetitive patterns. We introduce a simple formula that quantifies the "dis-similarity" between two symbolic sequences. This dis-similarity index is closely related to the Shannon entropy and the rank order of the repetitive patterns. Its physical meaning can be understood by applying fundamental statistical physics concepts to dynamical systems. Finally, to illustrate that this generic approach applies to a wide range of real-world problems, we use the algorithm to study literary texts, DNA sequences, and biological time series.
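The abstract does not state the formula itself, so the following Python sketch is only one plausible reading of "rank order plus Shannon entropy": it ranks the fixed-length patterns in each sequence by frequency, then averages the rank differences of shared patterns, weighting each pattern by its Shannon entropy contribution. The function names, the pattern length `m`, the tie-breaking rule, and the normalization are all assumptions made for illustration, not the speaker's definition.

```python
from collections import Counter
from math import log


def pattern_stats(seq, m):
    """Return (rank, probability) dicts for all length-m patterns in seq."""
    counts = Counter(seq[i:i + m] for i in range(len(seq) - m + 1))
    total = sum(counts.values())
    # Rank 1 is the most frequent pattern; ties are broken by pattern order
    # (an arbitrary choice made for this sketch).
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    ranks = {w: r for r, w in enumerate(ordered, start=1)}
    probs = {w: counts[w] / total for w in counts}
    return ranks, probs


def dissimilarity(seq_a, seq_b, m=2):
    """Entropy-weighted mean rank difference over patterns found in both sequences.

    Assumed form (hypothetical): sum_w |R_a(w) - R_b(w)| * F(w) / (L - 1),
    where F(w) is the normalized Shannon entropy contribution of pattern w
    and L is the number of shared patterns.
    """
    ranks_a, probs_a = pattern_stats(seq_a, m)
    ranks_b, probs_b = pattern_stats(seq_b, m)
    shared = set(ranks_a) & set(ranks_b)
    if len(shared) < 2:
        return 0.0  # not enough common patterns to compare rank orders
    weights = {
        w: -probs_a[w] * log(probs_a[w]) - probs_b[w] * log(probs_b[w])
        for w in shared
    }
    z = sum(weights.values())
    if z == 0:
        return 0.0
    total = sum(abs(ranks_a[w] - ranks_b[w]) * weights[w] for w in shared)
    return total / (z * (len(shared) - 1))
```

Under these assumptions, a sequence compared with itself scores 0, and two sequences whose shared patterns appear in different frequency order score above 0. For example, `dissimilarity(text_a, text_b, m=3)` would compare two strings by their three-character patterns.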
