
Seminars

A simple statistical model for deciphering the cdc15-synchronized yeast cell cycle gene expression data

  • 2000-07-17 (Mon.), 03:30 AM
  • Recreation Hall, 2F, Institute of Statistical Science
  • Prof. Ker-Chau Li
  • Dept. of Statistics, University of California, Los Angeles

Abstract

The advent of microarray technology for measuring gene expression levels at the full genome scale has generated an extremely rich data source that is publicly accessible through the Internet. This allows researchers from different disciplines to explore the wealth of information contained in the same data from various perspectives. In this talk, we will illustrate how simple statistical models can be employed to help decipher a data set available at http://cellcycle-www.stanford.edu . We focus on one experiment in which a strain of yeast (cdc15-2) was incubated at a high temperature (35 degrees C) for a long time, causing cdc15 arrest. Cells were then shifted back to a low temperature (23 degrees C), and gene expression was monitored every 10 min for 300 min. The starting point of our approach is purely statistical, relying on as little biological knowledge as possible. We treat the expression levels at different time points as a single time series curve for each gene, and the goal is to find a way of organizing and explaining the variety of patterns observed in more than 6000 such curves. Our model describes each curve as the superposition of three basis curves plus errors. The loading coefficients for each basis curve vary from gene to gene. The basis curves are selected via principal component analysis. The second and third basis curves are then combined to identify genes with clear cyclic patterns. With help from known genes, the phases G1, S, G2, M, and M/G1 can be distinguished easily on a compass plot constructed from the loading coefficients, which shows the time to peak expression. A comparison with the cycle-regulated genes reported in Spellman et al. (1998) is made. A rather unexpected finding is that the first basis curve oscillates regularly every 10 min from time 90 min to time 270 min. A large group of genes with this oscillation pattern is found. The implications of this finding are yet to be explored.
This work is conducted jointly with Ming Yan and Robert Yuan.
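
The model in the abstract (each gene's curve as a superposition of three basis curves plus errors, with the basis chosen by principal component analysis) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the speakers' actual procedure: the data are synthetic, the function names `fit_basis_curves` and `compass_angles` are hypothetical, the time grid and the 100-min period are arbitrary, and the basis is computed from an SVD of the uncentered genes-by-time matrix because the abstract does not say how the data were preprocessed.

```python
import numpy as np

def fit_basis_curves(X, n_basis=3):
    """Return (basis, loadings) for a genes-by-time matrix X.

    Rows of `basis` are orthonormal curves over time; `loadings` holds
    one coefficient per gene and basis curve.
    """
    # Right singular vectors of X are the principal directions in time.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:n_basis]
    loadings = X @ basis.T
    return basis, loadings

def compass_angles(loadings):
    """Angle of each gene in the plane of the 2nd and 3rd loadings."""
    return np.arctan2(loadings[:, 2], loadings[:, 1])

# Synthetic toy data (not the yeast data set): trend + cycle + noise.
rng = np.random.default_rng(0)
t = np.arange(0, 300, 10)            # hypothetical grid: every 10 min
n_genes = 200                        # toy size; the talk concerns ~6000 genes
period = 100.0                       # hypothetical cycle length in minutes
trend = np.linspace(-1.0, 1.0, t.size)
a = rng.normal(size=n_genes)                     # per-gene trend weight
r = rng.uniform(0.5, 1.5, size=n_genes)          # per-gene cycle amplitude
phi = rng.uniform(-np.pi, np.pi, size=n_genes)   # per-gene cycle phase
X = (a[:, None] * trend
     + r[:, None] * np.cos(2 * np.pi * t / period - phi[:, None])
     + 0.1 * rng.normal(size=(n_genes, t.size)))

basis, loadings = fit_basis_curves(X)
angles = compass_angles(loadings)

# Fraction of the data left unexplained by the three-curve model.
residual = X - loadings @ basis
rel_error = np.linalg.norm(residual) / np.linalg.norm(X)
```

In this sketch the angle is only defined up to a rotation or reflection of the plane, since the SVD fixes neither the sign nor the orientation of the basis curves, and nothing guarantees that the second and third components are the cyclic pair in other data. Mapping angles to cell cycle phases would need reference genes with known peak times, which is the role the abstract assigns to the known genes.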
