Research Staff


Li, Ker-Chau

Institute of Statistical Science Academia Sinica
Taipei 11529, Taiwan, R.O.C.
Tel: 886-2-2787-5696
FAX : 886-2-2788-6833

Research Interests
  • Bioinformatics, systems biology, lung cancer studies,
    high-dimensional data analysis, large ensembles of time series, medical image analysis, machine learning, statistical graphics, Bayesian computation, regression, censoring, experimental design, survey sampling.

    Li is best known for introducing sliced inverse regression (SIR) and principal Hessian direction (PHD), two fundamental dimension reduction methods for high-dimensional data analysis. Starting in 2000, his research interest turned to the emerging field of computation, mathematics, and statistics in genome biology. In 2002, he published a paper in the Proceedings of the National Academy of Sciences introducing liquid association (LA), a novel method for microarray gene expression analysis. He currently leads a research group at UCLA and at Academia Sinica that continues to develop methods for integrating multiple sources of gene expression profiles, genetic markers, and complex disease phenotypes and traits. A website offers online computation based on LA and related methods for gene expression studies. He is also collaborating with Dr. Pan-Chyr Yang of National Taiwan University and his colleagues on integrative cancer biology.

    Liquid association. (a) Association between genes X and Y as mediated by gene Z. When gene Z is expressed at a high level (red), a positive correlation between X and Y is observed. The association changes as the expression of Z is lowered, and it eventually becomes a negative trend (green). There are two basic ways (shown in panels b and c) to apply the liquid association (LA) scoring system to guide a genome-wide search. (b) When two genes X and Y are given, compute the LA score LA(X, Y|Z) for every gene Z first and then output a short list of high-scoring genes Z1, Z2, and so on. (c) When only one gene X is given, compute the LA score LA(X, Y|Z) for every pair of genes Y, Z first and then output a short list of high-scoring gene pairs (Y1, Z1), (Y2, Z2), and so on. Li et al. Genome Biology 2007 8:R205 doi:10.1186/gb-2007-8-10-r205

  • Liquid association. Correlation is a simple and powerful method for analyzing gene expression data. Two genes with positively correlated expression profiles are likely to be functionally associated; they may participate in the same biological process. However, functionally associated genes may not be correlated, for a variety of reasons. For instance, they may not be regulated at the transcription level. Another common situation is that most genes have multiple functions: depending on cellular needs, co-expressed genes may become uncorrelated or even contra-expressed. Liquid association (LA), as opposed to "steady" association, is designed to quantify the change in correlation between two genes as a relevant cellular state variable changes. There is no need to specify the state variable explicitly. A highlighted example is the elucidation of gene expression for the urea cycle in yeast. This pathway controls both the biosynthesis and the degradation of the amino acid arginine. LA is able to find the genes that have to be turned on, as well as the genes that have to be turned off at the same time, so that wasteful immediate degradation of newly synthesized arginine is avoided.
  • Sliced inverse regression (SIR). Many statistical methods are known to suffer from the "curse of dimensionality": they break down easily when dealing with high-dimensional data. How to reduce dimensionality is a long-standing issue. Ad hoc methods such as principal component analysis and partial least squares have been advocated. Yet the associated issue of possible "information loss" due to improper dimension reduction is rarely addressed. The SIR methodology helps reshape this area by presenting an effective dimension reduction framework for theorizing both issues.
  • Principal Hessian direction. Another effective dimension reduction method.
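
The search in panel (b) of the liquid association figure can be sketched in a few lines of Python. This is only an illustrative sketch, not the group's software: it assumes the LA score is estimated as the sample mean of the product X·Y·Z after X and Y are standardized and Z is replaced by its normal scores, and it assumes that "high score" means large in absolute value. The function names `standardize`, `normal_scores`, `la_score`, and `top_mediators` are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def standardize(v):
    """Center a profile to mean 0 and scale it to variance 1."""
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / v.std()


def normal_scores(v):
    """Replace each value by the standard normal quantile of its rank."""
    v = np.asarray(v, dtype=float)
    ranks = np.argsort(np.argsort(v)) + 1  # ranks 1..n, ties broken by position
    nd = NormalDist()
    return np.array([nd.inv_cdf(r / (len(v) + 1)) for r in ranks])


def la_score(x, y, z):
    """Assumed estimator of LA(X, Y | Z): the sample mean of X * Y * Z."""
    return float(np.mean(standardize(x) * standardize(y) * normal_scores(z)))


def top_mediators(x, y, candidates, k=5):
    """Panel (b): score every candidate gene Z and return the k largest |LA|.

    `candidates` maps a gene name to its expression profile.
    """
    scores = {name: la_score(x, y, z) for name, z in candidates.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]
```

Calling `top_mediators(x, y, {"geneA": profile_a, "geneB": profile_b})` returns (name, score) pairs ordered by the magnitude of the score. A positive score suggests that X and Y become more positively correlated as Z increases, and a negative score suggests the opposite.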


Education

1981 Ph.D., Statistics, University of California, Berkeley. (Advisor: Jack Kiefer)
1975 B.S., Mathematics, National Taiwan University

Professional Experience

2006-present Distinguished Research Fellow, Institute of Statistical Science, Academia Sinica
2006-2012 Director, Institute of Statistical Science, Academia Sinica
2000-2002 Graduate Vice Chair, Statistics Department, UCLA
1999-present Professor, Statistics Department, UCLA
1989-present Professor, Mathematics Department, UCLA
1984-1989 Associate Professor, Mathematics Department, UCLA
1981-1984 Assistant Professor, Statistics Department, Purdue University

Professional Activities

1999-2002 Co-Editor of Statistica Sinica
1989-1994 Associate editor of Annals of Statistics
1991-1999 Associate editor of Statistica Sinica
1993-2001 Associate editor of Computational Statistics

Honors and Awards


2014 Elected Member, The World Academy of Sciences
2012 Academician, Academia Sinica
2003 Medallion Lecturer, IMS
1993 Guggenheim Fellow
1991 NSF/ASA/NIST Fellow
1990 JASA Theory and Methods Editor's Invited Speaker, Joint Statistical Meetings
1989 IMS Fellow
1981 Elected Member, Phi Beta Kappa
1981 B. Friedman Memorial Prize in Applied Mathematics, U.C. Berkeley

Publications


[1] Sung-Liang Yu, Hsuan-Yu Chen, Gee-Chen Chang, Chih-Yi Chen, Huei-Wen Chen, Sher Singh, Chiou-Ling Cheng, Chong-Jen Yu, Yung-Chie Lee, Han-Shiang Chen, Te-Jen Su, Ching-Cheng Chiang, Han-Ni Li, Qi-Sheng Hong, Hsin-Yuan Su, Chun-Chieh Chen, Wan-Jiun Chen, Chun-Chi Liu, Wing-Kai Chan, Wei J. Chen, Ker-Chau Li, Jeremy J.W. Chen, and Pan-Chyr Yang (2008). MicroRNA Signature Predicts Survival and Relapse in Lung Cancer. Cancer Cell 13, 48-57.

[2] Li, K.C., Palotie, A., Yuan, S., Bronnikov, D., Chen, D., Wei, X., Choi, O., Saarela, J., and Peltonen, L. (2007). Finding disease candidate genes by liquid association. Genome Biology 8, R205. doi:10.1186/gb-2007-8-10-r205.

[3] Yuan, S., and Li, K.C. (2007). Context-dependent Clustering for Dynamic Cellular State Modeling of Microarray Gene Expression. Bioinformatics 23(22), 3039-3047.

[4] Wei Sun, Tianwei Yu, and Ker-Chau Li (2007). Detection of eQTL modules mediated by activity levels of transcription factors. Bioinformatics 23(17), 2290-2297.

[5] Chun-Chi Liu, Chin-Chung Lin, Ker-Chau Li, Wen-Shyen E. Chen, Jiun-Ching Chen, Ming-Te Yang, Pan-Chyr Yang, Pei-Chun Chang, and Jeremy J.W. Chen. (2007) Genome-wide identification of the specific oligonucleotides using artificial neural network and computational genomic analysis. BMC Bioinformatics. 8:164.

[6] Tianwei Yu, Hui Ye, Wei Sun, Ker-Chau Li, Zugen Chen, Sharoni Jacobs, Dione K Bailey, David T Wong and Xiaofeng Zhou (2007). A forward-backward fragment assembling algorithm for the identification of genomic amplification and deletion breakpoints using high-density single nucleotide polymorphism (SNP) array. BMC Bioinformatics, 8:145.

[7] Yu, T., and Li, K.C. (2005). Inference of transcriptional regulatory network by two-stage constrained space factor analysis. Bioinformatics 21, 4033-4038.

[8] Yu, T., Sun, W., Yuan, S., and Li, K.C. (2005). Study of coordinative gene expression at the biological process level. Bioinformatics 21, 3651-3657.

[9] Li, K.C., Ching-Ti Liu, Wei Sun, Shinsheng Yuan, and Tianwei Yu (2004). A system for enhancing genome-wide co-expression dynamics study. Proceedings of the National Academy of Sciences 101, 15561-15566.

[10] Xie, J., Li, K.C., and Bina, M. (2004). A Bayesian Insertion/Deletion Algorithm for Distant Protein Motif Searching via Entropy Filtering. Journal of the American Statistical Association 99, 409-420.

[11] Li, K.C., and Yuan, S. (2004) A functional genomic study on NCI's anticancer drug screen. The Pharmacogenomics Journal, 4, 127-135.

[12] Li, K.C., Aragon, Y., Shedden, K., and Thomas-Agnan, C. (2003). Dimension reduction for multivariate response data. Journal of the American Statistical Association 98, 99-106.

[13] Li, K.C. (2002). Genome-wide co-expression dynamics: theory and application. Proceedings of the National Academy of Sciences 99, 16875-16880.

[14] Li, K.C., Yan, M. and Yuan, S. (2002) A simple statistical model for depicting the cdc-15 synchronized yeast cell cycle-regulated gene expression data. Statistica Sinica, 12, 141-158.

[15] Li, K.C., and Shedden, K. (2002). Identification of shared common components in large ensembles of time series using dimension reduction. Journal of the American Statistical Association 97, 759-765.

[16] Li, K.C., and Shedden, K. (2001). Monte Carlo deconvolution of digital signals guided by the inverse filter. Journal of the American Statistical Association 96, 1014-1021.

[17] Li, K.C., Lue, H.H., and Chen, C.H. (2000). Interactive tree-structured regression via principal Hessian directions. Journal of the American Statistical Association 95, 547-560.

[18] Li, K.C., Wang, J.L., and Chen, C.H. (1999). Dimension reduction for censored regression data. Annals of Statistics 27, 1-23.

[19] Li, K.C. (1992). On principal Hessian directions for data visualization and dimension reduction: another application of Stein's lemma. Journal of the American Statistical Association 87, 1025-1039.

[20] Li, K.C. (1991). Sliced inverse regression for dimension reduction (with discussion). Journal of the American Statistical Association 86, 316-342.