Weighted Correlation Network Analysis: Biological Insights From a Million Correlations
- 2015-11-23 (Mon.), 10:30 AM
- Recreation Hall, 2F, Institute of Statistical Science
- Prof. Peter Langfelder
- Univ. of California, Los Angeles, USA
Abstract
The ever-increasing amount of high-throughput biological data holds great promise for extending our understanding of biology and tackling human disease. Network methods are well suited to these tasks because they are both powerful and intuitive, and they are widely used to analyze gene expression, methylation, proteomic, fMRI and other types of data. In this talk I will introduce Weighted Correlation Network Analysis (WGCNA) methods. WGCNA constructs a weighted network of all variables (for example, gene expression profiles), finds co-expression modules that often group together genes with common functional annotation, identifies modules that are associated with clinical traits of interest, and selects the key genes in each module for further research or validation.
I will next briefly describe advanced methods for studying differences and commonalities in the module organization of networks. These methods answer questions such as: 1) Are gene modules found in one data set (organism, condition, tissue) also present in another, independent data set? 2) Which modules are universally present ("consensus") across several (two or more) independent input data sets? 3) Are module associations with disease status, and with each other, preserved across the input data sets, or do they differ?
I will illustrate WGCNA with applications to Huntington's Disease (HD), a progressive neurodegenerative disease caused by a mutation in the Huntingtin (HTT) gene. Although the genetic cause of Huntington's Disease is known, the mechanism of its action is not fully understood. Applying WGCNA to data from patients and HD model mice sheds light on possible biological pathways and mechanisms involved in disease pathogenesis. Module preservation analysis helps in understanding which aspects of the disease are well replicated in model organisms, while consensus modules provide insights into brain region-specific aspects of the disease.
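The four WGCNA steps named in the abstract (weighted network, modules, trait association, key genes) can be illustrated with a minimal, standard-library Python sketch. This is not the speaker's software or its API. Every function name, the soft-threshold power `beta=6`, the edge `cutoff=0.1`, and the synthetic data are assumptions made for illustration. Two simplifications are swapped in and should be read as stand-ins: modules are found as connected components of strong edges rather than by clustering, and each module is summarized by the mean of z-scored profiles rather than by an eigengene.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(profiles, beta=6):
    """Unsigned weighted network: a_ij = |cor(x_i, x_j)| ** beta, zero diagonal."""
    g = len(profiles)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            adj[i][j] = adj[j][i] = abs(pearson(profiles[i], profiles[j])) ** beta
    return adj


def find_modules(adj, cutoff=0.1):
    """Stand-in for clustering: connected components of edges with weight >= cutoff."""
    unseen, modules = set(range(len(adj))), []
    while unseen:
        stack, members = [min(unseen)], []
        unseen.discard(stack[0])
        while stack:
            i = stack.pop()
            members.append(i)
            for j in sorted(unseen):
                if adj[i][j] >= cutoff:
                    unseen.discard(j)
                    stack.append(j)
        modules.append(sorted(members))
    return modules


def module_summary(profiles, members):
    """Stand-in for an eigengene: per-sample mean of z-scored member profiles."""
    zs = []
    for i in members:
        x = profiles[i]
        m = sum(x) / len(x)
        sd = math.sqrt(sum((v - m) ** 2 for v in x) / len(x))
        zs.append([(v - m) / sd for v in x])
    return [sum(col) / len(col) for col in zip(*zs)]


def hub_gene(adj, members):
    """Gene with the highest intramodular connectivity (sum of within-module weights)."""
    return max(members, key=lambda i: sum(adj[i][j] for j in members))


# Synthetic data (an assumption): genes 0-2 follow one signal, genes 3-5 another,
# and the hypothetical clinical trait tracks the first signal.
rng = random.Random(0)
n = 40
sig_a = [math.sin(2 * math.pi * t / n) for t in range(n)]
sig_b = [math.cos(2 * math.pi * t / n) for t in range(n)]
profiles = [[s + rng.gauss(0, 0.2) for s in sig] for sig in (sig_a,) * 3 + (sig_b,) * 3]
trait = [s + rng.gauss(0, 0.2) for s in sig_a]

adj = adjacency(profiles)                      # step 1: weighted network
modules = find_modules(adj)                    # step 2: co-expression modules
trait_cor = [pearson(module_summary(profiles, m), trait) for m in modules]  # step 3
hubs = [hub_gene(adj, m) for m in modules]     # step 4: candidate key genes
print(modules, [round(r, 2) for r in trait_cor], hubs)
```

Raising the absolute correlation to a power keeps every gene pair in the network while shrinking weak correlations toward zero, which is what makes the network weighted rather than a hard-thresholded graph. Averaging z-scored profiles only behaves sensibly here because all genes in a toy module are positively correlated; in an unsigned network with anti-correlated members the mean could cancel out.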