TIGP (BIO)—Investigating long-range cis-regulation mechanisms in Drosophila using deep learning
- 2025-03-27 (Thu.), 14:00
- Auditorium, B1F, Institute of Statistical Science. In-person seminar, no online stream available.
- Delivered in English
- Speaker bio: Please see the attachment below
- Prof. Tzu-Hsien Yang
- Department of Biomedical Engineering, National Cheng Kung University
Abstract
Cis-regulatory modules (CRMs), including enhancers, promoters, and insulators, can regulate metazoan gene transcription by coordinating transcription factor (TF) binding. CRMs can also interact with one another to control target gene expression. However, identifying the target genes of CRMs and the interactions between CRMs remains challenging due to the complexity of regulatory mechanisms and limitations in current methods. This study presents three key advancements: automated literature screening for CRM regulators and targets, CRM interaction prediction, and genomic CRM target identification.
First, we designed DMLS (Drosophila Modular transcription-regulation Literature Screener), an automated text-mining tool that extracts CRM-related information from the literature. DMLS prescreens articles describing experimental results on CRM transcription regulation, identifies the studied CRM-associated target genes and TFs, and operates via an extendable pipeline. Second, we developed ACIP (Attention-based CRM Interaction Predictor), a deep learning model for CRM interaction prediction. Existing CRM interaction prediction tools suffer from low resolution or data contamination issues. To overcome these obstacles, ACIP adopts a chromosome-based data partitioning strategy and a cross-attention network that captures epigenetic crosstalk between CRMs to identify CRM interactions. Finally, we constructed a deep-learning-based CRM target gene identification pipeline to predict CRM target genes beyond chromatin interactions. The pipeline integrates CRM interactions and the concept of CRM transcription for ncRNAs to boost the overall target gene identification performance. These results provide a systematic framework for research on CRM regulatory mechanisms, enhancing our ability to reconstruct transcriptional regulatory networks in metazoan species.
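
For readers unfamiliar with chromosome-based data partitioning, the idea can be illustrated with a minimal Python sketch. This is not ACIP's code: the record layout, the function name `split_by_chromosome`, and the choice of held-out chromosome arms are all assumptions made for illustration, and the abstract does not say which chromosomes ACIP holds out. The point is only that every CRM pair on a given chromosome arm goes to the same subset, so overlapping or neighboring regions cannot appear in both training and test data.

```python
def split_by_chromosome(pairs, test_chroms, val_chroms=()):
    """Assign each CRM pair to train/val/test by the chromosome arm it lies on.

    `pairs` is a list of dicts with a "chrom" key (hypothetical layout).
    """
    test_chroms, val_chroms = set(test_chroms), set(val_chroms)
    splits = {"train": [], "val": [], "test": []}
    for pair in pairs:
        if pair["chrom"] in test_chroms:
            splits["test"].append(pair)
        elif pair["chrom"] in val_chroms:
            splits["val"].append(pair)
        else:
            splits["train"].append(pair)
    return splits


# Toy, made-up CRM pairs (coordinates and labels are not real data).
pairs = [
    {"chrom": "2L", "crm_a": (1000, 1500), "crm_b": (9000, 9400), "label": 1},
    {"chrom": "2L", "crm_a": (1200, 1700), "crm_b": (9100, 9500), "label": 1},
    {"chrom": "2R", "crm_a": (4000, 4600), "crm_b": (7000, 7300), "label": 0},
    {"chrom": "3R", "crm_a": (2000, 2400), "crm_b": (8000, 8500), "label": 1},
    {"chrom": "X", "crm_a": (500, 900), "crm_b": (3000, 3300), "label": 0},
]

# Held-out arms chosen arbitrarily for the example.
splits = split_by_chromosome(pairs, test_chroms=["3R"], val_chroms=["X"])
for name, subset in splits.items():
    print(name, sorted({p["chrom"] for p in subset}), len(subset))
```

In this toy data the two overlapping pairs on 2L stay together in the training set, whereas a random per-pair split could separate them and leak near-duplicate examples into the test set.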