TIGP (BIO)—Leveraging Distant Relatedness in Biobanks to Identify Undiagnosed Mendelian Disease Cases
- 2024-09-26 (Thu.), 14:00
- Auditorium, B1F, Institute of Statistical Science. In-person seminar, no online stream available.
- Delivered in English | Speaker bio: please see the attachment below
- Dr. Hung-Hsin Chen
- Institute of Biomedical Sciences, Academia Sinica
Abstract
Most biobanks recruit participants from local populations, resulting in substantial undocumented (cryptic) relatedness among participants. While cryptic relatedness can introduce bias in traditional association studies, shared genomic regions create opportunities for identifying undiagnosed or misdiagnosed carriers of pathogenic variants. Here, we demonstrate a powerful approach to identify patients at risk of long QT syndrome through identity-by-descent (IBD)-based genotype inference. Long QT syndrome is a potentially lethal arrhythmia, but 39% of patients experience a diagnostic delay after their first symptom, and misdiagnosis is common. BioVU comprises DNA samples from 245,000 individuals and their linked electronic health records (EHR), including 69,817 genotyped individuals of European ancestry. We used twelve clinical long QT syndrome patients from four cryptically related families as probands, all confirmed carriers of a causal KCNE1 mutation (D76N, rs74315445). We then identified BioVU subjects who share an IBD segment (detected with hap-IBD after SHAPEIT4 phasing) with any proband across KCNE1. Fourteen BioVU Europeans share the same IBD segment (≥3 cM) with seven probands, and thirteen share a segment with two other probands. Only 23 of the 27 identified subjects were genotyped as mutation carriers on the array, suggesting potentially erroneous genotyping, and 68.5% of all mutation carriers in BioVU were captured by our IBD-based approach. Confirmation sequencing is ongoing. EHR analysis revealed that only one of the identified mutation carriers has been diagnosed with long QT syndrome. However, among the sixteen subjects with an electrocardiogram available in the EHR, three have prolonged QTc intervals and another five have pathologically long QTc intervals (>490 ms). These findings illustrate the utility of our shared-segment approach for identifying undiagnosed or misdiagnosed patients carrying pathogenic causal variants in a large biobank.
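The segment-sharing step described in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's pipeline. It assumes the eight-column, tab-delimited segment output that hap-IBD writes (sample 1, haplotype 1, sample 2, haplotype 2, chromosome, start bp, end bp, length in cM), and it treats "IBD across KCNE1" as any overlap with a gene interval supplied by the caller. The function name `carriers_by_ibd`, the sample IDs, and the coordinates in the example are hypothetical placeholders, not real data or the actual KCNE1 locus.

```python
import csv
import io


def carriers_by_ibd(ibd_lines, probands, chrom, gene_start, gene_end, min_cm=3.0):
    """Map each non-proband sample to the probands it shares a qualifying segment with.

    ibd_lines: iterable of tab-delimited lines in the assumed hap-IBD column order.
    probands:  set of sample IDs known to carry the mutation.
    """
    hits = {}
    for row in csv.reader(ibd_lines, delimiter="\t"):
        if len(row) < 8:
            continue  # skip blank or malformed lines
        id1, _hap1, id2, _hap2, seg_chrom, start, end, cm = row[:8]
        # Keep only segments on the target chromosome that meet the length threshold.
        if seg_chrom != chrom or float(cm) < min_cm:
            continue
        # Assumption: "across the gene" means the segment overlaps the gene interval.
        if int(end) < gene_start or int(start) > gene_end:
            continue
        # Exactly one member of the pair must be a proband.
        if (id1 in probands) == (id2 in probands):
            continue
        proband, subject = (id1, id2) if id1 in probands else (id2, id1)
        hits.setdefault(subject, set()).add(proband)
    return hits


# Toy input with made-up IDs and coordinates (placeholders only).
example = "\n".join([
    "P1\t1\tB7\t2\t21\t1000\t9000\t4.2",  # overlaps the interval, long enough
    "B8\t1\tP2\t1\t21\t1000\t9000\t2.1",  # shorter than 3 cM
    "P1\t2\tB9\t1\t21\t100\t900\t5.0",    # does not overlap the interval
    "P1\t1\tP2\t1\t21\t1000\t9000\t6.0",  # both samples are probands
])
hits = carriers_by_ibd(io.StringIO(example), {"P1", "P2"}, "21", 5000, 6000)
print(hits)
```

For real hap-IBD output, which is gzip-compressed, the same function could be fed a handle from `gzip.open(path, "rt")`. Chromosome labels ("21" versus "chr21") and the genome build of the gene coordinates would need to match the input data.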