Massively parallel reporter assays (MPRA) for the functional follow-up of non-coding common risk variants for nonsyndromic cleft lip and palate.
Summary of project results
Genomic technologies have enabled the identification of thousands of risk variants for both monogenic and multifactorial diseases. Despite this success, the translation of these statistical findings into biologically meaningful results has lagged behind, largely because of the vast number of common risk variants at associated loci and the limits of separating causal from benign variants by statistics alone. Nonsyndromic cleft lip with or without cleft palate (nsCL/P) is a multifactorial disorder for which 45 risk loci have been identified through genome-wide association studies (GWAS). These risk loci map to non-coding regions, suggesting regulatory effects on downstream target genes that are likely to occur in a tissue- and time-specific manner. In addition, because of the haplotype structure, most risk regions encompass several dozen to hundreds of potentially causal SNPs. To test as many nsCL/P risk variants as possible for potential regulatory effects, in vitro and in parallel, we proposed to use next-generation sequencing (NGS) technology to perform massively parallel reporter assays (MPRA) in cell types relevant to craniofacial development. We first established cultivation and transfection protocols for human embryonic palatal mesenchymal (HEPM) cells. In parallel, we established MPRA in the lab using an available IRF6 enhancer MPRA comprising oligos with single base-pair substitutions at each position within a 600 bp non-coding region. This regulatory element was previously identified as an enhancer for IRF6 in HaCaT cells, and both common and rare variants have been associated with nsCL/P. We performed the IRF6_MPRA in HaCaT and control HEK cells in three replicates and generated a map of alleles with significant regulatory activity.
When we integrated these data with common and rare risk alleles observed in nsCL/P patients, we observed significant effects for one of three common variants and an overrepresentation of “active” rare variants in patients compared with controls. With this project we have successfully demonstrated the feasibility of MPRA in the lab, and we are currently following up on these findings to confirm the role of the IRF6 enhancer in nsCL/P. We next analysed common nsCL/P risk variants and selected candidates based on lead variants from recent in-house GWAS data on nsCL/P (odds ratio > 1.2, linkage disequilibrium r² > 0.8). We also retrieved positive and negative control sequences from public databases and confirmed their suitability as controls in individual standard dual-luciferase assays in HEPM cells. During the MPRA design, we identified weaknesses in some of the online MPRA tools and developed our own MPRA design software in Python. Using this pipeline, we designed 230 bp oligos for 2,278 alleles at 1,053 risk variants from 31 loci, as well as 103 positive and 59 negative control sequences. For each sequence, at least 100 barcodes were included, resulting in a final MPRA size of 244,000 oligos. We are currently cloning this MPRA library into the plasmid backbone and will then transfect HEPM and HaCaT cells. This final part of the project has been delayed by the COVID-19 pandemic and by the extent of the additional projects that were performed. However, the expertise gained in the IRF6_MPRA experiment provides valuable input for the final analysis of the MPRA on common variants. The results of our project will provide novel insights into the causal architecture of some of the nsCL/P risk loci and will enable the generation of novel hypotheses for further in-depth functional follow-up.
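The library size quoted above follows directly from the sequence and barcode counts. The short Python sketch below reproduces that arithmetic and shows one simple way to draw unique random barcodes for a sequence. It is an illustration only and not the project's design software: the function names, the barcode length of 12 nt, and the random-sampling strategy are assumptions made for the example.

```python
import random

# Counts taken from the summary above.
N_ALLELES = 2278
N_POSITIVE_CONTROLS = 103
N_NEGATIVE_CONTROLS = 59
BARCODES_PER_SEQUENCE = 100  # stated minimum per sequence


def library_size(n_alleles, n_pos, n_neg, barcodes_per_seq):
    """Total number of oligos: every sequence is paired with each of its barcodes."""
    return (n_alleles + n_pos + n_neg) * barcodes_per_seq


def sample_barcodes(n, length=12, seed=0):
    """Draw n distinct random DNA barcodes.

    Hypothetical helper: the barcode length and the sampling scheme are
    assumptions for illustration, not values taken from the project.
    """
    rng = random.Random(seed)
    barcodes = set()
    while len(barcodes) < n:
        barcodes.add("".join(rng.choice("ACGT") for _ in range(length)))
    return sorted(barcodes)


total = library_size(
    N_ALLELES, N_POSITIVE_CONTROLS, N_NEGATIVE_CONTROLS, BARCODES_PER_SEQUENCE
)
print(total)  # 244000

example_barcodes = sample_barcodes(BARCODES_PER_SEQUENCE)
print(len(example_barcodes))  # 100
```

A real design tool would typically also filter barcodes, for example for homopolymer runs or restriction sites; those steps are omitted here because the summary does not describe them.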