Project Details
Computational and mathematical approaches for statistical sequence alignment and phylogenetic inference on emerging parallel architectures
Co-Applicant
Professor Dr. Arndt von Haeseler
Subject Area
Bioinformatics and Theoretical Biology
Term
from 2011 to 2016
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 200966394
Bioinformatics is currently facing two challenges. Firstly, significant advances in sequencing techniques (454, Solexa) are generating an unprecedented amount of molecular data. Hence, the bottleneck is no longer data acquisition but data analysis, especially in molecular evolution. Secondly, the field of parallel computing is facing the multi-core revolution on general-purpose CPUs and a plethora of novel accelerator technologies such as GPUs (Graphics Processing Units). Therefore, parallel computing is becoming feasible at the level of personal computers. Nevertheless, the volume of biological data stored in public databases (e.g., GenBank) is growing at a significantly higher rate than computational power. Thus, we need to substantially improve the models, data structures, and algorithms used for data analysis. Here, we propose to tackle these challenges for the two closely related and intertwined fields of Statistical Multiple Sequence Alignment (sMSA) and Phylogenetic Inference (PI) via an integrated approach. We will develop a highly optimized, portable, parallelized, and versatile library for sMSA and PI. We will also improve statistical models for sMSA and search heuristics for PI, and integrate them into a next-generation bioinformatics tool for evolutionary biology. The underlying idea is to develop models and methods in such a way that they scale on all modern multi-core, accelerator, and supercomputer architectures.
DFG Programme
Research Grants
International Connection
Austria, Vietnam
Participating Person
Dr. Le Sy Vinh