A Scalable Implementation of Complex Substitution Models for Phylogenetic Inference from Large-Scale Genomic Data

Applicant

Prof. Dr. Gert Wörheide
Lehrstuhl für Paläontologie & Geobiologie
LMU München
Richard-Wagner-Str. 10
80999 München

Project Overview

Software for reconstructing the phylogenetic relationships of living organisms, a prerequisite for understanding the evolution and diversification of life on our planet, relies on DNA or amino acid sequence data and on models of the substitution process, which describe how nucleotides or amino acids change over evolutionary time. The true process of change is complex, and more parameter-rich models that reflect this complexity fit real data better. Better-fitting models yield more accurate trees and are essential, in particular, for resolving the most controversial parts of phylogenetic trees. However, the most accurate and complex models are highly CPU-intensive, and their current implementations are inefficient and parallelise poorly, creating a major bottleneck in phylogenetics. We propose to reimplement these complex models in the new software RevBayes and to adapt this software to the LRZ supercomputing infrastructure, with the aim of making the accurate reconstruction of phylogenetic trees significantly more efficient.