High-order matrix-free finite element implementations with hybrid parallelization and improved data locality

Applicants

Dr. Martin Kronbichler
Prof. Dr.-Ing. Wolfgang A. Wall
Institute for Computational Mechanics
Technical University of Munich

Project Overview

This proposal aims to improve the application performance of high-order finite element and discontinuous Galerkin solvers in fluid dynamics that are based on matrix-free operator evaluation with sum factorization. The first proposed ingredient is an MPI-3.0 shared-memory parallelization within nodes, which can be applied more effectively to a wide range of application kernels than the previously developed OpenMP hybrid parallelization. This approach can also reuse the existing MPI parallelization of the setup routines, which are important in more sophisticated algorithms such as moving meshes and adaptivity that we plan to target in the future. The second ingredient is the fusion of memory-bound vector updates with the compute-heavy matrix-free operator evaluation in Krylov subspace solvers with specific preconditioners. The third ingredient is a set of new preconditioners for high-order discontinuous Galerkin methods, based on an approximate inversion of the block matrices of unknowns on each element. The aim of these methods is to spend more time on locally cacheable data in exchange for fewer passes through the data. The proposed work will be done in collaboration with the CFDLab at LRZ.
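To illustrate the second ingredient, the following minimal C++ sketch contrasts a separate operator evaluation and dot product with a fused loop, as it could appear in the step of a conjugate gradient solver that needs both A*p and the inner product of p with A*p. It is only an illustration under simplifying assumptions, not the project's implementation: the 1D Laplacian stencil stands in for a sum-factorization kernel, and the names apply_row, apply_then_dot and apply_and_dot_fused are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix-free operator: row i of the 1D
// Laplacian stencil (-1, 2, -1) with zero boundary values, evaluated
// on the fly without storing a matrix.
inline double apply_row(const std::vector<double> &p, const std::size_t i)
{
  const double left  = (i > 0) ? p[i - 1] : 0.0;
  const double right = (i + 1 < p.size()) ? p[i + 1] : 0.0;
  return 2.0 * p[i] - left - right;
}

// Unfused variant: one pass computes Ap = A*p, a second pass re-reads
// both p and Ap from memory to form the dot product.
double apply_then_dot(const std::vector<double> &p, std::vector<double> &Ap)
{
  assert(p.size() == Ap.size());
  for (std::size_t i = 0; i < p.size(); ++i)
    Ap[i] = apply_row(p, i);

  double dot = 0.0;
  for (std::size_t i = 0; i < p.size(); ++i)
    dot += p[i] * Ap[i];
  return dot;
}

// Fused variant: the dot product is accumulated while the freshly
// computed entry is still in a register, so the vectors are traversed
// only once.
double apply_and_dot_fused(const std::vector<double> &p, std::vector<double> &Ap)
{
  assert(p.size() == Ap.size());
  double dot = 0.0;
  for (std::size_t i = 0; i < p.size(); ++i)
    {
      const double value = apply_row(p, i);
      Ap[i] = value;
      dot += p[i] * value;
    }
  return dot;
}
```

Both functions return the same value and fill Ap identically; the fused one simply saves a pass through the two vectors. In a real matrix-free code the fusion would presumably be done per cell batch rather than per entry, so that the vector update touches data while it is still resident in cache.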