Optimization of the ALF implementation of the auxiliary-field quantum Monte Carlo algorithm: porting to GPUs and symmetry considerations

Applicant

Prof. Dr. Fakher Assaad
Institut für Theoretische Physik
Universität Würzburg

Project Overview

Recently, and partially with the help of KONWIHR funding, we have developed the Algorithms for Lattice Fermions (ALF) package. It provides a very general implementation of the auxiliary-field quantum Monte Carlo (AFQMC) method. The ALF package is anchored in the domain of correlated electron systems. Starting from the basic laws of quantum mechanics that describe a system of interacting electrons, our aim is to understand the fascinating physics of emergent collective and critical phenomena. Emergence and criticality are properties of the thermodynamic limit, a limit that clearly cannot be reached in numerical simulations or in experimental setups. However, the inverse system size sets an energy scale above which emergent phenomena can be studied. To generate reliable results, we have to make sure that this energy scale is the smallest one in the problem. Hence, the impact of our results depends crucially on the system sizes we can achieve. Here we will follow two routes to optimize the ALF package so as to expand the range of achievable lattice sizes.

For Monte Carlo methods, a simple MPI parallelization in which each MPI process carries out an independent run appears to be the optimal route to the best speedup. However, this simple parallelization is of limited practical use on its own, because each Markov chain has to be equilibrated (thermalized) before the actual QMC measurements can begin. Thus, the present version of ALF employs a hybrid OpenMP-MPI approach, in which we aim to reduce the computational time of a single simulation using OpenMP parallelization. This strategy is based on the premise that future architectures will come with an increasing number of CPU cores and that parallelization of a single process will provide the desired speedup as the system size grows.
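
The thermalization argument can be made quantitative with a simple cost model. The sketch below assumes that every independent Markov chain must spend the same number of warm-up sweeps before measurements begin, and that the measurement sweeps are divided evenly among the chains. The function name and the sweep counts are illustrative and are not taken from ALF.

```python
def independent_chain_speedup(n_chains, warmup_sweeps, measurement_sweeps):
    """Speedup of n_chains independent Markov chains over a single chain.

    Assumption: each chain repeats the full warm-up, and only the
    measurement sweeps are shared evenly among the chains.
    """
    serial_time = warmup_sweeps + measurement_sweeps
    parallel_time = warmup_sweeps + measurement_sweeps / n_chains
    return serial_time / parallel_time


# Illustrative numbers only: 1000 warm-up sweeps, 9000 measurement sweeps.
for n_chains in (1, 10, 100, 1000):
    speedup = independent_chain_speedup(n_chains, 1000, 9000)
    print(f"{n_chains:5d} chains: speedup {speedup:6.2f}")
```

Under these assumptions the speedup saturates at (warm-up + measurement) / warm-up, which is 10 for the numbers above, no matter how many chains are added. Shortening the wall-clock time of a single chain, as the hybrid approach aims to do, reduces the warm-up cost itself.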

Accelerated computing, in which CPUs and GPUs are combined to achieve speedup, is a route followed by industry to provide increased computational resources. One of the central aims of this project is to test various strategies for porting ALF to GPUs, allowing the package to fully benefit from accelerated computing.

Our second aim is to achieve a speedup by implementing symmetry considerations in the ALF package. ALF Hamiltonians have a so-called flavor index in which the Hamiltonian is block diagonal and, in many cases of interest, a symmetry links the different flavor sectors. Using this symmetry to reduce the number of floating-point operations will greatly improve performance.
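
To illustrate why a flavor symmetry saves work, note that the determinant of a block-diagonal matrix factorizes into one determinant per flavor block. The sketch below assumes the simplest case, in which the symmetry makes all flavor blocks identical, so that a single determinant raised to the number of flavors reproduces the full result. It is written in plain Python for illustration only. The helper names and the 2x2 block are hypothetical and do not correspond to ALF routines or data.

```python
def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting, O(n^3)."""
    a = [row[:] for row in matrix]  # work on a copy, leave the input untouched
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def block_diagonal(blocks):
    """Assemble a dense block-diagonal matrix from a list of square blocks."""
    size = sum(len(b) for b in blocks)
    full = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                full[offset + i][offset + j] = value
        offset += len(b)
    return full


# Hypothetical flavor block; the two flavors are assumed to be identical.
block = [[2.0, 1.0], [1.0, 3.0]]
n_flavors = 2

det_full = determinant(block_diagonal([block] * n_flavors))  # all flavors
det_symm = determinant(block) ** n_flavors                   # one flavor only

print(det_full, det_symm)
```

Since a dense determinant costs of order n^3 operations for an n x n block, evaluating one flavor block instead of all of them reduces this part of the work by a factor equal to the number of flavors, under the stated assumption of identical blocks.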