Optimizing irreversible Markov-chain codes for manybody simulation


Sebastian Kapfer
Theoretische Physik 1
FAU Erlangen-Nürnberg

Project Overview

The goal of the project is to study the performance characteristics of postlhc, a simulation program aimed at two-dimensional classical manybody systems and based upon the irreversible Markov-Chain Monte Carlo method. This application is particularly interesting and timely for two reasons. First, the problem of melting in two dimensions is a fundamental question in statistical mechanics that has remained unsolved for over 60 years. A breakthrough came only recently via numerical solutions, some by the author. Success in these simulations has relied on a new class of irreversible Monte Carlo algorithms that breaks with the detailed-balance dogma and achieves faster convergence than conventional, reversible algorithms. There is currently a large interest in irreversible Markov chains. A particularly interesting class are event-chain algorithms which can be derived systematically [15] (see appendix for further details). Experience with the efficient implementation and optimization of such algorithms is, however, very limited at this time.

The postlhc code is complete, tested and available online [10]. It consists of around 5000 lines of C++11, run in a single thread. The code implements the long-range version of the event-chain algorithm. The template mechanism of C++ is used to produce efficient code specialized for various dimensionality, and interactions between particles

A run of the code is parametrized by a state point (physical variables like temperature and density) and a seed for the pseudo-random number generator. The Markov chain is initialized with a starting configuration, for example with particles on a lattice. It is then evolved by a continuous avalanche dynamics that can be simulated in an efficient event-driven fashion:

  1. At any time, a single particle is active, and moves at constant velocity.
  2. For all other particles, a (possibly random) interaction time is determined. The earliest event takes precedence.
  3. The system is fast-forwarded to the event time; the offending particle becomes the new active particle.

In this way, the initial starting configuration is evolved into an asymptotically unbiased sample from the stationary distribution, usually the thermal (Boltzmann) distribution. Approach to stationarity can be slow, and large simulations (N ∼ 100000 particles) can take several months to converge.

The code spends the majority of its time in a tight loop which predicts event times. Scaling beyond a single core is achieved by means of naive parallelization, by assigning a different state point and/or random seed (“instance”) to each CPU core.