High-Performance Emulation for File-System-like I/O (HiPEF)

Applicant

Prof. Dr. Burkhard Rost
Lehrstuhl: I12 – Department for Bioinformatics and Computational Biology
Technische Universität München

Project Overview

Traditional high-performance computing (HPC) requires specifically designed communication interfaces for scientific simulation or machine-learning tools. Due to its massively parallel nature, HPC commonly relies on highly automated inter-process communication, which is often implemented through files. With hundreds of interwoven processes, this leads to an explosion of small files that puts strain on traditional file systems. To alleviate this strain, we developed a high-performance file system that stores many small files in a unified and scalable way in a background database while remaining transparent to the processes that use it.

To this end, we used FUSE, MongoDB, and Rust to create a user-space file system that works efficiently with small flat files and can optionally store additional metadata. We evaluated the file system with several read and write tests and by integrating it into our PredictProtein web service. For ease of use, the code is packaged as a Docker image and can be deployed with docker-compose.
During the project, we collaborated closely with the Leibniz Supercomputing Centre, which provided guidance and computational resources.