Integrating HPC into an agile and cloud-focused environment at CERN
Published online: 17 September 2019
CERN’s batch and grid services are mainly focused on High Throughput Computing (HTC) for processing data from the Large Hadron Collider (LHC) and other experiments. However, part of the user community requires High Performance Computing (HPC) for massively parallel applications that run across many cores on MPI-enabled infrastructure. This contribution addresses the implementation of HPC infrastructure at CERN for Lattice QCD application development, as well as for different types of simulations for the accelerator and technology sector at CERN. Our approach has been to integrate the HPC facilities as far as possible with the HTC services in our data centre, and to take advantage of an agile infrastructure for updates, configuration and deployment. The HPC cluster has been orchestrated with the OpenStack Ironic component, and is hence managed with the same tools as the CERN internal OpenStack cloud. Experience with, and benchmarks of, MPI applications across InfiniBand with shared storage on CephFS are discussed, as well as the setup of the SLURM scheduler for HPC jobs with a provision for backfill of HTC workloads.
© The Authors, published by EDP Sciences, 2019
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.