Managing a heterogeneous scientific computing cluster with cloud-like tools: ideas and experience
Department of Computer Science, Università di Torino,
2 Istituto Nazionale di Fisica Nucleare, Torino, Italy
3 Department of Electronics and Telecommunications, Politecnico di Torino, Italy
4 Scientific Computing Competence Centre (C3S), Università di Torino, Italy
* Corresponding author: email@example.com
Published online: 17 September 2019
Obtaining CPU cycles on an HPC cluster is nowadays relatively simple and sometimes even cheap for academic institutions. However, in most of the cases providers of HPC services would not allow changes on the configuration, implementation of special features or a lower-level control on the computing infrastructure, for example for testing experimental configurations. The variety of use cases proposed by several departments of the University of Torino, including ones from solid-state chemistry, computational biology, genomics and many others, called for different and sometimes conflicting configurations; furthermore, several R&D activities in the field of scientific computing, with topics ranging from GPU acceleration to Cloud Computing technologies, needed a platform to be carried out on. The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a multi-purpose flexible HPC cluster designed and operated by a collaboration between the University of Torino and the Torino branch of the Istituto Nazionale di Fisica Nucleare. It is aimed at providing a flexible and reconfigurable infrastructure to cater to a wide range of different scientific computing needs, as well as a platform for R&D activities on computational technologies themselves. We describe some of the use cases that prompted the design and construction of the system, its architecture and a first characterisation of its performance by some synthetic benchmark tools and a few realistic use-case tests.
© The Authors, published by EDP Sciences, 2019
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.