https://doi.org/10.1051/epjconf/202429501021
Enabling Storage Business Continuity and Disaster Recovery with Ceph distributed storage
CERN, Esplanade des Particules 1, 1211 Geneva 23, Switzerland
* e-mail: enrico.bocchi@cern.ch
Published online: 6 May 2024
The Storage Group in the CERN IT Department operates several Ceph storage clusters with an overall capacity exceeding 100 PB. Ceph is a crucial component of the infrastructure delivering IT services to all users of the Organization, as it provides: i) block storage for OpenStack; ii) CephFS, used as persistent storage by containers (OpenShift and Kubernetes) and as a shared filesystem by HPC clusters; and iii) S3 object storage for cloud-native applications, monitoring, and software distribution across the WLCG.
The Ceph infrastructure at CERN is being rationalized and restructured to allow for the implementation of a Business Continuity/Disaster Recovery plan. In this paper, we give an overview of how we transitioned from a single cluster providing block storage to multiple clusters, enabling Storage Availability Zones, and how block storage backups can be achieved. We also illustrate future plans for file system backups through cback, a restic-based scalable orchestrator, and how S3 implements data immutability and provides a highly available, Multi-Data Centre object storage service.
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.