https://doi.org/10.1051/epjconf/202125102037
CERN Tape Archive: a distributed, reliable and scalable scheduling system
1 CERN—European Organization for Nuclear Research, 1211 Geneva 23, Switzerland
2 Institute for High Energy Physics named by A.A. Logunov of National Research Center “Kurchatov Institute”, Nauki Square 1, Protvino, Moscow region, Russia, 142281
* e-mail: eric.cano@cern.ch, vladimir.bahyl@cern.ch, cedric.caffy@cern.ch, german.cancio.melia@cern.ch, michael.davis@cern.ch, julien.leduc@cern.ch, oliver.keeble@cern.ch, steven.murray@cern.ch
** e-mail: viktor.kotliar@ihep.ru
Published online: 23 August 2021
The CERN Tape Archive (CTA) provides a tape backend to disk systems and, in conjunction with EOS, manages the data of the LHC experiments at CERN.
Magnetic tape storage offers the lowest cost per unit volume today, followed by hard disks and flash. In addition, current tape drives deliver solid bandwidth (typically 360 MB/s per device), but at the cost of high latencies, both for mounting a tape in the drive and for positioning when accessing non-adjacent files. As a consequence, the transfer scheduler should queue transfer requests until the queued volume is large enough to warrant a tape mount. Despite these transfer latencies, user-interactive operations should have low latency.
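The queue-then-mount idea can be illustrated with a minimal sketch. It is not CTA's actual implementation: the class and function names, the byte threshold, the age-based fallback and the 60 s overhead used in the usage note are all assumptions made for illustration. Only the 360 MB/s drive bandwidth comes from the text above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TapeQueue:
    """Pending transfer requests for one tape (hypothetical structure)."""
    queued_bytes: int = 0
    oldest_request_time: Optional[float] = None
    files: List[str] = field(default_factory=list)


class MountScheduler:
    """Queues requests per tape and reports which tapes warrant a mount.

    Assumption: a mount is warranted once the queued volume reaches
    min_bytes, or once the oldest request has waited max_age_s seconds.
    Both criteria and their values are illustrative.
    """

    def __init__(self, min_bytes: int, max_age_s: float):
        self.min_bytes = min_bytes
        self.max_age_s = max_age_s
        self.queues = defaultdict(TapeQueue)

    def enqueue(self, tape_id: str, file_id: str, size_bytes: int, now: float) -> None:
        """Record a request without triggering any tape operation."""
        q = self.queues[tape_id]
        q.files.append(file_id)
        q.queued_bytes += size_bytes
        if q.oldest_request_time is None:
            q.oldest_request_time = now

    def tapes_to_mount(self, now: float) -> List[str]:
        """Return the tapes whose queue is large enough or old enough."""
        ready = []
        for tape_id, q in self.queues.items():
            if not q.files:
                continue
            age = now - q.oldest_request_time
            if q.queued_bytes >= self.min_bytes or age >= self.max_age_s:
                ready.append(tape_id)
        return ready

    def pop_batch(self, tape_id: str) -> List[str]:
        """Hand the whole queue of a tape to a drive and forget it."""
        return self.queues.pop(tape_id).files


def drive_efficiency(volume_bytes: float, overhead_s: float,
                     bandwidth_bps: float = 360e6) -> float:
    """Fraction of a mount session spent transferring data.

    overhead_s stands for mount plus positioning time (assumed value,
    supplied by the caller); bandwidth defaults to 360 MB/s.
    """
    transfer_s = volume_bytes / bandwidth_bps
    return transfer_s / (transfer_s + overhead_s)
```

Under these assumptions, a drive with 60 s of overhead per mount spends half of the session transferring data when 21.6 GB is queued (`drive_efficiency(21.6e9, 60.0)`), which is why small queues are better left to accumulate before mounting.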
The scheduling system for CTA was built from the experience gained with CASTOR. Its implementation ensures reliability and predictable performance, while simplifying development and deployment. As CTA is expected to be used for a long time, lock-in to vendors or technologies was minimized.
Finally, quality assurance systems were put in place to validate reliability and performance while allowing fast and safe development turnaround.
© The Authors, published by EDP Sciences, 2021
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.