Accelerating Science Impact through Big Data Workflow Management and Supercomputing
1 Physics Department, University of Texas at Arlington, 502 Yates Street, Arlington, TX 76019-0059, USA
2 Physics Department, Brookhaven National Laboratory, Upton, Long Island, New York 11973, USA
3 Kurchatov complex of NBIC-technologies, National Research Centre Kurchatov Institute, 1 Akademika Kurchatova pl., Moscow 123182, Russia
4 Laboratory of Information Technologies, Joint Institute for Nuclear Research, Dubna, Moscow region, 141980, Russia
a e-mail: Ruslan.Mashinistov@cern.ch
Published online: 9 February 2016
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. The PanDA (Production and Distributed Analysis) Workload Management System is used to manage the workflow for all data processing across hundreds of data centers. An ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF), is being realized within the BigPanDA and megaPanDA projects. These projects are now exploring how PanDA might be used to manage computing jobs that run on supercomputers, including OLCF's Titan and NRC-KI HPC2. The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. The next generation of PanDA will allow many data-intensive sciences that employ a variety of computing platforms to benefit from ATLAS experience and proven tools in highly scalable processing.
© Owned by the authors, published by EDP Sciences, 2016
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.