https://doi.org/10.1051/epjconf/202429507003
Analyzing, Identifying & Alerting on Network Issues
1 University of Michigan Physics, Ann Arbor, MI, USA
2 European Organization for Nuclear Research (CERN), Geneva, Switzerland
3 Physics Department, University of Chicago, Chicago, IL, USA
4 Faculty of Mathematics and Informatics, University of Plovdiv, Bulgaria
Published online: 6 May 2024
The Worldwide LHC Computing Grid (WLCG) relies on the network as a critical part of its infrastructure. It therefore needs to guarantee effective network usage and the prompt detection and resolution of network issues, including connection failures, congestion, and traffic routing problems. In this paper, we describe our ongoing work to proactively analyze, correlate, and alert on various network and infrastructure issues. We discuss the methods and techniques applied, the systems developed, and the challenges with the measurements that make it difficult to identify problems or to attribute them to the appropriate location(s).
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.