Evolution of monitoring, accounting and alerting services at INFN-CNAF Tier-1
2 Università di Bologna, Sede di Cesena, Italy
* e-mail: email@example.com
Published online: 17 September 2019
CNAF is the national center of INFN (Italian National Institute for Nuclear Physics) for information technology services. The Tier-1 data center operated at CNAF offers computing and storage resources to scientific communities such as those working on the four experiments of the LHC (Large Hadron Collider) at CERN and on 30 other experiments in which INFN is involved. In past years, monitoring and alerting services for Tier-1 resources were provided by several software tools, such as LEMON (developed at CERN and customized for the characteristics of data centers managing scientific data), Nagios (used especially for alerting purposes) and a system based on the Graphite database together with other ad-hoc services and web pages. By 2015, a task force had been organized with the purpose of defining and deploying a common infrastructure (based on Sensu, InfluxDB and Grafana) to be used by the different CNAF departments. Once the new infrastructure was deployed, a major task was to adapt all the monitoring and alerting services to it. We present the steps that the Tier-1 group followed in order to accomplish a full migration, which is now complete, with all the new services in production. In particular, we show the redesign of the monitoring sensors and alerting checks to adapt them to the infrastructure based on the Sensu software, the creation of web dashboards for data presentation, and the porting of historical data from LEMON/Graphite to InfluxDB.
© The Authors, published by EDP Sciences, 2019
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.