https://doi.org/10.1051/epjconf/202429511025
XkitS: A computational storage framework for high energy physics based on EOS storage system
1 Institute of High Energy Physics, CAS, 100049 Beijing, China
2 University of Chinese Academy of Sciences, 100049 Beijing, China
3 Tianfu Cosmic Ray Research Center, Institute of High Energy Physics, Chinese Academy of Sciences, 610041 Chengdu, China
* e-mail: chengys@ihep.ac.cn
** e-mail: biyujiang@ihep.ac.cn
Published online: 6 May 2024
Large-scale high-energy physics experiments generate scientific data at the petabyte or even exabyte scale, and processing it requires high-performance data I/O. In large computing centers, however, computing and storage devices are typically separated, so large-scale data transfer has become a bottleneck for some data-intensive computing tasks, such as data encoding and decoding, compression, and sorting. The time spent on data transfer can account for 50% of the entire computing task, and this cost grows with the amount of data accessed. One attractive solution is to offload a portion of the data processing to the storage layer. However, modifying traditional storage systems to support computation offloading is often cumbersome and requires a broad understanding of their internals. We have therefore designed a flexible software framework called XkitS, which builds a computational storage system by extending the existing EOS storage system. The framework is deployed on EOS FST storage servers and offloads computational tasks by invoking the computing resources (CPU, FPGA, etc.) available on the FST. It has been tested and applied in data processing for the Large High Altitude Air Shower Observatory (LHAASO), and the results show that data decoding with the computational storage technology takes half the time of the original method.
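The relation between the 50% transfer share and the halved decoding time can be illustrated with a back-of-the-envelope model. The sketch below is not part of XkitS. The function name `offload_speedup` and the parameter `result_fraction` are hypothetical, and the model assumes that transfer and computation do not overlap and that the storage server computes as fast as the compute node.

```python
def offload_speedup(transfer_fraction, result_fraction=0.0):
    """Estimate the speedup from offloading a task to the storage layer.

    transfer_fraction: share of the original task time spent moving
        input data from storage to the compute node (0 <= value < 1).
    result_fraction: share of the original transfer time still needed
        to ship the (smaller) result back after offloading (0 to 1).

    Assumptions (illustrative only): transfer and compute are not
    overlapped, and compute time is unchanged on the storage server.
    """
    if not 0.0 <= transfer_fraction < 1.0:
        raise ValueError("transfer_fraction must be in [0, 1)")
    if not 0.0 <= result_fraction <= 1.0:
        raise ValueError("result_fraction must be in [0, 1]")

    # Original task time is normalised to 1.0.
    compute_time = 1.0 - transfer_fraction
    remaining_transfer = transfer_fraction * result_fraction
    offloaded_time = compute_time + remaining_transfer
    return 1.0 / offloaded_time


if __name__ == "__main__":
    # Transfer is 50% of the task and is removed entirely by offloading.
    print(offload_speedup(0.5))
    # Same task, but 20% of the original transfer cost remains.
    print(offload_speedup(0.5, result_fraction=0.2))
```

Under these assumptions, removing a transfer share of 50% halves the task time, which is the same order as the LHAASO decoding result quoted above. Any residual transfer of results reduces the gain accordingly.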
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.