Reducing and Moving Light Source Data Intelligently
Light sources such as the Advanced Photon Source (APS) at the U.S. Department of Energy’s Argonne National Laboratory produce highly intense X-ray beams for probing microstructures and dynamic processes. The data generated during light source experiments can reach rates of terabits per second. Because this immense throughput exceeds the local processing capacity of such facilities, the data is typically offloaded to a high-performance computing platform. Argonne has established a dedicated terabit (a trillion bits) streaming connection between the APS and the Aurora supercomputer at the Argonne Leadership Computing Facility (ALCF). This connection allows for real-time processing of the massive datasets generated by experiments. The APS and ALCF are DOE Office of Science user facilities.
Why is data compression needed if a terabit streaming connection exists?
Data compression is still needed because the data generation speed at advanced light source facilities exceeds the data transfer capacity between the facility and the supercomputing system.
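The gap can be made concrete with a little arithmetic: the link keeps pace only if the data is shrunk by at least the ratio of the generation rate to the link capacity. The following Python sketch illustrates this with hypothetical rates; the function name and the 4 Tb/s figure are assumptions for illustration, not values reported by the APS team.

```python
def required_compression_ratio(generation_tbps: float, link_tbps: float) -> float:
    """Minimum compression ratio needed for the link to keep pace with the detector.

    Both rates are in terabits per second. A result of 1.0 means no
    compression is needed. The rates used below are hypothetical.
    """
    if generation_tbps <= 0 or link_tbps <= 0:
        raise ValueError("rates must be positive")
    return max(1.0, generation_tbps / link_tbps)


# Hypothetical example: a detector producing 4 Tb/s feeding a 1 Tb/s link.
ratio = required_compression_ratio(generation_tbps=4.0, link_tbps=1.0)
print(f"required compression ratio: {ratio:.1f}x")
```

With these illustrative numbers, the data would need to be compressed at least fourfold before transfer, and the compressor itself would have to run at least as fast as the detector produces data.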
What are the limitations of current data compression methods?
Both lossy and lossless compression methods are widely available, but they often are too slow or unable to distinguish the irregular regions typical of light source data. And techniques involving data partitioning and parallel processing across hundreds of nodes/cores, while producing promising compression ratios, are too expensive and may not scale efficiently.
How are these limitations being addressed?
To address these limitations, a team of scientists at Argonne, the University of Florida and the University of Iowa has developed lsCOMP, a compression technique that efficiently manages light source data within a single GPU kernel. Their research is reported in a paper published in SC’25: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2025.
To evaluate lsCOMP, the team collected thousands of light source images from two key light source applications:
- CSSI (Coherent Surface Scattering Imaging): Targets nanoscale static structures and is well suited for lossy compression methods.
- XPCS (X-ray Photon Correlation Spectroscopy): Focuses on the dynamics of materials at the nanoscale, for which lossless compression is more desirable.
For these applications, the team compared lsCOMP with several state-of-the-art GPU compressors on a single NVIDIA GPU: LZ4, ZSTD and Cascaded for lossless compression, and cuZFP, TConv and ZThresh for lossy compression. Several features of lsCOMP are noteworthy.
High throughput: Ideally, researchers want compression throughput that is higher than the data generation rate of the advanced light source. The results with lsCOMP were excellent. For example, it achieved throughput up to 20 times higher than that of the lossless compressors it was compared with.
“Compression throughput is considered crucial for light source facilities, speeding up analysis and enabling real-time processing,” said Yafan Huang, a visiting graduate student in Argonne’s Mathematics and Computer Science (MCS) division from the University of Iowa and first author of the paper.
Data quality: The researchers also evaluated the quality of the compressed data. Again, the results were impressive. For example, with a dataset consisting of 600 two-dimensional images and millions of random granular patterns (called speckles) common in light source images, lsCOMP preserved critical data features, while cuZFP had trouble distinguishing high- and low-frequency components, leading to reduced data quality.
“Preserving data quality with compression is a key concern for scientists,” said Sheng Di, a computational scientist in Argonne’s MCS division and Huang’s co-advisor. “Different light source applications have different data quality requirements. lsCOMP showed superior performance in both visualization and speckle analysis.”
Configurability: In light source applications, supporting only lossy or only lossless compression can be a significant drawback. In lsCOMP, however, each compression step is modularized, supporting both lossless and lossy compression.
“lsCOMP provides a flexible range of compression options, from robust lossless methods to lossy techniques that achieve higher compression ratios without sacrificing data integrity,” said Peco Myint, a software engineering specialist in Argonne’s X-ray Science division. “By allowing users to configure these methods based on their specific light source data fidelity and network conditions, we can significantly optimize both resources and transfer times.”
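One way to picture a modular design of this kind is a pipeline in which an optional lossy stage sits in front of a lossless encoding stage. The Python sketch below is only an illustration under stated assumptions: it runs on the CPU, uses the standard-library zlib as a stand-in encoder, and the function names and the quantization step are hypothetical. It is not lsCOMP’s algorithm or API.

```python
import zlib
from array import array


def quantize(pixels, step):
    """Lossy stage (hypothetical): round each value down to a multiple of `step`.

    The error introduced per pixel is always smaller than `step`.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return [p // step * step for p in pixels]


def compress(pixels, lossy_step=None):
    """Compress unsigned 16-bit pixel values.

    lossy_step=None keeps the pipeline lossless; an integer enables the
    quantization stage before the (always lossless) zlib encoding stage.
    """
    if lossy_step is not None:
        pixels = quantize(pixels, lossy_step)
    raw = array("H", pixels).tobytes()  # pack as unsigned 16-bit values
    return zlib.compress(raw)


def decompress(blob):
    """Invert the encoding stage; quantization, if applied, is not undone."""
    out = array("H")
    out.frombytes(zlib.decompress(blob))
    return list(out)


# Made-up, sparse photon-count-like data (mostly zeros with a few hits).
pixels = [0, 0, 3, 0, 1, 0, 0, 17, 0, 2] * 100

lossless_blob = compress(pixels)
assert decompress(lossless_blob) == pixels  # exact round trip

lossy_blob = compress(pixels, lossy_step=4)
restored = decompress(lossy_blob)
assert all(0 <= a - b < 4 for a, b in zip(pixels, restored))  # bounded error

print("lossless bytes:", len(lossless_blob), "lossy bytes:", len(lossy_blob))
```

Because the lossy stage is a separate, optional step, the same encoder serves both modes, and a user can choose per dataset whether exact reconstruction or a bounded error is acceptable.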
What’s next?
To demonstrate the hardware compatibility of lsCOMP, the research team ran its lossless compression strategy on three different GPU architectures and achieved comparable compression throughput on all of them.
“One of our next steps will be to design compression algorithms for light source with other heterogeneous processors, such as FPGA and AI chips,” Di said.
For further information, see Y Huang, S Di, R Underwood, P Myint, M Chu, G Li, N Schwarz, F Cappello. lsCOMP: Efficient Light Source Compression, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 2006-2023, 2025. https://dl.acm.org/doi/epdf/10.1145/3712285.3759814
Argonne National Laboratory seeks solutions to pressing national problems in science and technology by conducting leading-edge basic and applied research in virtually every scientific discipline. Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.
The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.