Cappello delivers a keynote address on data compression

August 9, 2018

Franck Cappello, a senior computer scientist in Argonne’s Mathematics and Computer Science Division, gave a keynote address at the 17th IEEE International Symposium on Parallel and Distributed Computing.

The meeting, held in June 2018 in Geneva, Switzerland, focused on new approaches for the modeling, design, analysis, evaluation, and programming of future parallel and distributed computing systems and applications.

“Extreme-scale scientific simulations and experiments on scientific instruments are already generating more data than can be communicated, stored, and analyzed,” said Cappello, “and the data flood will get even worse with future exascale systems and scientific instruments updated with higher-definition sensors.”

The solution is data reduction. But the current approach of dropping data, often drastically, eliminates data of importance to scientific analysis. Another approach is to compress the dataset with lossless compression algorithms, but this doesn’t provide enough data reduction for scientific datasets.

“Lossy compression seems the only practical and effective direction,” said Cappello. The technique represents the content by using approximations and discarding part of the data. In his presentation, Cappello discussed the effectiveness of current state-of-the-art lossy compression algorithms.

He also discussed several questions facing developers of these novel algorithms. What are the right metrics to assess compression quality? How can we control the compression to match user-set bounds? How do errors introduced during compression affect the subsequent steps of the application?
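
To make the questions about quality metrics and user-set bounds concrete, here is a minimal Python sketch of one simple form of lossy reduction: uniform quantization under an absolute error bound. It is an illustration only, not one of the algorithms covered in the keynote. The function names, the sample data, and the bound value are all assumptions chosen for this example, and the two metrics shown (maximum absolute error and PSNR) are common choices rather than ones the talk is known to have endorsed.

```python
import numpy as np


def quantize(data, error_bound):
    """Map each value to the index of a bin of width 2 * error_bound.

    Hypothetical helper for illustration. Rounding to the nearest bin
    center keeps each value within error_bound of its reconstruction
    (up to floating-point rounding).
    """
    return np.round(data / (2.0 * error_bound)).astype(np.int64)


def dequantize(codes, error_bound):
    """Rebuild approximate values from the integer bin indices."""
    return codes * (2.0 * error_bound)


def max_abs_error(original, reconstructed):
    """Largest pointwise deviation: checks a user-set absolute bound."""
    return float(np.max(np.abs(original - reconstructed)))


def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB, using the data's value range."""
    value_range = float(np.max(original) - np.min(original))
    mse = float(np.mean((original - reconstructed) ** 2))
    if mse == 0.0:
        return float("inf")
    return 20.0 * np.log10(value_range) - 10.0 * np.log10(mse)


# Assumed stand-in for a smooth scientific field.
data = np.sin(np.linspace(0.0, 10.0, 10_000))
error_bound = 1e-3  # assumed user-set absolute error bound

codes = quantize(data, error_bound)
restored = dequantize(codes, error_bound)

print("distinct codes:", np.unique(codes).size, "of", data.size, "values")
print("max abs error:", max_abs_error(data, restored))
print("PSNR (dB):", psnr(data, restored))
```

The integer codes take far fewer distinct values than the original floating-point data, so a lossless coder applied afterward has much more redundancy to exploit. The third question, how the introduced error affects later analysis steps, cannot be answered by these metrics alone and depends on the application.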

For information about the ISPDC 2018 meeting, see the website http://lsds.hesge.ch/ISPDC2018/ .