The award recognizes Cappello “for pioneering contributions and inspiring leadership in distributed computing, high-performance computing, resilience, and data reduction.”
Cappello has a long history of groundbreaking work in parallel and distributed computing. His contributions range from theoretical modeling and optimization to novel algorithm design and high-performance implementations to production-quality software. His most recent research has focused on data compression.
“I am deeply honored to receive the Charles Babbage Award,” Cappello said. “I have always found it challenging – and rewarding – to devise ways to increase the productivity of scientists, whether through the analysis of data and code to uncover their properties, the design of innovative approaches to open computing problems, the development of experimental infrastructures and software technology, or the establishment of missing methodologies.”
Cappello’s early work includes Grid’5000, a large-scale, flexible experimental testbed for parallel and distributed computing. In the early 2000s, distributed systems experiments were difficult, if not impossible, to reproduce on large-scale distributed infrastructures. To address this issue, from 2003 to 2008 Cappello led the development of the first large-scale, deeply reconfigurable, controllable and measurable experimental infrastructure for distributed and parallel systems research capable of reproducing experimental conditions. Grid’5000 has served 6,000 users and produced more than 2,000 publications and 300 theses, and 20 years after its initiation it continues to be widely used by a vibrant community of over 500 researchers, making it an astonishing research contribution.
Cappello also has an outstanding track record of leadership in resilience strategies. His research on resilience spans most of the research topics in this domain: (i) failure characterization, modeling and prediction, (ii) fault-tolerant protocols, (iii) checkpointing environments, (iv) checkpointing scheduling and (v) detection of silent data corruptions, with a focus on performance and energy optimization. Cappello and his teams have produced more than 100 publications on resilience, covering fundamental contributions, algorithms and software. In 2019, Cappello received an R&D100 award for his contribution to checkpointing technologies, and in 2022, he received the HPDC Achievement Award for “pioneering contributions in methods, tools, and testbeds for resilient high-performance parallel and distributed computing.”
In addition to these achievements, Cappello has made fundamental contributions to the field of lossy compression for scientific data – an area critical to the success of exascale computers and next-generation scientific instruments. In this field, his research has covered the complementary aspects of data analysis, algorithm design, software engineering, performance optimization and methodology development. For example, he directed the development of SZ, the first multialgorithm and customizable lossy compression framework and one of the leading lossy compressors for scientific data. He also initiated and co-developed SDRBench, a set of reference scientific datasets routinely used by lossy compression researchers. Cappello and his team received an R&D100 award in 2021 for SZ3.
As a further mark of international impact for his research in parallel computing, Cappello has been named the recipient of the 2024 Euro-Par Achievement Award for “outstanding merit in parallel computing and an allegiance to the Euro-Par conference series.”
While the Babbage Award first and foremost recognizes significant contributions in parallel computation, it expresses “the hope that the winner will have also contributed to the parallel computation community through teaching, mentoring, or community service.” And indeed, Cappello’s outstanding research achievements are complemented by a strong record of professional community service. He has been chair or co-chair of major conference committees such as SC, Cluster, and High-Performance Distributed Computing. In 2018, Cappello was awarded the IEEE TCPP Award for Outstanding Service. In 2021, he received the IEEE Transactions on Computers (TC) Award for Editorial Service and Excellence for his role as associate editor of TC.
Cappello noted that Charles Babbage was always experimenting with new ideas to transform computing. “We are living in a fantastic period in science where AI will redefine how research is done and transform computing. Parallel computing plays a critical role here in making large model training and inference practical. How parallel computing can further accelerate large AI model training and inference and reduce their power consumption, and how AI can transform parallel computing, are exciting questions,” Cappello said.
For Cappello, questions like these make the high-performance parallel computing field fascinating, and he works hard to convey that passion to other researchers and engage their collaboration in the search for high-performance computing solutions.
The Babbage Award, which consists of a certificate and a $1,000 honorarium, will be presented at the IEEE International Parallel and Distributed Processing Symposium in San Francisco in May 2024.
The IEEE CS Charles Babbage Award was established in 2016 in memory of Charles Babbage, an English mathematician, philosopher, inventor and mechanical engineer who originated the concept of a programmable computer. For more information about the award, see the IEEE Computer Society website.