Seminar | Mathematics and Computer Science

Data-Efficient Computing on Wafer-Scale Engine

CS Seminar

Abstract: Scientific simulations and modern artificial intelligence (AI) workloads generate data at a massive scale while running on increasingly specialized accelerators. Limited on-chip memory, expensive communication and complex dataflow patterns create fundamental bottlenecks that restrict performance and scalability.

In this talk, I will present our recent efforts to address these challenges through accelerator-aware lossy compression and domain-specific compiler support for wafer-scale architectures such as the Cerebras Wafer-Scale Engine (WSE). I will first introduce CereSZ, the first error-bounded lossy compressor co-designed with the WSE’s dataflow execution model. I will then describe the follow-up work, CereSZ2, which addresses load imbalance and offset computation through a fixed-size Huffman encoding scheme optimized for wafer-scale parallelism.
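
For readers new to the term, "error-bounded" means that every reconstructed value differs from the original by no more than a user-chosen tolerance. The minimal Python sketch below illustrates that general idea using uniform scalar quantization. It is not the CereSZ algorithm, which this abstract does not detail; the function names `quantize` and `dequantize`, the sample data and the tolerance are all hypothetical, chosen only for illustration.

```python
def quantize(values, error_bound):
    """Map each float to an integer code using bins of width 2 * error_bound."""
    step = 2.0 * error_bound
    return [round(v / step) for v in values]


def dequantize(codes, error_bound):
    """Reconstruct each value as the center of its quantization bin."""
    step = 2.0 * error_bound
    return [c * step for c in codes]


# Hypothetical sample data and tolerance, for illustration only.
data = [0.12, 3.75, -1.48, 2.0001, 0.0]
eb = 0.01

codes = quantize(data, eb)
restored = dequantize(codes, eb)

# Largest pointwise reconstruction error; it should not exceed eb
# (up to floating-point rounding).
max_err = max(abs(a - b) for a, b in zip(data, restored))
print(codes, max_err)
```

The integer codes are what a later lossless stage, such as an entropy coder, would compress. Rounding to the nearest bin center is what keeps the pointwise error within the tolerance.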

Finally, I will discuss our ongoing work on a domain-specific compression compiler that allows users to specify compression algorithms in high-level Python definitions and automatically generates optimized code for both CPU and WSE backends. Together, these efforts illustrate a broader agenda of algorithm–architecture co-design, aiming to make large-scale scientific and AI workloads more efficient, scalable and accessible across next-generation accelerator platforms.

Bio: Shihui Song is a Ph.D. candidate in Computer Science at the University of Iowa. Since 2023, she has been collaborating with Argonne on lossy compression and domain-specific compiler support for the Cerebras WSE.
