Seminar | Mathematics and Computer Science

Parallel Programming with Dependent Tasks: Dataflow and Multi-GPU BLAS Libraries

Abstract: Over the last few decades, supercomputer architectures have evolved significantly.

We observe an increase in the number and diversity of processing units per compute node, each capable of asynchronous execution. This makes writing portable, performant, and correct parallel applications increasingly challenging. To meet this challenge, many HPC parallel programming models have adopted dependent-task interfaces, such as OpenMP (2012), the CUDA graph API (2018), MPI partitioned communication (2021), and Level Zero (2024). In this talk, I will (1) briefly introduce fundamental concepts of parallel programming using dependent tasks and dataflow.

Then, I will (2) present in-progress work on XKBlas, a multi-GPU BLAS implementation that relies on dependent tasking with implicit dataflow under the hood. Finally, I will (3) conclude with research collaboration opportunities.

Bio: Romain PEREIRA has been a postdoctoral appointee in the Mathematics and Computer Science (MCS) division at Argonne National Laboratory since Jan. 2025. He completed his PhD in Nov. 2023 at the Commissariat à l’énergie atomique et aux énergies alternatives (CEA) in France, on mixing MPI+OpenMP using dependent tasks. He then held a postdoctoral position in 2024 at the Laboratoire de l’informatique du parallélisme (LIP) in Lyon, France. His research addresses the challenges of writing portable, performant, and correct parallel programs, with a current focus on multi-GPU architectures.

See all upcoming talks at https://www.anl.gov/mcs/lans-seminars.