Argobots and BOLT: Lightweight Runtime Systems for Massive Fine-Grained Parallelism
Abstract: Over the last decade, exponential improvement in processor performance has been achieved by increasing the number of cores. To scale well on modern processors, applications must be decomposed and parallelized at a finer granularity so that each core is supplied with enough computation. However, as the number of cores increases, parallelization overhead in the runtime system is becoming significant and hinders scalability. Such overheads often stem from the heavyweight nature of threads provided by operating systems, yet widely used runtime systems still use OS-level threads as their parallel units.
To address this issue, we are developing a low-level lightweight threading library called Argobots, which is hundreds of times faster than conventional threads. Argobots exposes fine-grained control over threading features, scheduling, and synchronization, which promotes better interoperability with other programming models and trims threading overheads. We further investigate BOLT, a lightweight OpenMP runtime system built over Argobots, because most real-world multithreaded high-performance computing applications are parallelized with OpenMP, the most popular intranode parallel programming model. Because BOLT is ABI-compatible with leading commercial and open-source OpenMP runtime systems, applications can use Argobots lightweight threads via BOLT without modifying or recompiling existing programs. Our evaluation demonstrates that Argobots and BOLT improve performance by exploiting the inherent parallelism in several applications.
Bio: Shintaro Iwasaki is a Ph.D. student in the Department of Information and Communication Engineering at the University of Tokyo. He is a graduate researcher at Argonne, working on lightweight threading libraries and their applications. His research interests include parallel languages, compilers, runtime systems, and scheduling techniques for high-performance computing.