Min Si has received the 2016 NERSC Award for Innovative Use of HPC in the Early Career category for her work on “novel system software in the context of MPI-3 one-sided communication.” The award is presented annually by the National Energy Research Scientific Computing Center to recognize extraordinary scientific achievement from NERSC users and to encourage the innovative use of NERSC’s high-performance computing and data systems.
Min is a Ph.D. student at the University of Tokyo and has worked in Argonne’s Mathematics and Computer Science Division as a guest graduate student. She will rejoin the division in May as an Enrico Fermi postdoctoral fellow.
Min’s research tackles the question of how to handle complex one-sided communication semantics. With traditional thread-based approaches, every process creates a separate background “helper thread”; the disadvantage of this approach is that it can waste up to 50% of computing cores. An alternative interrupt-based approach assumes all hardware resources are busy and uses hardware interrupts to “awaken” a kernel thread to handle an incoming message; the overhead of frequent interrupts can make this approach costly.
Min’s solution is Casper – a process-based approach that dedicates an arbitrary number of cores to so-called ghost processes.
“Since more and more cores are being embedded into advanced computing systems, some of these cores may not always be busy performing computation,” said Min. “Hence it may be more efficient to allow users to dedicate some of them to perform progress on asynchronous communication.”
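The trade-off between the two approaches can be illustrated with simple core-budget arithmetic. The sketch below is not taken from Casper itself. The function names and the 24-core node are hypothetical, and it assumes the worst case for the thread-based approach, in which every helper thread is pinned to its own core.

```python
def compute_cores_thread_based(cores_per_node: int) -> int:
    """Cores left for computation when each process pins a helper thread
    to its own core (assumed worst case: one helper per compute process)."""
    # Every process/helper pair occupies two cores, so half go to helpers.
    return cores_per_node // 2


def compute_cores_ghost_based(cores_per_node: int, ghosts_per_node: int) -> int:
    """Cores left for computation when a user-chosen number of cores
    is dedicated to ghost processes (hypothetical parameter name)."""
    if not 0 <= ghosts_per_node < cores_per_node:
        raise ValueError("ghosts_per_node must be in [0, cores_per_node)")
    return cores_per_node - ghosts_per_node


def fraction_dedicated(cores_per_node: int, compute_cores: int) -> float:
    """Fraction of a node's cores that do no application computation."""
    return (cores_per_node - compute_cores) / cores_per_node


if __name__ == "__main__":
    cores = 24  # hypothetical node size, chosen only for illustration
    threads = compute_cores_thread_based(cores)
    ghosts = compute_cores_ghost_based(cores, ghosts_per_node=2)
    print(f"thread-based: {threads} compute cores, "
          f"{fraction_dedicated(cores, threads):.0%} dedicated to progress")
    print(f"ghost-based:  {ghosts} compute cores, "
          f"{fraction_dedicated(cores, ghosts):.0%} dedicated to progress")
```

Under these assumptions, the thread-based layout gives up half of the node, whereas the ghost-process layout gives up only as many cores as the user chooses to dedicate.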
Experiments have proved her correct. For example, in a study of a widely used quantum chemistry application involving a large water problem, Casper showed consistent improvement as the number of cores increased.
The award was officially announced at the NERSC user group meeting on March 22 at Lawrence Berkeley National Laboratory. Min also gave a remote presentation on the science and computational aspects of her work.
The work has also been highlighted on the U.S. Department of Energy ASCR Discovery website: http://ascr-discovery.science.doe.gov/2016/03/ghost-writer/.
For details about the performance improvement achieved by Casper on the NWChem application on NERSC’s Edison supercomputer, see the paper by M. Si, A. J. Pena, J. Hammond, P. Balaji, and Y. Ishikawa, “Scaling NWChem with Efficient and Portable Asynchronous Communication in MPI RMA,” in CCGrid 2015, Shenzhen, 2015.
For a description of the general concept of Casper, see the paper by M. Si, A. J. Pena, J. Hammond, P. Balaji, M. Takagi, and Y. Ishikawa, “Casper: An Asynchronous Progress Model for MPI RMA on Many-Core Architectures,” in 29th IEEE International Parallel & Distributed Processing Symposium, Hyderabad, India, 2015.