Article | Mathematics and Computer Science

Researchers win best paper award at high-performance distributed computing conference

“I need a space of my own!” How many times have we heard that said, or even said it ourselves?

Perhaps not surprisingly, then, a team of researchers from Argonne National Laboratory, RIKEN, and Intel Corporation has explored the concept with regard to a problem in high-performance computing – namely, sharing address space. Their results, presented at the ACM International Conference on High Performance Distributed Computing (HPDC) June 11–15, 2018, in Tempe, Arizona, received a best paper award.

HPDC is the premier conference for presenting new research relating to high-performance parallel and distributed systems used in both science and industry. The award-winning paper presents a new parallel execution model for many-core CPUs. The model, called process-in-process, or PiP, maps multiple processes into a single virtual address space.

The idea of sharing address space between multiple processes is not new. So why is a new model needed? The answer lies in advances in high-performance computing, notably many-core computers with more parallelism in a node and frequent communication between processes. Currently, two parallel execution models are widely used. The multiprocess model keeps each process's variables private, but exchanging data between processes then requires costly copying or messaging. The multithreaded model makes data exchange easy but requires that shared variables be protected from concurrent access.

“Unlike these models, PiP’s design is completely in user space,” said Min Si, an assistant computational scientist in Argonne’s Mathematics and Computer Science (MCS) division. She explained that with PiP each process still owns its process-private storage but can directly access the private storage of other processes in the same virtual address space.

“By removing the walls between processes, we gained the advantages of both the other models without the disadvantages,” said Pavan Balaji, a computer scientist in Argonne’s MCS division. “And by building a fence between threads, we made variables private to each thread, with no need for protection on shared variables.”

Another advantage of PiP is that, because it has only a small set of functions, it can be integrated with other runtime systems such as MPI and OpenMP. The researchers stress that PiP is not intended to replace MPI or OpenMP, however; rather, it can provide portable low-level support for users who are not satisfied with the other models and wish to boost communication performance in their applications.

The team evaluated PiP on several platforms, including two high-ranking supercomputers. The results show a 3.2x performance increase for a hybrid particle transport proxy application on 1,024 Knights Landing nodes and a nearly 30% reduction in slowdown ratio for the LAMMPS application with in situ analysis.

For the full paper presenting the design, implementation, and evaluation of PiP, see Atsushi Hori, Min Si, Balazs Gerofi, Masamichi Takagi, Jai Dayal, Pavan Balaji, and Yutaka Ishikawa, “Process-in-Process: Techniques for Practical Address-Space Sharing,” in HPDC ’18: Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, pp. 131–143.