Engineers and scientists working in a heterogeneous computing environment often want to distribute the various tasks of an application – such as computation, storage, analysis, or visualization – to different types of resources and libraries. A common technique for doing so is the remote procedure call (RPC), which allows local calls to be transparently executed on remote resources.
Using RPC on a high-performance computing system presents two limitations, however: the inability to take advantage of the native network transport and the inability to transfer large amounts of data.
To avoid these limitations, researchers at Argonne National Laboratory, in collaboration with colleagues from the HDF Group in Champaign, Illinois, and from Queen’s University in Canada, have developed Mercury – an RPC interface specifically designed for high-performance computing.
“Mercury was the ancient Roman messenger god,” said Dries Kimpe, assistant computer scientist in Argonne’s Mathematics and Computer Science Division. “We thought the name appropriately indicated the speed with which our new software handles messages and communication.”
Mercury builds on a small, easily ported network abstraction layer, providing operations closely matched to the capabilities of high-performance network environments. Unlike most other RPC frameworks, Mercury directly supports handling remote calls containing large data arguments. Moreover, Mercury’s network protocol is designed to scale to thousands of clients.
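To make the idea of a call with a large data argument concrete, here is a minimal Python sketch. It is not Mercury's API, and the article does not describe Mercury's actual mechanism; the sketch assumes one common approach, in which the RPC request carries only a small descriptor and the server pulls the bulk data separately in chunks. All names (`ClientMemory`, `expose`, `server_checksum`, `CHUNK_SIZE`) are hypothetical, and the "remote read" is simulated in-process.

```python
from typing import Dict

CHUNK_SIZE = 4  # deliberately tiny so the chunked pull is visible


class ClientMemory:
    """Hypothetical registry of client buffers exposed for remote reads."""

    def __init__(self) -> None:
        self._regions: Dict[int, bytes] = {}
        self._next_handle = 1

    def expose(self, data: bytes) -> int:
        """Register a buffer and return a small integer handle for it."""
        handle = self._next_handle
        self._next_handle += 1
        self._regions[handle] = data
        return handle

    def read(self, handle: int, offset: int, length: int) -> bytes:
        """Stand-in for a remote read of part of an exposed buffer."""
        return self._regions[handle][offset:offset + length]


def server_checksum(request: dict, memory: ClientMemory) -> int:
    """Server-side handler: pulls the bulk argument chunk by chunk."""
    total = 0
    offset = 0
    while offset < request["size"]:
        chunk = memory.read(request["handle"], offset, CHUNK_SIZE)
        total += sum(chunk)
        offset += len(chunk)
    return total


memory = ClientMemory()
data = bytes(range(10))

# The request itself stays small: a name, a handle, and a size.
request = {"name": "checksum", "handle": memory.expose(data), "size": len(data)}
result = server_checksum(request, memory)
```

Under this assumption the request message stays the same size no matter how large the argument is, and the server decides when and how fast to pull the data.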
The research team demonstrated Mercury’s power on two high-performance computing systems: a QDR InfiniBand cluster and a Cray XE6. In bandwidth tests of bulk data transfers, for example, Mercury showed excellent scalability, with throughput either increasing or remaining stable as the number of clients increased.
Key to Mercury’s design is a generic interface that allows any function call to be transferred, avoiding the limitations of a hard-coded set of routines. Moreover, the network implementation of Mercury is abstracted, thus enabling efficient use of existing native transport mechanisms while allowing easy porting to future systems.
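The two design points above, a generic call interface and an abstracted network layer, can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not Mercury's interface: the class names (`RpcServer`, `RpcClient`, `LoopbackTransport`), the JSON encoding, and the in-process "transport" are all hypothetical choices made for illustration.

```python
import json
from typing import Any, Callable, Dict


class LoopbackTransport:
    """Hypothetical transport plugin: hands bytes to a handler in-process.

    A real plugin would wrap a native network interface instead.
    """

    def __init__(self, handler: Callable[[bytes], bytes]) -> None:
        self._handler = handler

    def send_and_receive(self, payload: bytes) -> bytes:
        return self._handler(payload)


class RpcServer:
    """Dispatches calls by name; no routine is hard-coded."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        self._functions[name] = func

    def handle(self, payload: bytes) -> bytes:
        request = json.loads(payload.decode("utf-8"))
        func = self._functions[request["name"]]
        result = func(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class RpcClient:
    """Encodes a call, sends it through whatever transport it was given."""

    def __init__(self, transport: LoopbackTransport) -> None:
        self._transport = transport

    def call(self, name: str, *args: Any) -> Any:
        payload = json.dumps({"name": name, "args": list(args)}).encode("utf-8")
        reply = self._transport.send_and_receive(payload)
        return json.loads(reply.decode("utf-8"))["result"]


server = RpcServer()
server.register("add", lambda a, b: a + b)
server.register("upper", lambda s: s.upper())

client = RpcClient(LoopbackTransport(server.handle))
total = client.call("add", 2, 3)
shout = client.call("upper", "rpc")
```

Because the client and server only see `send_and_receive`, swapping the loopback for a different transport would not change the registration or call code, which is the portability benefit the abstraction is meant to provide.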
J. Soumagne, D. Kimpe, J. Zounmevo, M. Chaarawi, Q. Koziol, A. Afsahi, and R. Ross, Mercury: Enabling Remote Procedure Call for High-Performance Computing, Cluster 2013.