Zhang, Junchao; Brown, Jed; Balay, Satish; Faibussowitsch, Jacob; Knepley, Matthew; Marin, Oana; Mills, Richard; Munson, Todd; Smith, Barry; Zampini, Stefano
PetscSF, the communication component of the Portable, Extensible Toolkit for Scientific Computation (PETSc), is designed to provide PETScs communication infrastructure suitable for exascale computers that utilize GPUs and other accelerators. PetscSF provides a simple application programming interface (API) for managing common communication patterns in scientific computations by using a star-forest graph representation. PetscSF supports several implementations based on MPI and NVSHMEM, whose selection is based on the characteristics of the application or the target architecture. An efficient and portable model for network and intra-node communication is essential for implementing large-scale applications. The Message Passing Interface, which has been the de facto standard for distributed memory systems, has developed into a large complex API that does not yet provide high performance on the emerging heterogeneous CPU-GPU-based exascale systems. In this article, we discuss the design of PetscSF, how it can overcome some difficulties of working directly with MPI on GPUs, and we demonstrate its performance, scalability, and novel features.