Abstract: I/O is a well-established pain point for many scientific applications running on high-performance computing (HPC) systems. These systems deploy massive, state-of-the-art storage subsystems that applications use to manage their scientific data, typically through an increasingly deep and complex I/O software stack. This complexity poses a significant challenge to users, who are often ill-equipped to understand and improve their applications' I/O workloads, which in turn hampers system efficiency and scientific productivity.
In this talk, I will present Darshan, an I/O characterization and analysis tool used by application users, facility staff, and researchers at HPC centers around the world to better understand and improve storage access patterns. I will focus on recent advancements in Darshan’s instrumentation capabilities and on recent efforts to develop a Python-based analysis framework for Darshan log data. I will also discuss future directions for Darshan, so that it can remain a fundamental tool for understanding HPC I/O as scientific computing continues to evolve.