Using Discrete-Event Simulation for HPC System Co-Design
Abstract: Because production high-performance computing (HPC) systems offer limited opportunities for experimenting with system design, modeling and simulation are becoming popular choices for HPC design space exploration. In particular, parallel discrete-event simulation (PDES) can be used in a number of ways to answer "what-if" design questions.
In this talk, we focus on applying PDES to HPC interconnects, which are a key determinant of application performance. We describe our experiences in using the CODES discrete-event simulation framework to answer three questions:
- When using a dragonfly network, what is the impact of communication interference and of job and data placement policies on application performance?
- What insights does PDES give us into modern interconnect architectures such as dragonfly, fat tree, slim fly, and torus networks?
- How can PDES be used to assess communication algorithms and the impact of network routing algorithms?
The talk will conclude by identifying ways in which simulation can help application users and HPC centers carry out effective design space exploration.
Bio: Misbah Mubarak is a postdoctoral researcher in the Mathematics and Computer Science Division. Her research focuses on modeling and simulation, high-performance networks, and data-intensive computing. Misbah received her Ph.D. and master's degrees in computer science from Rensselaer Polytechnic Institute in 2015 and 2011, respectively.