Blasting Through the 10 PFlops Barrier with HACC on the BG/Q
Remarkable observational advances have established a compelling, cross-validated model of the Universe. Yet two key pillars of this model, dark matter and dark energy, remain mysterious. Sky surveys that map billions of galaxies to explore the "Dark Universe" demand a corresponding extreme-scale simulation capability; the HACC (Hybrid/Hardware Accelerated Cosmology Code) framework has been designed to deliver this level of performance now and into the future.
With its novel algorithmic structure, HACC allows flexible tuning across diverse architectures, including accelerated and multi-core systems. On the IBM BG/Q, HACC attains unprecedented scalable performance: currently 13.94 PFlops at 69.2 percent of peak and 90 percent parallel efficiency on 1,572,864 cores, with an equal number of MPI ranks and a concurrency of 6.3 million. This level of performance was achieved at extreme problem sizes, including a benchmark run with more than 3.6 trillion particles, significantly larger than any cosmological simulation yet performed. The largest-ever cosmological simulation science run is currently underway on Mira.
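The quoted figures can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of HACC. The sustained rate, fraction of peak, core count, and particle count come from the text above. The per-core hardware values (4 hardware threads, 1.6 GHz clock, 8 floating-point operations per cycle) are assumptions taken from published BG/Q specifications, and all function and constant names are hypothetical.

```python
# Figures quoted in the text above.
SUSTAINED_PFLOPS = 13.94
FRACTION_OF_PEAK = 0.692
CORES = 1_572_864
PARTICLES = 3.6e12  # "more than 3.6 trillion", so this is a lower bound

# Assumptions not stated in the text: published BG/Q per-core characteristics.
THREADS_PER_CORE = 4      # 4-way simultaneous multithreading
CLOCK_HZ = 1.6e9          # 1.6 GHz
FLOPS_PER_CYCLE = 8       # 4-wide fused multiply-add


def implied_peak_pflops(sustained_pflops: float, fraction: float) -> float:
    """Peak rate implied by a sustained rate and its fraction of peak."""
    return sustained_pflops / fraction


def hardware_peak_pflops(cores: int) -> float:
    """Theoretical peak from the assumed per-core hardware values."""
    return cores * CLOCK_HZ * FLOPS_PER_CYCLE / 1e15


def hardware_threads(cores: int) -> int:
    """Total hardware threads, one reading of the quoted 'concurrency'."""
    return cores * THREADS_PER_CORE


def particles_per_rank(particles: float, ranks: int) -> float:
    """Average particle load per MPI rank (one rank per core in the text)."""
    return particles / ranks


if __name__ == "__main__":
    print(f"implied peak:       {implied_peak_pflops(SUSTAINED_PFLOPS, FRACTION_OF_PEAK):.2f} PFlops")
    print(f"hardware peak:      {hardware_peak_pflops(CORES):.2f} PFlops")
    print(f"fraction of peak:   {SUSTAINED_PFLOPS / hardware_peak_pflops(CORES):.3f}")
    print(f"hardware threads:   {hardware_threads(CORES):,}")
    print(f"particles per rank: {particles_per_rank(PARTICLES, CORES):,.0f}")
```

Under these assumptions the numbers are mutually consistent: the implied and hardware peaks both come out near 20.1 PFlops, four threads per core gives 6,291,456 threads (the quoted 6.3 million), and each rank holds on average at least about 2.3 million particles.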