Abstract: The slowdown of Moore’s Law and the growth of compute-intensive workloads such as artificial intelligence have pushed the development of high-performance computing (HPC) toward accelerator-based systems. Among supercomputers in the Top 500 list, the share of accelerator FLOPs has grown from 20% in 2010 to 76% in 2018. As part of this trend, many libraries and application codes for solving partial differential equations (PDEs) have been ported and adapted to run on accelerator hardware such as graphics processing units (GPUs). Consequently, it has become imperative for PDE-constrained optimization algorithms to also run on, and exploit the capabilities of, the same hardware used by the underlying PDE solvers.
In this work, we begin an investigation into the use of quasi-Newton (QN) methods on GPUs. QN approximations are among the most popular gradient-based kernels for solving large-scale nonlinear systems of equations and are widely used both in continuous optimization and in the solution of PDEs. We implement both matrix-free and compact dense representations of popular QN methods in PETSc/TAO and leverage PETSc data structure abstractions to profile QN performance on both CPUs and GPUs using MPI and ViennaCL backends.
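To make the term "matrix-free" concrete, the following is a minimal sketch of the standard L-BFGS two-loop recursion, one common matrix-free QN kernel. It is written in plain Python for illustration only and is not the PETSc/TAO implementation; the names `dot` and `lbfgs_apply_inverse` are hypothetical, and the scaled-identity initial approximation is an assumed (though common) choice.

```python
def dot(a, b):
    """Inner product of two equal-length vectors stored as lists."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_apply_inverse(grad, s_list, y_list):
    """Apply the L-BFGS inverse-Hessian approximation to `grad`.

    s_list[i] and y_list[i] are the stored step and gradient-difference
    vectors, ordered oldest to newest. No matrix is ever formed: the
    product is built from dot products and vector updates only.
    """
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = [0.0] * len(s_list)

    # First loop: newest pair to oldest pair.
    for i in reversed(range(len(s_list))):
        alphas[i] = rhos[i] * dot(s_list[i], q)
        q = [qj - alphas[i] * yj for qj, yj in zip(q, y_list[i])]

    # Assumed initial approximation: a scaled identity, gamma * I.
    if s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = [gamma * qj for qj in q]

    # Second loop: oldest pair to newest pair.
    for i in range(len(s_list)):
        beta = rhos[i] * dot(y_list[i], r)
        r = [rj + (alphas[i] - beta) * sj for rj, sj in zip(r, s_list[i])]
    return r
```

Every operation in the sketch is a dot product or a scaled vector addition, which is why kernels of this kind map naturally onto whatever vector backend the surrounding solver already uses.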