Argonne National Laboratory

Inspector-Executor Load Balancing Algorithms for Block-Sparse Tensor Contractions

David Ozog, University of Oregon
December 14, 2012 10:30AM to 11:30AM
Building 240, Room 4301
Efficient load balancing methods are required to obtain scalability in many scientific software applications. One such application is NWChem's coupled-cluster module, which allows for detailed study of chemical problems by iteratively solving the Schrödinger equation with an accurate ansatz. In this case, relevant task information can be obtained just before execution at negligible cost, which suggests that a static mapping of task groups to processors can be a simpler and more efficient alternative to centralized dynamic load balancing.

The distributed tensor contractions are block sparse, and an a priori inspection can quickly assign cost estimates to tasks based on characteristics such as their dimensions. Architecture-specific, empirically driven performance models of the dominant SORT and DGEMM routines serve as the cost estimator for a once-per-simulation static partitioning process. This inspector/executor technique has been shown to improve the NWChem coupled-cluster module’s execution time by as much as 50% at scale. The technique is applicable to any scientific application requiring load balance where performance models or estimates of kernel execution times are available.
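
The inspector/executor idea described above can be illustrated with a minimal sketch. This is not the NWChem implementation: the `Task` type, the `estimate_cost` formula, and the `flop_rate` and `sort_rate` constants are hypothetical placeholders standing in for the empirically fitted SORT and DGEMM models, and the greedy longest-processing-time heuristic is only one possible static partitioner, chosen here for brevity.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One block-level contraction C[m, n] += A[m, k] * B[k, n] (hypothetical type)."""
    task_id: int
    m: int
    n: int
    k: int


def estimate_cost(task, flop_rate=1.0e10, sort_rate=1.0e9):
    """Placeholder cost model in seconds; both rates are made-up assumptions.

    A real model would be fitted to measured SORT and DGEMM timings
    on the target architecture.
    """
    dgemm_time = 2.0 * task.m * task.n * task.k / flop_rate
    # Assume every element of the three blocks is permuted once.
    sort_elems = task.m * task.k + task.k * task.n + task.m * task.n
    sort_time = sort_elems / sort_rate
    return dgemm_time + sort_time


def inspect(tasks):
    """Inspector: attach an estimated cost to each task before execution."""
    return [(estimate_cost(t), t) for t in tasks]


def partition(costed_tasks, num_procs):
    """Static partitioner: give each task, largest first, to the least-loaded processor."""
    heap = [(0.0, p) for p in range(num_procs)]  # (accumulated load, processor)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}
    by_cost = sorted(costed_tasks, key=lambda ct: ct[0], reverse=True)
    for cost, task in by_cost:
        load, p = heapq.heappop(heap)
        assignment[p].append(task)
        heapq.heappush(heap, (load + cost, p))
    return assignment
```

In the executor phase, each processor would simply run the tasks in `assignment[rank]`, with no shared counter or central work queue to contend for. Calling `partition(inspect(tasks), num_procs)` once per simulation mirrors the once-per-simulation partitioning described in the abstract.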