Abstract: In situ workflows that couple multiple applications with streaming data transfer have been proposed to avoid moving data via the shared file system. However, in situ workflows are difficult to configure because of the huge parameter space of possible data transfer settings and the complexity of modern applications and systems. In this talk, we propose an empirical model-based approach that uses active and transfer learning techniques to automatically optimize the parameter configuration of generic in situ workflows, given a database of isolated single-application runs. We show that our algorithms are effective for auto-tuning in situ workflows under constrained cost, that is, a fixed budget of time allocated to testing new configurations. As a by-product of this effort, we also report on ATA, an auto-tuner system based on the parallel programming language Swift/T that measures the performance of configurations and searches for the best one.
LANS Informal Seminar