Abstract: Exponential increases in data volume and velocity are overwhelming finite human capabilities. Continued progress in science and engineering demands that we automate a broad spectrum of manual research data manipulation tasks, from data transfer and sharing to acquisition, publication, and analysis. These needs are particularly evident in large-scale experimental science, in which researchers are typically granted short periods of instrument time and must maximize experiment efficiency as well as output data quality and accuracy.
I will present work on infrastructure that supports the reliable automation of a broad range of scientific data management and analysis tasks. I will discuss several important components of this infrastructure: detecting file events on large-scale file systems, enabling user-friendly trigger-action programming, supporting reliable and secure remote execution at the Argonne Leadership Computing Facility, and providing a user-oriented automation orchestration platform. Finally, I will describe how these techniques have been applied to a variety of scientific use cases.
Bio: Ryan Chard joined Argonne in 2016 as a Maria Goeppert Mayer Fellow. His work focuses on developing cyberinfrastructure to enable scientific research. In particular, he works on Automate to streamline data analysis pipelines, DLHub to serve machine learning models on demand, and, most recently, funcX to enable function serving on high-performance computers. He has a Ph.D. in computer science and an M.S. from Victoria University of Wellington, New Zealand.