Researchers train machine learning (ML) models on data collected in the early stages of an experiment, on data from similar past experiments, or on data simulated for an upcoming experiment. These models learn the specific characteristics of the training data and are then applied to process subsequent data more efficiently. A key challenge is training ML models quickly enough to keep pace with the extremely high data rates at modern synchrotron and X-ray free-electron laser light source beamlines.
A team of Argonne researchers, led by DSL’s Zhengchun Liu, received the Best Paper Award at the 3rd Workshop on Extreme-Scale Experiment-in-the-Loop Computing (XLOOP) at the Supercomputing 2021 (SC21) conference, held in St. Louis in November. Their paper, titled “Bridging Data Center AI Systems with Edge Computing for Actionable Information Retrieval,” describes how specialized data center AI (DCAI) systems can quickly train models through a geographically distributed workflow. Experiments show that, although using remote DCAI systems for deep neural network training incurs data-movement costs and service overhead, the turnaround time is still less than one-thirtieth that of a locally deployed GPU.
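The trade-off at the heart of that result can be made concrete with a minimal sketch (not code from the paper; all timings are hypothetical, for illustration only): remote turnaround adds data-movement and service-overhead costs on top of training time, yet still wins when the remote DCAI system trains far faster than a local GPU.

```python
# Illustrative sketch of the turnaround comparison described above.
# The function names and all numeric values are hypothetical assumptions,
# not measurements from the XLOOP paper.

def turnaround_remote(transfer_s: float, overhead_s: float, train_s: float) -> float:
    """Total time when shipping data to a remote DCAI system:
    data movement + service overhead + remote training."""
    return transfer_s + overhead_s + train_s

def turnaround_local(train_s: float) -> float:
    """Total time when training on a locally deployed GPU."""
    return train_s

# Hypothetical numbers: fast remote training pays for the extra costs.
remote = turnaround_remote(transfer_s=45.0, overhead_s=15.0, train_s=30.0)  # 90 s total
local = turnaround_local(train_s=3600.0)                                    # 3600 s total

print(remote < local / 30)  # → True: remote wins despite movement/overhead
```

The design point is that the comparison only favors the remote path when the speedup of the DCAI hardware outweighs the fixed transfer and service costs, which is why the workflow matters at high-data-rate beamlines.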
Authors of the paper include Zhengchun Liu, Ahsan Ali, Peter Kenesei, Antonino Miceli, Hemant Sharma, Nicholas Schwarz, Dennis Trujillo, Hyunseung Yoo, Ryan Coffee, Naoufal Layad, Jana Thayer, Ryan Herbst, Chun Hong Yoon, and Ian Foster.