Lecture | Computing, Environment and Life Sciences

Building an End-to-End Data Ecosystem to Support Materials Science Research

CELS Summer Student Lecture

Abstract: As researchers acquire large quantities of scientific data from experiment and simulation, it becomes critically important to provide simple interfaces and services that help individual researchers, research groups, and large facilities automate the capture and description of their key datasets. Automating these steps will, in turn, make these digital assets more accessible, shareable, reusable, and machine-consumable.

In this talk, we highlight three projects: the Materials Data Facility, DLHub, and the ALCF data service. We describe how these data services combine to form an ecosystem that leads toward more effective capture and organization of data; discovery and interrogation of relevant data; use of leadership-scale computing facilities for analysis and machine learning; and association of machine learning models with data collections for improved reproducibility and simpler deployment at scale.