Earth Systems Analysis on Big Data
Nexus
The Earth System Grid Federation (ESGF2) project maintains an enormous collection of environmental science data and an extensive infrastructure for querying, accessing, and analyzing that data. ESGF2 manages 7.5 petabytes of data hosted at the Argonne Leadership Computing Facility (ALCF) on storage attached to Polaris and other ALCF systems. ESGF and ALCF have been working together to enable analysis of this data using a combination of ESGF tools and Globus.
ESGF interfaces allow users to query for data of interest, reduce the results to a relevant subset, and schedule analysis on DOE facility systems. From within an ESGF web application or a Jupyter notebook, users can query and manipulate data in the manner they are accustomed to, and the resulting analyses can be exported to run on target systems chosen according to data availability, queue wait time, and other user-specified factors. Read more about ESGF2.
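The target-selection idea described above can be illustrated with a minimal Python sketch. This is not ESGF or Globus code: the `TargetSystem` class, the `choose_target` function, the system names, and the wait times are all hypothetical, and the rule used here (prefer systems that already hold the data, then take the shortest estimated queue wait) is only one plausible way to combine the factors mentioned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetSystem:
    """Hypothetical description of a system that could run an analysis."""
    name: str
    has_data: bool             # is the requested subset already stored there?
    queue_wait_minutes: float  # estimated wait before the job would start


def choose_target(systems, max_wait_minutes=None):
    """Return the eligible system with the shortest estimated queue wait.

    A system is eligible if it already holds the data and, when a limit is
    given, its estimated wait does not exceed max_wait_minutes.
    Returns None if no system qualifies.
    """
    eligible = [
        s for s in systems
        if s.has_data
        and (max_wait_minutes is None or s.queue_wait_minutes <= max_wait_minutes)
    ]
    return min(eligible, key=lambda s: s.queue_wait_minutes, default=None)


# Illustrative values only; these are not real systems or measurements.
candidates = [
    TargetSystem("system-a", has_data=True, queue_wait_minutes=45.0),
    TargetSystem("system-b", has_data=False, queue_wait_minutes=5.0),
    TargetSystem("system-c", has_data=True, queue_wait_minutes=20.0),
]

best = choose_target(candidates)
print(best.name if best else "no eligible system")
```

In this sketch, a system without a local copy of the data is skipped even if its queue is short. A real scheduler might instead weigh the cost of transferring the data against the time saved in the queue.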