Seminar | Mathematics and Computer Science

Democratizing High Performance Computing I/O Insights Using Large Language Models Agents

CS Seminar
Abstract: Modern high performance computing (HPC) systems are deployed with massive distributed storage subsystems to meet the demands of data-intensive applications. Due to their scale, these systems are difficult to understand and optimize, but they produce enormous volumes of logging data that could reveal useful insights.
 
However, deriving insights from such logs requires tailored analysis pipelines with multiple data gathering, processing, and visualization steps. Implementing these pipelines takes technical expertise and considerable human effort, creating a bottleneck on both the number and the depth of the insights that can be generated.
 
Recent progress in large language models (LLMs) offers a promising way to remove this manpower bottleneck by automating the most time- and effort-intensive steps of the process, thereby democratizing the generation of insights from large collections of job-level logs. To this end, we propose LDBAgent, an LLM agent capable of implementing entire analysis pipelines, including database querying, data processing, and visualization, from a single user prompt.
 
To evaluate the capabilities of LDBAgent, we also implement LDBBench, a micro-benchmark composed of realistic analysis tasks extracted from research papers. Our preliminary results show that LDBAgent is capable of reliably implementing nearly all LDBBench tasks without additional user input, positioning LDBAgent as a powerful assistant for system administrators as well as users.
 
Bio: Chris Egersdoerfer is a Ph.D. student in Computer Science at the University of Delaware. He is currently interning at Argonne as a W.J. Cody associate, hosted by Orcun Yildiz.
 
See upcoming and previous presentations at CS Seminar Series.