Accurately diagnosing patients and predicting their risk for complex diseases is hard. Artificial intelligence (AI) has the potential to help doctors perform these tasks with greater speed and reliability.
Using AI, one can generate models that health care providers can use to predict patients’ risk for heart disease, cancer and various other conditions. But to make AI models accurate, they must be trained using data from multiple providers.
While health care generates vast amounts of data year after year, most of it isn’t available for model training because of the need to protect identifiable patient information. With limited data access, AI models often aren’t as reliable in the real world, limiting how they can be used within health care.
To expand AI applications while still protecting patient data, the U.S. Department of Energy (DOE) has committed $1 million toward a one-year collaborative research project entitled “PALISADE-X: Privacy Preserving Analysis and Learning in Secure and Distributed Enclaves and Exascale Systems.” The goal of the project is to create a secure AI framework that enables health care organizations to improve AI models used in biomedicine while keeping sensitive data secure.
DOE’s Argonne National Laboratory is leading the project in collaboration with DOE’s Lawrence Livermore National Laboratory (LLNL), the University of Chicago, the Broad Institute and Massachusetts General Hospital. Researchers will also collaborate with the National Institutes of Health (NIH) to create a framework that supports Bridge2AI, an NIH program that’s developing new datasets that can be used with AI to improve health care.
“Our ultimate hope is to safely expand our ability to use AI models and high performance computing to further advance the field of biomedicine,” said Argonne computer scientist Ravi Madduri.
Today, AI has limited applications within health care and biomedicine because these techniques put data at risk for exposure.
“AI models can leak data. This means that someone can take a model you’ve developed and reconstruct the data that the model was trained with. This becomes a huge problem when you’re dealing with data that is sensitive and protected,” said Madduri.
Within biomedicine, data used to train models can include information that is protected under HIPAA regulations, such as a patient’s sex, age and race. So, to preserve privacy, organizations have avoided sharing their AI models or the data used to train them, instead training their models on the limited data available to them internally. But with this approach, organizations risk creating biased models, Madduri pointed out.
“Models are as biased as your data, and AI models that carry a lot of bias are not very effective in real-world situations,” he said.
PALISADE-X will deliver a framework that allows organizations to train AI models on data held across multiple institutions, all while keeping protected data secure. The framework will be developed as a software package for processing and securing data, along with advanced algorithms for federated learning, a form of machine learning that enables multiple organizations to collaboratively train a single model.
“In these algorithms, we’ll be incorporating differential privacy — state-of-the-art statistical techniques that can ensure privacy when you have multiple institutions training a model,” said Kibaek Kim, a computational mathematician at Argonne.
Kim and his team will develop the secure federated learning algorithms for the framework. The framework will also be integrated with AI and supercomputing resources and expertise at both Argonne and LLNL, which will enable researchers to train models more rapidly.
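To illustrate the general idea behind these techniques, the sketch below shows a toy round of federated learning with a differential-privacy-style step: each site clips its local update and adds calibrated noise before sharing it, so the server only ever sees privatized summaries, never the raw data. This is a minimal illustration of the concept, not the PALISADE-X algorithms themselves; the function names, the scalar "model update" (a simple mean) and the noise parameters are all hypothetical simplifications.

```python
import random

def clip(value, bound):
    """Clip a scalar update to a fixed bound, limiting any one record's influence."""
    return max(-bound, min(bound, value))

def local_update(data, clip_bound, noise_scale, rng):
    """One institution's privatized contribution: a clipped, noised
    summary statistic (here, just the mean of its local data)."""
    update = sum(data) / len(data)
    update = clip(update, clip_bound)
    # Gaussian noise calibrated to the clip bound -- the core
    # differential-privacy mechanism (parameters are illustrative)
    return update + rng.gauss(0.0, noise_scale * clip_bound)

def federated_round(datasets, clip_bound=10.0, noise_scale=0.1, seed=0):
    """Server averages the privatized updates; raw data never leaves a site."""
    rng = random.Random(seed)
    updates = [local_update(d, clip_bound, noise_scale, rng) for d in datasets]
    return sum(updates) / len(updates)

# Three hypothetical institutions, each holding private measurements
sites = [[4.8, 5.1, 5.3], [5.0, 4.9], [5.2, 5.4, 5.0, 4.7]]
print(federated_round(sites))
```

The noise means each shared update is deliberately imprecise, but averaging across sites still yields a useful aggregate close to the true overall mean, which is the trade-off differential privacy formalizes.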
Once the framework is built, the PALISADE-X team will demonstrate its efficacy using AI models that predict the severity of COVID-19, leveraging both public and private biomedical datasets. Researchers also plan to use the framework to predict the risk of developing cardiovascular diseases.
If successful, this work will make it possible for organizations to develop and confidently share their AI models with scientists and relevant research groups, all without the worry of leaking private information.
“When we can safely share more data, we can create better models that have less bias. And when these tools are put in the hands of care providers, it can fundamentally change how medicine is practiced,” Madduri concluded.
This project is sponsored by the Office of Advanced Scientific Computing Research within DOE’s Office of Science.
Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.
The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.