To find weapons to fight the coronavirus, scientists used the nation’s fastest supercomputer to peer inside the intricacies of how the virus reproduces itself.
“Think of it as like a Swiss watch, with precisely organized enzymes and nanomachines that come together like tiny gears to perform this function,” said Arvind Ramanathan, a computational scientist at the U.S. Department of Energy’s (DOE) Argonne National Laboratory and the study’s lead author. “If we could find ways to block or gum up the copying process, we could discover new drugs to attack the virus. But first we had to better understand it.”
The study earned the multi-institutional team a finalist nomination for the Association for Computing Machinery (ACM) Gordon Bell Special Prize for High Performance Computing–Based COVID-19 Research. The prize will be presented at this year’s SC21 conference, Nov. 14–19.
The coronavirus relies on a precisely coordinated molecular machine known as the replication-transcription complex to reproduce at high speed when it invades a host’s cells. The complex essentially transcribes the ribonucleic acid, or RNA, that contains the genetic code for the virus, packages the RNA and pumps out copies of the virus to overwhelm the host cells.
“It’s a system of roughly 2 million atoms, and there’s no single way to get a really good look inside,” Ramanathan said. “A number of scientists have done tremendous work to understand some of these individual parts, but nobody had looked at this complex from a broader view to try to understand how they all work together.”
The team used data from cryo-electron microscopy, a technique that flash-freezes molecules and pounds them with electrons to create 3D images, to take a closer look at the molecular machinery. But static images alone wouldn’t be enough to capture the workings of the copying process.
“It’s critical to be able to see the interactions between the nonstructural proteins (which are mostly enzymes) as they process the viral RNA one base at a time,” Ramanathan said. “The molecules move in a complicated pattern, like a waltz. We needed to see that waltz and the machinery in motion in order to understand how to gum up this Swiss watch.”
To simulate the process, the team used a hierarchical artificial intelligence (AI) framework running on Balsam, a distributed workflow engine, across four of the nation’s top supercomputing systems: Summit, Oak Ridge Leadership Computing Facility’s (OLCF’s) 200-petaflop flagship computer; Theta, Argonne Leadership Computing Facility’s (ALCF’s) 15.6-petaflop system; Perlmutter, the National Energy Research Scientific Computing Center’s (NERSC’s) 64.6-petaflop system; and Longhorn, a subsystem of the Texas Advanced Computing Center’s 23.5-petaflop Frontera system.
With access to the ALCF AI Testbed, the team used a Cerebras wafer-scale engine to train deep learning models that were coupled with the various supercomputing systems. The workflow builds on a strategy employed by Ramanathan and Rommie Amaro, a professor and endowed chair of chemistry and biochemistry at the University of California San Diego, to simulate the behavior of the virus’s spike protein, a study that won last year’s Gordon Bell Special Prize for COVID-19 research.
“By coordinating this work across these sites, we could use all the strengths of the best state-of-the-art computing to perform these simulations,” Ramanathan said. “Everything had to come together in just the right place in just the right way, like an assembly line. These simulations helped fill in the blanks the cryo-electron microscopy couldn’t capture and reconstruct the motion that we couldn’t otherwise understand to reach a biophysically meaningful interpretation.”
The team will share the results of their study at the SC21 conference.
In a second study also nominated for a Gordon Bell Award, Ramanathan teamed up with other researchers to study an aerosolized virus particle. The Argonne team contributed to the development and integration of a novel AI technique called anharmonic conformational analysis-enabled autoencoders (ANCA-AE). This technique analyzed the large volumes of data generated from the collective simulations of the virion and the Delta-spike systems.
This research was supported by the Exascale Computing Project, a collaborative effort of the DOE Office of Science and the National Nuclear Security Administration; the COVID-19 HPC Consortium; and the DOE National Virtual Biotechnology Laboratory, with funding provided by the Coronavirus CARES Act, and the DOE Office of Science’s Advanced Scientific Computing Research program. Support is organized under the Co-Design for Artificial Intelligence and Computing at Scale for Extremely Large, Complex Datasets projects.
The OLCF, ALCF and NERSC are DOE Office of Science user facilities.
Related Publication: Anda Trifan, Defne Gorgun, Zongyi Li, Alexander Brace, Maxim Zvyagin, Heng Ma, Austin Clyde, David Clark, Michael Salim, David J. Hardy, Tom Burnley, Lei Huang, John McCalpin, Murali Emani, Hyenseung Yoo, Jungyi Yin, Aristeidis Tsaris, Vishal Subbiah, Tanveer Raza, Jessica Liu, Noah Trebesch, Geoffrey Wells, Venkatesh Mysore, Thomas Gibbs, James Phillips, S. Chakra Chennubhotla, Ian Foster, Rick Stevens, Anima Anandkumar, Venkatram Vishwanath, John E. Stone, Emad Tajkhorshid, Sarah A. Harris and Arvind Ramanathan. “Intelligent Resolution: Integrating Cryo-EM with AI-driven Multi-resolution Simulations to Observe the SARS-CoV-2 Replication-Transcription Machinery in Action.” To appear in International Journal of High Performance Computing Applications, 2021.
UT-Battelle LLC manages Oak Ridge National Laboratory for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.
Jared Sagoff contributed to this story.
The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines. Supported by the U.S. Department of Energy’s (DOE’s) Office of Science, Advanced Scientific Computing Research (ASCR) program, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science.
Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.