Building a scalable deep neural network code called the CANcer Distributed Learning Environment (CANDLE)
With the launch of the “Cancer Moonshot” initiative in 2015, all U.S. government agencies were asked to propose ways to bring their resources and capabilities to bear on advancing cancer research. As part of this initiative, the Department of Energy (DOE) entered into a partnership with the National Cancer Institute (NCI) of the National Institutes of Health (NIH). DOE national laboratories are drawing on their strengths in high-performance computing (HPC), machine learning, and data analytics, and coupling them with the domain strengths of the NCI in areas such as cancer biology and cancer healthcare delivery, to bring the full promise of exascale computing to the problem of cancer and precision medicine. This partnership identified three key challenges on which the combined resources of DOE and NCI can accelerate progress.
The “drug response problem” (Pilot1) aims to develop predictive models for drug response that can be used to optimize pre-clinical drug screening and drive precision medicine-based treatments for cancer patients. The “RAS pathway problem” (Pilot2) aims to understand the molecular basis of key protein interactions in the RAS/RAF pathway that is present in 30% of cancers. The “treatment strategy problem” (Pilot3) aims to automate the analysis and extraction of information from millions of cancer patient records to determine optimal cancer treatment strategies across a range of patient lifestyles, environmental exposures, cancer types, and healthcare systems.
Through the Exascale Computing Project (ECP), the Exascale Deep Learning and Simulation Enabled Precision Medicine for Cancer application development project focuses on the machine learning aspect of the three problems and, in particular, builds a single scalable deep neural network infrastructure called CANDLE (CANcer Distributed Learning Environment) that can be used to address all three challenges.
CANDLE addresses three top challenges of the NCI: developing predictive models for drug response, understanding the molecular basis of key protein interactions, and automating the analysis and extraction of information from millions of cancer patient records to determine optimal cancer treatment strategies.