Forecasting the flu | Argonne National Laboratory

March 15, 2021

Researchers from the University of Virginia, Google Inc., and Argonne National Laboratory have teamed up to tackle a long-standing problem in epidemiology: how to predict the spread of infectious diseases.

Fig. 1: Performance comparison of AMM with other mobility models and a baseline model with no mobility. AMM, commute, and radiation models perform similarly and do better than the gravity model and no-mobility baseline.

Their focus is on influenza, which has been a major cause of hospitalizations and deaths for many years, even with the widespread availability of flu vaccines.

Human mobility is well recognized as a key factor in the spread of infectious diseases such as the flu. But high-resolution mobility data is often in limited supply, and what is available is based on records that fail to capture movement across borders or even continents.

To better understand the effects of such movement, the team created an anonymized mobility map, or AMM, using data collected from hundreds of millions of smartphones whose users had turned on the “location history” option. The researchers applied machine learning to the anonymized log data, automatically dividing it into semantic trips. The trips were also aggregated using a mechanism known as differential privacy, which injects mathematical noise into the data to maintain user anonymity while ensuring that the data remains accurate.
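
As a rough illustration of how noise can be injected into aggregated trip counts, the sketch below applies the Laplace mechanism, one standard way of achieving differential privacy. The article does not say which mechanism the team used, so this is an assumption; the function names, the epsilon value, and the trip counts are all hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Sample zero-mean Laplace noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatize_trip_counts(trip_counts, epsilon, sensitivity=1.0, seed=None):
    """Return a noisy copy of aggregated origin-destination trip counts.

    Assumption: one user changes any single count by at most `sensitivity`.
    A smaller epsilon means more noise and stronger privacy.
    """
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return {pair: count + laplace_noise(scale, rng)
            for pair, count in trip_counts.items()}

# Hypothetical weekly trip counts between two regions (not real data).
trips = {("A", "B"): 1200, ("B", "A"): 950}
noisy_trips = privatize_trip_counts(trips, epsilon=0.5, seed=42)
```

Because the noise has zero mean and a scale that does not grow with the count, large aggregates stay close to their true values while individual contributions are masked.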

The new study, which was published February 9, 2021, in Nature Communications, reported on data from five New York City boroughs and from eight states and territories in Australia. These areas were chosen because of the availability of so-called ground truth data – information collected on site. In this case, the ground truth data was based on emergency department visits for NYC and lab-tested flu positive counts for Australia.

The AMM was factored into a “metapopulation model,” called PatchSim, which traced the movement of individuals from their “home patches” to “away patches” and documented exposures, infections, and recoveries in the away patches. The simulated epidemic curves were compared with the ground truth, and the model was then used to retroactively forecast flu.
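
The idea of home and away patches can be sketched with a small deterministic SEIR (susceptible, exposed, infectious, recovered) model. This is not PatchSim's code, only a minimal illustration under stated assumptions: `theta[i][j]` is a hypothetical matrix giving the fraction of time residents of patch i spend in patch j (each row sums to 1), and all rates and populations are made up.

```python
def simulate_seir_patches(pop, theta, seed_infected, beta, sigma, gamma, days):
    """Discrete-time SEIR over patches coupled by a mobility matrix.

    Returns (history, final_state), where history[t][i] is the number of
    residents of patch i who became infectious on day t.
    """
    n = len(pop)
    S = [pop[i] - seed_infected[i] for i in range(n)]
    E = [0.0] * n
    I = [float(x) for x in seed_infected]
    R = [0.0] * n
    history = []
    for _ in range(days):
        # People and infectious people present in each patch after travel.
        n_eff = [sum(theta[i][j] * pop[i] for i in range(n)) for j in range(n)]
        i_eff = [sum(theta[i][j] * I[i] for i in range(n)) for j in range(n)]
        foi = [beta * i_eff[j] / n_eff[j] if n_eff[j] > 0 else 0.0
               for j in range(n)]
        new_infectious = []
        for i in range(n):
            # Exposure risk for residents of i, summed over patches they visit.
            force = sum(theta[i][j] * foi[j] for j in range(n))
            new_E = min(S[i], S[i] * force)
            new_I = sigma * E[i]
            new_R = gamma * I[i]
            S[i] -= new_E
            E[i] += new_E - new_I
            I[i] += new_I - new_R
            R[i] += new_R
            new_infectious.append(new_I)
        history.append(new_infectious)
    return history, (S, E, I, R)

# Two hypothetical patches; the outbreak is seeded only in patch 0.
pop = [1000.0, 1000.0]
seed = [10.0, 0.0]
no_mobility = [[1.0, 0.0], [0.0, 1.0]]
with_mobility = [[0.9, 0.1], [0.1, 0.9]]
curve_isolated, _ = simulate_seir_patches(pop, no_mobility, seed, 0.5, 0.25, 0.2, 60)
curve_mixed, state = simulate_seir_patches(pop, with_mobility, seed, 0.5, 0.25, 0.2, 60)
```

With the identity matrix, patch 1 never sees a case; with even 10 percent of time spent away, the epidemic reaches it. That contrast is why the choice of mobility network matters for the simulated epidemic curves.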

In one experiment, for example, the researchers performed a comparative study in forecasting influenza activity during the 2016–2017 flu season in the five boroughs of NYC. Three other mobility networks – commuter surveys, a gravity model, and a radiation model – as well as a model with no mobility were incorporated into the PatchSim metapopulation framework. The results, shown in Figure 1, indicate that AMM and the commute and radiation models performed similarly and did better than the no-mobility baseline model as well as the gravity model.
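
For readers unfamiliar with the two synthetic mobility models named above, here is a minimal sketch of their textbook forms. The exponents, the constant `k`, the populations, distances, and outflow totals are hypothetical, and the study's own parameterization may differ.

```python
def gravity_flows(pop, dist, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Gravity model: flow grows with both populations, decays with distance."""
    n = len(pop)
    return [[0.0 if i == j
             else k * pop[i] ** alpha * pop[j] ** beta / dist[i][j] ** gamma
             for j in range(n)] for i in range(n)]

def radiation_flows(pop, dist, outflow):
    """Radiation model: flow from i to j depends on the population s living
    closer to i than j does (excluding i and j), not on distance directly."""
    n = len(pop)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            s = sum(pop[m] for m in range(n)
                    if m not in (i, j) and dist[i][m] <= dist[i][j])
            flows[i][j] = (outflow[i] * pop[i] * pop[j]
                           / ((pop[i] + s) * (pop[i] + pop[j] + s)))
    return flows

# Three hypothetical patches on a line, one distance unit apart.
pop = [100, 200, 300]
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
gravity = gravity_flows(pop, dist)
radiation = radiation_flows(pop, dist, outflow=[60, 60, 60])
```

The gravity model needs fitted exponents, whereas the radiation model has no free parameters beyond the total outflow from each patch.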

The researchers next tested their approach in Australia. This area was selected because it is radically different from New York, with a sparser population spread across a wider area, different flu dynamics, and different weather conditions. The consistent performance results in these two very different regions validated their approach and demonstrated the generality and potential global scope of AMM.

The research team also performed a “leave-one-out” cross-validation study, whereby ground truth data from one of the five New York boroughs was omitted each time PatchSim was calibrated. The calibrated model then was used to predict flu outbreaks in the omitted borough. The results indicated that the model can be used to forecast even in regions where case data is lacking.
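
The leave-one-out procedure can be sketched as follows. This is a toy illustration, not the study's calibration method: the grid search, the squared-error loss, the `simulate` callback, and the linear toy model are all assumptions made for the example.

```python
def leave_one_out(observed, simulate, candidates):
    """For each region, calibrate on all other regions, then predict it.

    observed:   dict mapping region name -> list of observed counts
    simulate:   function (region, param) -> list of simulated counts
    candidates: parameter values to try during calibration
    """
    results = {}
    for held_out in observed:
        training = [r for r in observed if r != held_out]

        def loss(param):
            # Squared error against ground truth in the training regions only.
            return sum((s - o) ** 2
                       for r in training
                       for s, o in zip(simulate(r, param), observed[r]))

        best = min(candidates, key=loss)
        results[held_out] = (best, simulate(held_out, best))
    return results

# Toy stand-in for an epidemic model: cases grow linearly at a shared rate.
base = {"Bronx": 10, "Brooklyn": 20, "Queens": 15}

def toy_simulate(region, rate):
    return [base[region] * rate * week for week in range(5)]

# Synthetic "ground truth" generated with rate 0.3 (not real data).
observed = {region: toy_simulate(region, 0.3) for region in base}
predictions = leave_one_out(observed, toy_simulate, [0.1, 0.2, 0.3, 0.4])
```

The held-out region's data never enters the loss, so a good prediction there suggests that the calibrated parameters transfer to places without case data.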

Arguably, the study had several limitations. For example, it did not consider the amount of time an individual spent at a location, which could significantly affect the likelihood of transmitting the flu. Nevertheless, the researchers are excited about the potential of AMM, which they believe can help in predictions for areas that cannot invest in timely data surveys. Moreover, its global scope makes AMM a candidate for pandemic preparedness studies as well as rapid risk analysis during an unfolding outbreak.

“Commuter data is not globally available,” said Arindam Fadikar, a postdoctoral appointee in Argonne’s Mathematics and Computer Science Division. “In contrast, the machine-learned mobility data we used is globally available from Google, enabling researchers to predict the spread of infection anywhere in the world. It is one of the very reasons for using such data in infectious disease forecasting,” Fadikar said.

For the full paper, see S. Venkatramanan, A. Sadilek, A. Fadikar, et al., “Forecasting influenza activity using machine-learned mobility map,” Nature Communications 12, 726 (2021).