Argonne researcher Julie Bessac and her French colleague Philippe Naveau agree – and they have undertaken a study that may ease the difficulty of evaluating weather predictions. Their paper, titled “Forecast evaluation with imperfect observations and imperfect models,” focuses on new quality assessment metrics, known as scoring rules, that account for errors in both observations and forecasts.
Classical scoring schemes typically involve comparing different forecasts with observations. But such observations almost always have errors – due, for example, to data-recording problems or instrument deficiencies. Indeed, a recent study showed that the classical logarithm score used by such schemes is misleading in selecting the best forecast when observation errors are present, and that the probabilistic distribution of the verification data should depend on modeling underlying physical processes that are not observed.
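To make the problem concrete, the following is a minimal sketch of how a mean logarithm score can rank forecasts differently depending on whether it is computed against the hidden truth or against noisy observations. It is an illustration only, not the setup of the study cited above: the additive Gaussian error, the two forecasts A and B, and every function and parameter name are assumptions chosen for simplicity.

```python
import math
import random


def log_score(y, mu, sigma):
    """Negative log density of a N(mu, sigma^2) forecast at y (lower is better)."""
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2 * sigma ** 2)


def mean_scores(n=50_000, obs_error_sd=1.0, seed=0):
    """Average log scores of two forecasts against the truth and against noisy data.

    Assumed toy setup: the hidden truth X is N(0, 1) and the observation is
    Y = X + Gaussian error. Forecast A is the true distribution of X;
    forecast B is wider and happens to match the distribution of Y.
    """
    rng = random.Random(seed)
    sd_a = 1.0
    sd_b = math.sqrt(1.0 + obs_error_sd ** 2)
    totals = {"A_vs_truth": 0.0, "B_vs_truth": 0.0, "A_vs_obs": 0.0, "B_vs_obs": 0.0}
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)               # hidden true state
        y = x + rng.gauss(0.0, obs_error_sd)  # imperfect observation
        totals["A_vs_truth"] += log_score(x, 0.0, sd_a)
        totals["B_vs_truth"] += log_score(x, 0.0, sd_b)
        totals["A_vs_obs"] += log_score(y, 0.0, sd_a)
        totals["B_vs_obs"] += log_score(y, 0.0, sd_b)
    return {name: total / n for name, total in totals.items()}


if __name__ == "__main__":
    for name, value in mean_scores().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting, forecast A scores better against the truth, while forecast B scores better against the noisy observations, so a ranking based only on the observed data would pick the wrong forecast.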
Building on the results from that study, Bessac and Naveau have proposed a new scoring model that couples forecast and observation distributions to correct a score when errors are present in both the verification data and the forecast. They have also highlighted the need to investigate statistics beyond the mean score that is commonly used in practice.
The team formulated and compared their new approach using two popular models. The first model helps in understanding the role and impact of observational errors with respect to the unobserved true state of the atmosphere X, but it does not incorporate the idea of forecast error. In the second model, both observations Y and forecasts Z are modeled as error-affected versions of the state of the atmosphere X, which is again not observed.
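The second setup can be sketched in a few lines of simulation. This is a hedged illustration rather than the paper's specification: it assumes additive, independent Gaussian errors, and the function name and the `obs_sd` and `fcst_sd` parameters are hypothetical. Setting `fcst_sd` to zero would roughly correspond to the first setup, in which forecast error is ignored.

```python
import random
import statistics


def simulate_hidden_state(n=10_000, obs_sd=0.5, fcst_sd=0.8, seed=1):
    """Draw a hidden state X plus error-affected observations Y and forecasts Z.

    Assumption for illustration: X is N(0, 1), and both errors are additive,
    Gaussian, and independent of X and of each other.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]     # true state, never observed
    y = [xi + rng.gauss(0.0, obs_sd) for xi in x]   # observations with error
    z = [xi + rng.gauss(0.0, fcst_sd) for xi in x]  # forecasts with error
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate_hidden_state()
    # Both Y and Z are more variable than the state they are meant to describe.
    print("var(X):", round(statistics.pvariance(x), 3))
    print("var(Y):", round(statistics.pvariance(y), 3))
    print("var(Z):", round(statistics.pvariance(z), 3))
```

The point of the sketch is that comparing Z with Y compares two imperfect quantities, neither of which is the state X itself.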
“Distinguishing between the unobserved truth (hidden processes) and the observed (but incorrect) verification data is fundamental to understanding the impact of imperfect observations on forecast modeling,” said Bessac, an assistant computational statistician in the Mathematics and Computer Science Division at Argonne.
The new model offers several advantages: (1) it proposes a simple framework to account for errors in the verification data and in the forecast; (2) it highlights the importance of exploring the distribution of scores instead of focusing on the mean only; and (3) it shows the importance of accounting for error in the verification data, which can otherwise be misleading.
The model was tested on two cases where the parameters of the involved distributions are assumed to be known. While these were idealized cases, the researchers emphasized that the test results highlight the importance of investigating the distribution of scores when the verification data is considered to be a random variable.
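A short simulation shows why the distribution of scores matters once the verification data is treated as random. The sketch below is not one of the paper's test cases: it assumes a N(0, 1) forecast scored with the logarithm score against observations that carry additive Gaussian error, with known parameters, and all names are hypothetical.

```python
import math
import random
import statistics


def score_distribution(n=20_000, obs_error_sd=1.0, seed=2):
    """Summarize the log scores of a N(0, 1) forecast against noisy observations.

    Assumed toy setup: hidden truth X is N(0, 1); verification data is
    Y = X + Gaussian error with known standard deviation.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, obs_error_sd)
        # Negative log density of the N(0, 1) forecast at the observed value.
        scores.append(0.5 * math.log(2 * math.pi) + y * y / 2)
    cuts = statistics.quantiles(scores, n=20)  # 19 cut points at 5% steps
    return {
        "mean": statistics.fmean(scores),
        "q05": cuts[0],
        "median": cuts[9],
        "q95": cuts[18],
    }


if __name__ == "__main__":
    for name, value in score_distribution().items():
        print(f"{name}: {value:.3f}")
```

In this setting the scores are strongly right-skewed, so the mean sits well above the median, and a single average hides much of how the score behaves from one observation to the next.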
For further details, see the paper “Forecast evaluation with imperfect observations and imperfect models,” by Philippe Naveau and Julie Bessac, at https://arxiv.org/pdf/1806.03745.pdf.