Abstract
Ensemble weather forecasts based on multiple runs of numerical weather prediction models typically show systematic errors and require postprocessing to obtain reliable forecasts. Accurately modeling multivariate dependencies is crucial in many practical applications, and various approaches to multivariate postprocessing have been proposed in which ensemble predictions are first postprocessed separately in each margin and multivariate dependencies are then restored via copulas. These two-step methods share key limitations, in particular the difficulty of including additional predictors when modeling the dependencies. We propose a novel multivariate postprocessing method based on generative machine learning to address these challenges. In this new class of nonparametric data-driven distributional regression models, samples from the multivariate forecast distribution are obtained directly as the output of a generative neural network. The generative model is trained by optimizing a proper scoring rule, which measures the discrepancy between the generated and observed data, conditional on exogenous input variables. Our method does not require parametric assumptions on univariate distributions or multivariate dependencies and allows for incorporating arbitrary predictors. In two case studies on multivariate temperature and wind speed forecasting at weather stations over Germany, our generative model shows significant improvements over state-of-the-art methods and particularly improves the representation of spatial dependencies.
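To make the idea of scoring generated samples against an observation concrete, the following is a minimal sketch of a sample-based estimator of the energy score, one widely used multivariate proper scoring rule. The abstract does not name the scoring rule, so the choice of the energy score is an assumption made for illustration, as are the function name `energy_score` and the array shapes.

```python
import numpy as np


def energy_score(samples, obs):
    """Sample-based energy score estimate (lower is better).

    Illustrative assumption, not taken from the abstract:
    samples has shape (m, d): m generated draws of a d-dimensional forecast
    obs has shape (d,): the observed vector
    """
    samples = np.asarray(samples, dtype=float)
    obs = np.asarray(obs, dtype=float)

    # Mean Euclidean distance between each generated sample and the observation.
    term_obs = np.linalg.norm(samples - obs, axis=1).mean()

    # Mean Euclidean distance over all pairs of generated samples
    # (pairs with i == j contribute zero).
    pairwise = samples[:, None, :] - samples[None, :, :]
    term_spread = np.linalg.norm(pairwise, axis=2).mean()

    return term_obs - 0.5 * term_spread


# Hypothetical example: 50 draws of a forecast at 3 stations.
rng = np.random.default_rng(0)
draws = rng.normal(size=(50, 3))
observation = np.zeros(3)
score = energy_score(draws, observation)
```

In a training setting, a differentiable version of such a score would be averaged over training cases and minimized with respect to the weights of the generative network; this NumPy version only evaluates the score.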
Funding Statement
The research leading to these results was carried out within the Young Investigator Group “Artificial Intelligence for Probabilistic Weather Forecasting” and was funded by the Vector Stiftung. In addition, this project received funding from the KIT Center for Mathematics in Sciences, Engineering and Economics under the seed funding programme.
Acknowledgments
We thank Nina Horat, Benedikt Schulz, and Tilmann Gneiting for helpful discussions; Sam Allen for providing code for the weighted multivariate scoring rules; and three anonymous reviewers for insightful and constructive comments.
Citation
Jieyu Chen, Tim Janke, Florian Steinke, Sebastian Lerch. "Generative machine learning methods for multivariate ensemble postprocessing." Ann. Appl. Stat. 18 (1) 159–183, March 2024. https://doi.org/10.1214/23-AOAS1784
Information