Open Access
March 2024
Generative machine learning methods for multivariate ensemble postprocessing
Jieyu Chen, Tim Janke, Florian Steinke, Sebastian Lerch
Ann. Appl. Stat. 18(1): 159-183 (March 2024). DOI: 10.1214/23-AOAS1784


Ensemble weather forecasts based on multiple runs of numerical weather prediction models typically show systematic errors and require postprocessing to obtain reliable forecasts. Accurately modeling multivariate dependencies is crucial in many practical applications, and various approaches to multivariate postprocessing have been proposed where ensemble predictions are first postprocessed separately in each margin and multivariate dependencies are then restored via copulas. These two-step methods share common key limitations, in particular, the difficulty of including additional predictors in modeling the dependencies. We propose a novel multivariate postprocessing method based on generative machine learning to address these challenges. In this new class of nonparametric data-driven distributional regression models, samples from the multivariate forecast distribution are directly obtained as output of a generative neural network. The generative model is trained by optimizing a proper scoring rule, which measures the discrepancy between the generated and observed data, conditional on exogenous input variables. Our method does not require parametric assumptions on univariate distributions or multivariate dependencies and allows for incorporating arbitrary predictors. In two case studies on multivariate temperature and wind speed forecasting at weather stations over Germany, our generative model shows significant improvements over state-of-the-art methods and particularly improves the representation of spatial dependencies.
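
To make the idea of scoring generated samples against an observation concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name the proper scoring rule, so the energy score is assumed here as one widely used multivariate choice. The function `toy_generator` is a hypothetical stand-in for a generative neural network, and all names and numbers are illustrative assumptions.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def energy_score(samples, obs):
    """Sample-based energy score estimate (lower is better).

    ES = mean_i ||x_i - y|| - 1 / (2 m^2) * sum_{i,j} ||x_i - x_j||
    where x_1..x_m are generated samples and y is the observation.
    """
    m = len(samples)
    term_obs = sum(euclidean(x, obs) for x in samples) / m
    term_spread = sum(
        euclidean(xi, xj) for xi in samples for xj in samples
    ) / (2 * m * m)
    return term_obs - term_spread


def toy_generator(predictors, n_samples, rng):
    """Hypothetical stand-in for a generative network (assumption).

    Maps predictors plus latent Gaussian noise to multivariate samples.
    """
    return [
        [p + rng.gauss(0.0, 1.0) for p in predictors]
        for _ in range(n_samples)
    ]


rng = random.Random(0)
obs = [1.0, 2.0, 3.0]  # illustrative observation at three stations

# Samples centred on the observation versus samples with a large bias.
good = toy_generator([1.0, 2.0, 3.0], 50, rng)
bad = toy_generator([5.0, 6.0, 7.0], 50, rng)

score_good = energy_score(good, obs)
score_bad = energy_score(bad, obs)
print(score_good < score_bad)
```

In an actual training loop, a score of this kind would be computed with an automatic differentiation framework so that its gradient with respect to the network weights is available; the pure-Python version above only illustrates the quantity being minimized.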

Funding Statement

The research leading to these results has been done within the Young Investigator Group “Artificial Intelligence for Probabilistic Weather Forecasting” and funded by the Vector Stiftung. In addition, this project has received funding from the KIT Center for Mathematics in Sciences, Engineering and Economics under the seed funding programme.


Acknowledgments

We thank Nina Horat, Benedikt Schulz and Tilmann Gneiting for helpful discussions, Sam Allen for providing code for the weighted multivariate scoring rules and three anonymous reviewers for insightful and constructive comments.


Citation

Jieyu Chen, Tim Janke, Florian Steinke, Sebastian Lerch. "Generative machine learning methods for multivariate ensemble postprocessing." Ann. Appl. Stat. 18(1): 159-183, March 2024.


Received: 1 September 2022; Revised: 1 May 2023; Published: March 2024
First available in Project Euclid: 31 January 2024

Digital Object Identifier: 10.1214/23-AOAS1784

Keywords: ensemble postprocessing, generative machine learning, multivariate postprocessing, probabilistic forecasting, weather forecasting

Rights: Copyright © 2024 Institute of Mathematical Statistics

Vol. 18 • No. 1 • March 2024