Open Access
On-line predictive linear regression
Vladimir Vovk, Ilia Nouretdinov, Alex Gammerman
Ann. Statist. 37(3): 1566-1590 (June 2009). DOI: 10.1214/08-AOS622


We consider the on-line predictive version of the standard problem of linear regression; the goal is to predict each consecutive response given the corresponding explanatory variables and all the previous observations. The standard treatment of prediction in linear regression analysis has two drawbacks: (1) the classical prediction intervals guarantee that the probability of error is equal to the nominal significance level ɛ, but this property per se does not imply that the long-run frequency of error is close to ɛ; (2) it is not suitable for prediction of complex systems as it assumes that the number of observations exceeds the number of parameters. We state a general result showing that in the on-line protocol the frequency of error for the classical prediction intervals does equal the nominal significance level, up to statistical fluctuations. We also describe alternative regression models in which informative prediction intervals can be found before the number of observations exceeds the number of parameters. One of these models, which only assumes that the observations are independent and identically distributed, is popular in machine learning but greatly underused in the statistical theory of regression.
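
The on-line protocol and the notion of long-run error frequency can be illustrated with a minimal sketch. It is not the authors' method: it assumes only that the responses are independent and identically distributed with a continuous distribution, ignores explanatory variables altogether, and uses a simple order-statistic interval in place of the regression-based intervals studied in the paper. The names `order_statistic_interval` and `run_online`, the Gaussian data source, and the seed are hypothetical choices made for illustration.

```python
import random


def order_statistic_interval(past, eps):
    """Prediction interval [y_(k), y_(n-k)] built from the sorted past responses.

    With n - 1 past values and one new value, exchangeability makes the new
    value equally likely to fall into each of the n gaps between the sorted
    past values, so cutting k gaps from each tail gives error probability
    2k/n <= eps (assumption: continuous distribution, no ties).
    """
    n = len(past) + 1
    k = int(eps * n / 2)
    if k < 1:
        # Too few observations: only the whole real line is valid.
        return float("-inf"), float("inf")
    s = sorted(past)
    return s[k - 1], s[n - k - 1]


def run_online(num_steps, eps, seed=0):
    """Predict each response from all previous ones; return the error frequency."""
    rng = random.Random(seed)
    past, errors = [], 0
    for _ in range(num_steps):
        lo, hi = order_statistic_interval(past, eps)  # predict first
        y = rng.gauss(0.0, 1.0)                       # then observe (illustrative data)
        if not (lo <= y <= hi):
            errors += 1
        past.append(y)
    return errors / num_steps


if __name__ == "__main__":
    print(run_online(num_steps=2000, eps=0.1))
```

In this sketch the interval is the whole real line until roughly 2/ɛ observations have accumulated, after which the observed error frequency should settle slightly below ɛ, up to statistical fluctuations.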


Vladimir Vovk, Ilia Nouretdinov, Alex Gammerman. "On-line predictive linear regression." Ann. Statist. 37(3): 1566-1590, June 2009.


Published: June 2009
First available in Project Euclid: 10 April 2009

zbMATH: 1160.62065
MathSciNet: MR2509084
Digital Object Identifier: 10.1214/08-AOS622

Primary: 62G08, 62J05
Secondary: 60G25, 68Q32

Keywords: Gauss linear model, independent identically distributed observations, multivariate analysis, on-line protocol, prequential statistics

Rights: Copyright © 2009 Institute of Mathematical Statistics
