The Annals of Statistics

Adaptive estimation in autoregression or $\beta$-mixing regression via model selection

Y. Baraud, F. Comte, and G. Viennet



We study the problem of estimating an unknown regression function in a $\beta$-mixing dependent framework. To this end, we consider a collection of models, each of which is a finite-dimensional space. A penalized least-squares estimator (PLSE) is built on a model selected from this collection in a data-driven way. We state nonasymptotic risk bounds for this PLSE and give several examples where the procedure can be applied (autoregression, regression with arithmetically $\beta$-mixing design points, regression with mixing errors, estimation in additive frameworks, estimation of the order of the autoregression). In addition, we show that under a weak moment condition on the errors, our estimator is adaptive in the minimax sense simultaneously over some family of Besov balls.
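The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: it assumes piecewise-constant models on equal bins, a known noise variance `sigma2`, and a penalty of the form `c * sigma2 * dim / n`, where the constant `c` is chosen only for illustration. The names `fit_histogram`, `select_model`, and `simulate_ar1` are hypothetical.

```python
import math
import random


def fit_histogram(xs, ys, dim, lo, hi):
    """Least-squares fit on piecewise-constant functions with `dim` equal bins."""
    sums = [0.0] * dim
    counts = [0] * dim

    def bin_of(x):
        # Points outside [lo, hi) are clamped to the first or last bin.
        k = int((x - lo) / (hi - lo) * dim)
        return min(max(k, 0), dim - 1)

    for x, y in zip(xs, ys):
        k = bin_of(x)
        sums[k] += y
        counts[k] += 1
    # On each bin the least-squares solution is the mean of the responses;
    # empty bins are set to 0 (an arbitrary convention for this sketch).
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return lambda x: means[bin_of(x)]


def select_model(xs, ys, dims, sigma2, c=2.0, lo=0.0, hi=1.0):
    """Return (selected dimension, fitted function) minimizing
    empirical squared error + c * sigma2 * dim / n (assumed penalty shape)."""
    n = len(xs)
    best = None
    for dim in dims:
        f_hat = fit_histogram(xs, ys, dim, lo, hi)
        contrast = sum((y - f_hat(x)) ** 2 for x, y in zip(xs, ys)) / n
        crit = contrast + c * sigma2 * dim / n
        if best is None or crit < best[0]:
            best = (crit, dim, f_hat)
    return best[1], best[2]


def simulate_ar1(n, noise_sd=0.3, seed=0):
    """Simulate X_{t+1} = 0.9 * sin(X_t) + noise (an illustrative choice)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        x.append(0.9 * math.sin(x[-1]) + rng.gauss(0.0, noise_sd))
    return x


if __name__ == "__main__":
    x = simulate_ar1(500)
    # Autoregression: regress X_{t+1} on X_t.
    dim, f_hat = select_model(x[:-1], x[1:], [1, 2, 4, 8, 16, 32],
                              sigma2=0.3 ** 2, lo=-2.0, hi=2.0)
    print("selected dimension:", dim)
```

In the autoregressive usage, the design points are the lagged observations, so the same routine covers both the regression and the autoregression settings listed in the abstract. The paper's penalty also accounts for the dependence in the data, which this sketch ignores.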

Article information

Ann. Statist., Volume 29, Issue 3 (2001), 839-875.

First available in Project Euclid: 24 December 2001


Primary: 62G08: Nonparametric regression
Secondary: 62J02.

Nonparametric regression; least-squares estimator; model selection; adaptive estimation; autoregression order; additive framework; time series; mixing processes


Baraud, Y.; Comte, F.; Viennet, G. Adaptive estimation in autoregression or $\beta$-mixing regression via model selection. Ann. Statist. 29 (2001), no. 3, 839--875. doi:10.1214/aos/1009210692.


