The Annals of Statistics

Testing the order of a model

Antoine Chambaz


This paper deals with order identification for nested models in the i.i.d. framework. We study the asymptotic efficiency of two generalized likelihood ratio tests of the order. They are based on two estimators which are proved to be strongly consistent. A version of Stein’s lemma yields an optimal underestimation error exponent. The lemma also implies that the overestimation error exponent is necessarily trivial. Our tests admit nontrivial underestimation error exponents. The optimal underestimation error exponent is achieved in some situations. The overestimation error can decay exponentially with respect to a positive power of the number of observations.

These results are proved under mild assumptions by relating the underestimation (resp. overestimation) error to large (resp. moderate) deviations of the log-likelihood process. In particular, it is not necessary that the classical Cramér condition be satisfied; namely, the log-densities are not required to admit every exponential moment. Three benchmark examples with specific difficulties (location mixture of normal distributions, abrupt changes and various regressions) are detailed so as to illustrate the generality of our results.
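
To make the vocabulary concrete, here is a minimal sketch of order estimation in a toy nested family. It is not one of the paper's two estimators. The Gaussian mean model, the BIC-type penalty and the names `max_log_likelihood` and `estimate_order` are all assumptions chosen for illustration. Model k contains the distributions N(theta, I) whose mean has zero coordinates beyond the first k, and the estimated order maximizes the penalized maximized log-likelihood. Underestimation means the estimate falls below the true order, overestimation that it exceeds it.

```python
import math
import random


def max_log_likelihood(sample, k):
    """Maximized Gaussian log-likelihood (up to an additive constant) in model k.

    Model k (hypothetical toy family): i.i.d. N(theta, I) vectors whose mean
    satisfies theta_j = 0 for every 0-based index j >= k, so models are nested.
    """
    n = len(sample)
    dim = len(sample[0])
    total = 0.0
    for j in range(dim):
        column = [x[j] for x in sample]
        # A free coordinate is fitted by its sample mean; a constrained one stays 0.
        center = sum(column) / n if j < k else 0.0
        total -= 0.5 * sum((v - center) ** 2 for v in column)
    return total


def estimate_order(sample, penalty=None):
    """Return the order k maximizing the penalized maximized log-likelihood."""
    n = len(sample)
    dim = len(sample[0])
    if penalty is None:
        # BIC-type penalty: an assumption of this sketch, not taken from the paper.
        penalty = lambda n, k: 0.5 * k * math.log(n)
    scores = [max_log_likelihood(sample, k) - penalty(n, k) for k in range(dim + 1)]
    # Ties are resolved in favor of the smallest order.
    return max(range(dim + 1), key=scores.__getitem__)


if __name__ == "__main__":
    random.seed(0)
    theta = [1.0, -0.8, 0.0, 0.0, 0.0]  # true order is 2 in this toy setup
    sample = [[random.gauss(m, 1.0) for m in theta] for _ in range(500)]
    print("estimated order:", estimate_order(sample))
```

A heavier penalty makes overestimation rarer at the price of more frequent underestimation. The abstract's error exponents quantify how fast the two error probabilities vanish as the number of observations grows.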

Article information

Ann. Statist., Volume 34, Number 3 (2006), 1166-1203.

First available in Project Euclid: 10 July 2006

Primary: 60F10: Large deviations; 60G57: Random measures; 62C99: None of the above, but in this section; 62F03: Hypothesis testing; 62F05: Asymptotic properties of tests; 62F12: Asymptotic properties of estimators

Keywords: abrupt changes; empirical processes; error exponents; hypothesis testing; large deviations; mixtures; model selection; moderate deviations; order estimation


Chambaz, Antoine. Testing the order of a model. Ann. Statist. 34 (2006), no. 3, 1166-1203. doi:10.1214/009053606000000344.
