The Annals of Statistics

Validation of linear regression models

Holger Dette and Axel Munk



A new test is proposed to verify that a regression function, say $g$, has a prescribed (linear) parametric form. The procedure is based on the large-sample behavior of an empirical $L^2$-distance between $g$ and the subspace $U$ spanned by the regression functions to be verified. The asymptotic distribution of the test statistic is shown to be normal, with parameters depending only on the variance of the observations and the $L^2$-distance between the regression function $g$ and the model space $U$. Based on this result, a test is proposed for the hypothesis that "$g$ is not in a preassigned $L^2$-neighborhood of $U$," which allows the "verification" of the model $U$ at a controlled type I error rate. The suggested procedure is very easy to apply because of its asymptotic normal law and the simple form of the test statistic. In particular, it does not require nonparametric estimators of the regression function, and hence the test does not depend on the subjective choice of smoothing parameters.
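To make the idea of an empirical $L^2$-distance concrete, the following is a minimal sketch and not the statistic defined in the paper. It assumes, purely for illustration, that the squared distance between $g$ and $U$ can be estimated as the mean squared least-squares residual (which targets the error variance plus the squared distance) minus a first-order difference-based variance estimate (which targets the error variance alone and needs no smoothing parameter). The function names, the simulated data, and the choice of difference estimator are all hypothetical.

```python
import numpy as np

def l2_distance_estimate(x, y, design):
    """Plug-in estimate of the squared L2-distance between g and the model space U.

    Assumption (not taken from the abstract): the estimate is
    mean squared least-squares residual minus a difference-based variance estimate.
    """
    order = np.argsort(x)
    x, y = x[order], y[order]
    X = design(x)                                  # n x d matrix whose columns span U
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares projection onto U
    resid = y - X @ beta
    mse_model = np.mean(resid ** 2)                # targets sigma^2 + squared distance
    # First-order difference estimator of sigma^2; no bandwidth is required.
    sigma2_diff = np.sum(np.diff(y) ** 2) / (2 * (len(y) - 1))
    return mse_model - sigma2_diff

def linear_design(x):
    """Hypothetical model space U = span{1, x}."""
    return np.column_stack([np.ones_like(x), x])

# Simulated example on a uniform grid (illustrative data only).
rng = np.random.default_rng(0)
n = 2000
x = np.linspace(0.0, 1.0, n)
noise = rng.normal(scale=0.2, size=n)
y_in_model = 1.0 + 2.0 * x + noise                  # g lies in U
y_off_model = 1.0 + 2.0 * x + 3.0 * x**2 + noise    # g lies outside U

est_in = l2_distance_estimate(x, y_in_model, linear_design)
est_off = l2_distance_estimate(x, y_off_model, linear_design)
print(est_in, est_off)
```

In this simulation the first estimate should be close to zero, while the second should be close to $9/180 = 0.05$, the squared $L^2$-distance of $3x^2$ from the span of $\{1, x\}$ under the uniform design on $[0, 1]$. Turning such an estimate into the test described above would additionally require the asymptotic variance derived in the paper, which is not reproduced here.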

Article information

Ann. Statist., Volume 26, Number 2 (1998), 778-800.

First available in Project Euclid: 31 July 2002


Primary: 62G05: Estimation
Secondary:
  62G10: Hypothesis testing
  62G30: Order statistics; empirical distribution functions
  62G07: Density estimation

Keywords: Nonparametric model check; validation of goodness of fit; $L^2$-distance; equivalence of regression functions


Dette, Holger; Munk, Axel. Validation of linear regression models. Ann. Statist. 26 (1998), no. 2, 778-800. doi:10.1214/aos/1028144860.


