Electronic Journal of Statistics

Estimating the error distribution in semiparametric transformation models

Cédric Heuchenne, Rawane Samb, and Ingrid Van Keilegom

Full-text: Open access


In this paper we consider the semiparametric transformation model $\Lambda_{\theta_{o}}(Y)=m(X)+\varepsilon$, where $\theta_{o}$ is an unknown finite-dimensional parameter, the function $m(\cdot)=\mathbb{E}(\Lambda_{\theta_{o}}(Y)|X=\cdot)$ is “smooth” but otherwise unknown, and the covariate $X$ is independent of the error $\varepsilon$. An estimator of the distribution function of $\varepsilon$ is investigated and its weak convergence is proved. The proposed estimator depends on a profile likelihood estimator of $\theta_{o}$ and a nonparametric kernel estimator of $m$. We also evaluate the practical performance of our estimator in a simulation study for several models and sample sizes. Finally, the method is applied to a data set on the scattering of sunlight in the atmosphere.
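
As a rough illustration of the kind of estimator the abstract describes, the following minimal sketch computes residuals $\hat{\varepsilon}_i=\Lambda_{\theta}(Y_i)-\hat{m}(X_i)$ and their empirical distribution function. It is not the authors' implementation and rests on several assumptions that do not come from the abstract: the transformation family is taken to be Box-Cox, $\hat{m}$ is a Nadaraya-Watson estimator with a Gaussian kernel and a fixed bandwidth `h`, and $\theta$ is treated as already known, so the profile likelihood step is skipped. All function names and the simulated data are hypothetical.

```python
import math
import random


def box_cox(y, theta):
    """Box-Cox transform, used here as an assumed example of Lambda_theta."""
    if theta == 0:
        return math.log(y)
    return (y ** theta - 1.0) / theta


def nadaraya_watson(x0, xs, zs, h):
    """Kernel regression estimate of m(x0): Gaussian kernel, bandwidth h."""
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(w * z for w, z in zip(weights, zs)) / sum(weights)


def residual_ecdf(xs, ys, theta, h):
    """Return the residuals and their empirical distribution function F_hat."""
    zs = [box_cox(y, theta) for y in ys]  # transformed responses
    residuals = [z - nadaraya_watson(x, xs, zs, h) for x, z in zip(xs, zs)]
    n = len(residuals)

    def f_hat(t):
        # Proportion of residuals not exceeding t
        return sum(1 for e in residuals if e <= t) / n

    return residuals, f_hat


# Simulated data (hypothetical): Lambda_theta(Y) = m(X) + eps with m(x) = 2 + x
rng = random.Random(0)
theta = 0.5  # treated as known here; the paper estimates it by profile likelihood
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
eps = [rng.gauss(0.0, 0.1) for _ in range(200)]
# Invert the Box-Cox transform to generate Y from m(X) + eps
ys = [(theta * (2.0 + x + e) + 1.0) ** (1.0 / theta) for x, e in zip(xs, eps)]

residuals, f_hat = residual_ecdf(xs, ys, theta, h=0.1)
```

Calling `f_hat(t)` on a grid of values of `t` gives a step-function estimate of the error distribution. In practice $\theta$ would be replaced by an estimate and the bandwidth would be chosen from the data, and neither choice is addressed by this sketch.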

Article information

Electron. J. Statist., Volume 9, Number 2 (2015), 2391-2419.

Received: October 2014
First available in Project Euclid: 29 October 2015

Primary: 62G08: Nonparametric regression
Secondary: 62E20: Asymptotic distribution theory

Keywords: Empirical distribution function; kernel smoothing; nonparametric regression; profile likelihood estimator; semiparametric regression; transformation model


Heuchenne, Cédric; Samb, Rawane; Van Keilegom, Ingrid. Estimating the error distribution in semiparametric transformation models. Electron. J. Statist. 9 (2015), no. 2, 2391–2419. doi:10.1214/15-EJS1057. https://projecteuclid.org/euclid.ejs/1446124646


