The Annals of Statistics

A sieve M-theorem for bundled parameters in semiparametric models, with application to the efficient estimation in a linear model for censored data

Ying Ding and Bin Nan


Abstract

In many semiparametric models that are parameterized by two types of parameters (a Euclidean parameter of interest and an infinite-dimensional nuisance parameter), the two parameters are bundled together; that is, the nuisance parameter is an unknown function that contains the parameter of interest as part of its argument. For example, in a linear regression model for censored survival data, the unspecified error distribution function involves the regression coefficients. Motivated by the need for an efficient method of estimating the regression parameters, we propose a general sieve M-theorem for bundled parameters and apply it to derive the asymptotic theory for sieve maximum likelihood estimation in the linear regression model for censored survival data. The proposed estimating method can be implemented numerically with conventional gradient-based search algorithms such as the Newton–Raphson algorithm. We show that the proposed estimator is consistent and asymptotically normal and achieves the semiparametric efficiency bound. Simulation studies demonstrate that the proposed method performs well in practical settings and yields more efficient estimates than existing estimating-equation-based methods. An illustration with a real data example is also provided.
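
As a rough illustration of what "bundled" means in the censored linear regression example, the standard right-censored log-likelihood for one observation can be written in hazard form. The notation below (T, C, X, Y, Δ, λ, Λ) is assumed here for illustration and is not taken from this page; it is a sketch of the general setting rather than the authors' exact formulation.

```latex
% Illustrative notation (assumed, not from the abstract):
%   T = (log) failure time, C = censoring time, X = covariates,
%   Y = \min(T, C), \Delta = 1(T \le C),
%   \lambda = hazard function of the error \varepsilon.
\begin{align*}
T &= X^\top \beta + \varepsilon,
  \qquad \varepsilon \sim F \ \text{(unspecified)}, \\
\ell(\beta, \lambda)
  &= \Delta \log \lambda\bigl(Y - X^\top \beta\bigr)
   - \Lambda\bigl(Y - X^\top \beta\bigr),
  \qquad \Lambda(t) = \int_{-\infty}^{t} \lambda(s)\, ds .
\end{align*}
```

In this sketch the unknown function λ is always evaluated at Y − Xᵀβ, so the nuisance parameter cannot be varied independently of β. That dependence is the sense in which the two parameters are bundled.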

Article information

Source
Ann. Statist., Volume 39, Number 6 (2011), 3032-3061.

Dates
First available in Project Euclid: 24 January 2012

Permanent link to this document
https://projecteuclid.org/euclid.aos/1327413777

Digital Object Identifier
doi:10.1214/11-AOS934

Mathematical Reviews number (MathSciNet)
MR3012400

Zentralblatt MATH identifier
1246.62103

Subjects
Primary: 62E20: Asymptotic distribution theory; 62N01: Censored data models
Secondary: 62D05: Sampling theory, sample surveys

Keywords
Accelerated failure time model; B-spline; bundled parameters; efficient score function; semiparametric efficiency; sieve maximum likelihood estimation

Citation

Ding, Ying; Nan, Bin. A sieve M-theorem for bundled parameters in semiparametric models, with application to the efficient estimation in a linear model for censored data. Ann. Statist. 39 (2011), no. 6, 3032–3061. doi:10.1214/11-AOS934. https://projecteuclid.org/euclid.aos/1327413777



