The Annals of Statistics

Deciding the dimension of effective dimension reduction space for functional and high-dimensional data

Yehua Li and Tailen Hsing

Full-text: Open access

Abstract

In this paper, we consider regression models with a Hilbert-space-valued predictor and a scalar response, where the response depends on the predictor only through a finite number of projections. The linear subspace spanned by these projections is called the effective dimension reduction (EDR) space. To determine the dimensionality of the EDR space, we focus on the leading principal component scores of the predictor, and propose two sequential χ2 testing procedures under the assumption that the predictor has an elliptically contoured distribution. We further extend these procedures and introduce a test that simultaneously takes into account a large number of principal component scores. The proposed procedures are supported by theory, validated by simulation studies, and illustrated by a real-data example. Our methods and theory are applicable to functional data and high-dimensional multivariate data.
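The sequential χ² idea described above can be illustrated with a generic sliced inverse regression (SIR) dimension test in the spirit of Li (1991), on which this line of work builds. The sketch below is for multivariate data only; the function names are hypothetical and it is not the authors' functional-data procedure, which additionally handles the elliptically contoured predictor and the choice of leading principal component scores.

```python
# A generic sequential chi-square test for the dimension of a dimension
# reduction space, following Li (1991)'s SIR test -- an illustrative
# sketch, not the procedure proposed in this paper.
import numpy as np
from scipy.stats import chi2

def sir_eigenvalues(X, y, n_slices=10):
    """Eigenvalues of the SIR matrix computed from standardized X."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # Whiten the predictor: Cov(Z) = I.
    L = np.linalg.cholesky(np.cov(Xc, rowvar=False))
    Z = Xc @ np.linalg.inv(L).T
    # Slice the data on y and average Z within each slice.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    return np.sort(np.linalg.eigvalsh(M))[::-1]

def estimate_dimension(X, y, n_slices=10, alpha=0.05):
    """Smallest k for which H0: dim = k is not rejected at level alpha."""
    n, p = X.shape
    lam = sir_eigenvalues(X, y, n_slices)
    for k in range(p):
        # Under H0, n * (sum of the p-k smallest eigenvalues) is
        # asymptotically chi-square with (p-k)(H-k-1) degrees of freedom.
        stat = n * lam[k:].sum()
        df = (p - k) * (n_slices - k - 1)
        if df <= 0 or chi2.sf(stat, df) > alpha:
            return k
    return p
```

For a single-index model such as y = x₁ + ε, the procedure typically stops at k = 1: the test for k = 0 is rejected because the leading eigenvalue is large, while the test for k = 1 involves only the small residual eigenvalues.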

Article information

Source
Ann. Statist., Volume 38, Number 5 (2010), 3028-3062.

Dates
First available in Project Euclid: 30 August 2010

Permanent link to this document
https://projecteuclid.org/euclid.aos/1283175988

Digital Object Identifier
doi:10.1214/10-AOS816

Mathematical Reviews number (MathSciNet)
MR2722463

Zentralblatt MATH identifier
1200.62115

Subjects
Primary: 62J05: Linear regression
Secondary: 62G20: Asymptotic properties
62M20: Prediction [See also 60G25]; filtering [See also 60G35, 93E10, 93E11]

Keywords
Adaptive Neyman test; dimension reduction; elliptically contoured distribution; functional data analysis; principal components

Citation

Li, Yehua; Hsing, Tailen. Deciding the dimension of effective dimension reduction space for functional and high-dimensional data. Ann. Statist. 38 (2010), no. 5, 3028--3062. doi:10.1214/10-AOS816. https://projecteuclid.org/euclid.aos/1283175988


References

  • Amato, U., Antoniadis, A. and De Feis, I. (2006). Dimension reduction in functional regression with applications. Comput. Statist. Data Anal. 50 2422–2446.
  • Ash, R. B. and Gardner, M. F. (1975). Topics in Stochastic Processes. Academic Press, New York.
  • Cai, T. and Hall, P. (2006). Prediction in functional linear regression. Ann. Statist. 34 2159–2179.
  • Cambanis, S., Huang, S. and Simons, G. (1981). On the theory of elliptically contoured distributions. J. Multivariate Anal. 11 368–385.
  • Cardot, H., Ferraty, F. and Sarda, P. (2003). Spline estimators for the functional linear model. Statist. Sinica 13 571–591.
  • Cardot, H. and Sarda, P. (2005). Estimation in generalized linear models for functional data via penalized likelihood. J. Multivariate Anal. 92 24–41.
  • Carroll, R. J. and Li, K. C. (1992). Errors in variables for nonlinear regression: Dimension reduction and data visualization. J. Amer. Statist. Assoc. 87 1040–1050.
  • Cook, R. D. and Weisberg, S. (1991). Comments on “Sliced Inverse Regression for Dimension Reduction,” by K. C. Li. J. Amer. Statist. Assoc. 86 328–332.
  • Cook, R. D. (1998). Regression Graphics. Wiley, New York.
  • Crambes, C., Kneip, A. and Sarda, P. (2009). Smoothing spline estimators for functional linear regression. Ann. Statist. 37 35–72.
  • Dauxois, J., Pousse, A. and Romain, Y. (1982). Asymptotic theory for the principal component analysis of a vector random function: Some applications to statistical inference. J. Multivariate Anal. 12 136–154.
  • Eaton, M. L. and Tyler, D. (1994). The asymptotic distribution of singular values with application to canonical correlations and correspondence analysis. J. Multivariate Anal. 50 238–264.
  • Eubank, R. and Hsing, T. (2010). The Essentials of Functional Data Analysis. Unpublished manuscript. Dept. Statistics, Univ. Michigan.
  • Fan, J. and Lin, S.-K. (1998). Test of significance when data are curves. J. Amer. Statist. Assoc. 93 1007–1021.
  • Ferré, L. and Yao, A. (2003). Functional sliced inverse regression analysis. Statistics 37 475–488.
  • Ferré, L. and Yao, A. (2005). Smoothed functional sliced inverse regression. Statist. Sinica 15 665–685.
  • Ferré, L. and Yao, A. (2007). Reply to the paper “A note on smoothed functional inverse regression,” by L. Forzani and R. D. Cook. Statist. Sinica 17 1683–1687.
  • Forzani, L. and Cook, R. D. (2007). A note on smoothed functional inverse regression. Statist. Sinica 17 1677–1681.
  • Gu, C. (2002). Smoothing Spline ANOVA Models. Springer, New York.
  • Hall, P. and Hosseini-Nasab, M. (2006). On properties of functional principal components analysis. J. R. Stat. Soc. Ser. B Stat. Methodol. 68 109–126.
  • Hall, P., Müller, H. and Wang, J. (2006). Properties of principal component methods for functional and longitudinal data analysis. Ann. Statist. 34 1493–1517.
  • Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. Chapman and Hall, New York.
  • Hsing, T. and Ren, H. (2009). An RKHS formulation of the inverse regression dimension reduction problem. Ann. Statist. 37 726–755.
  • James, G. M. and Silverman, B. W. (2005). Functional adaptive model estimation. J. Amer. Statist. Assoc. 100 565–576.
  • Li, K. C. (1991). Sliced inverse regression for dimension reduction. J. Amer. Statist. Assoc. 86 316–327.
  • Li, Y. (2007). A note on Hilbertian elliptically contoured distribution. Unpublished manuscript, Dept. Statistics, Univ. Georgia.
  • Müller, H. G. and Stadtmüller, U. (2005). Generalized functional linear models. Ann. Statist. 33 774–805.
  • Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis, 2nd ed. Springer, New York.
  • Schoenberg, I. J. (1938). Metric spaces and completely monotone functions. Ann. Math. 39 811–841.
  • Schott, J. R. (1994). Determining the dimensionality in sliced inverse regression. J. Amer. Statist. Assoc. 89 141–148.
  • Spruill, M. C. (2007). Asymptotic distribution of coordinates on high dimensional spheres. Electron. Comm. Probab. 12 234–247.
  • Thodberg, H. H. (1996). A review of Bayesian neural networks with an application to near infrared spectroscopy. IEEE Transactions on Neural Networks 7 56–72.
  • Xia, Y., Tong, H., Li, W. K. and Zhu, L.-X. (2002). An adaptive estimation of dimension reduction space (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 64 363–410.
  • Zhang, J.-T. and Chen, J. (2007). Statistical inferences for functional data. Ann. Statist. 35 1052–1079.
  • Zhong, W., Zeng, P., Ma, P., Liu, J. and Zhu, Y. (2005). RSIR: Regularized sliced inverse regression for motif discovery. Bioinformatics 21 4169–4175.
  • Zhu, Y. and Zeng, P. (2006). Fourier methods for estimating the central subspace and the central mean subspace in regression. J. Amer. Statist. Assoc. 101 1638–1651.
  • Zhu, Y. and Zeng, P. (2008). An integral transform method for estimating the central mean and central subspace. J. Multivariate Anal. 101 271–290.