The Annals of Statistics
- Ann. Statist.
- Volume 46, Number 3 (2018), 1050-1076.
Consistency of AIC and BIC in estimating the number of significant components in high-dimensional principal component analysis
In this paper, we study the problem of estimating the number of significant components in principal component analysis (PCA), which corresponds to the number of dominant eigenvalues of the covariance matrix of $p$ variables. Our purpose is to examine the consistency of the estimation criteria AIC and BIC, based on the model selection criteria of Akaike [In 2nd International Symposium on Information Theory (1973) 267–281, Akadémiai Kiadó] and Schwarz [Ann. Statist. 6 (1978) 461–464], under a high-dimensional asymptotic framework. Using random matrix theory techniques, we derive sufficient conditions for the criteria to be strongly consistent in two cases: when the dominant population eigenvalues are bounded, and when the dominant eigenvalues tend to infinity. Moreover, the asymptotic results are obtained without a normality assumption on the population distribution. Simulation studies are also conducted, and the results show that the sufficient conditions in our theorems are essential.
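To make the estimation problem concrete, here is a minimal sketch of how AIC- and BIC-type criteria can pick the number of dominant eigenvalues from the sample eigenvalues. The abstract does not give the paper's exact definitions, so the sketch assumes the classical Gaussian-likelihood form for the model in which the smallest $p-k$ population eigenvalues are equal, with $(k+1)(p+1-k/2)$ free parameters; the function names `criterion_values` and `estimate_num_components` are hypothetical.

```python
import math

def criterion_values(eigvals, n, penalty_per_param):
    """Criterion values for k = 0, ..., p-1 (assumed classical form).

    eigvals: positive eigenvalues of the sample covariance matrix.
    n: sample size. The additive constant n*p*(log(2*pi) + 1) is dropped
    because it does not depend on k.
    """
    d = sorted(eigvals, reverse=True)
    p = len(d)
    values = []
    for k in range(p):
        # -2 log-likelihood (up to a constant) when the last p-k eigenvalues are equal
        head = sum(math.log(x) for x in d[:k])
        tail_mean = sum(d[k:]) / (p - k)
        neg2loglik = n * (head + (p - k) * math.log(tail_mean))
        # Assumed parameter count: mean, k+1 eigenvalues, and eigenvectors
        num_params = (k + 1) * (p + 1 - k / 2)
        values.append(neg2loglik + penalty_per_param * num_params)
    return values

def estimate_num_components(eigvals, n, criterion="aic"):
    """Return the k that minimizes the chosen criterion."""
    penalty = 2.0 if criterion == "aic" else math.log(n)  # BIC uses log(n)
    values = criterion_values(eigvals, n, penalty)
    return min(range(len(values)), key=values.__getitem__)

# Illustrative input only: two large eigenvalues followed by four equal ones.
sample_eigvals = [10.0, 8.0, 1.0, 1.0, 1.0, 1.0]
print(estimate_num_components(sample_eigvals, n=200, criterion="aic"))
print(estimate_num_components(sample_eigvals, n=200, criterion="bic"))
```

The two criteria differ only in the penalty per parameter (2 versus $\log n$), which is why their consistency can require different conditions as $p$ and $n$ grow together.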
Received: October 2015
Revised: January 2017
First available in Project Euclid: 3 May 2018
Primary: 62H12: Estimation
Secondary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
Bai, Zhidong; Choi, Kwok Pui; Fujikoshi, Yasunori. Consistency of AIC and BIC in estimating the number of significant components in high-dimensional principal component analysis. Ann. Statist. 46 (2018), no. 3, 1050--1076. doi:10.1214/17-AOS1577. https://projecteuclid.org/euclid.aos/1525313075