Open Access
Selecting the number of principal components: Estimation of the true rank of a noisy matrix
Yunjin Choi, Jonathan Taylor, Robert Tibshirani
Ann. Statist. 45(6): 2590-2617 (December 2017). DOI: 10.1214/16-AOS1536


Principal component analysis (PCA) is a well-known tool in multivariate statistics. One significant challenge in using PCA is the choice of the number of principal components. To address this challenge, we propose distribution-based methods with exact type 1 error control for hypothesis testing and for constructing confidence intervals for signals in a noisy matrix with finite samples. Assuming Gaussian noise, we derive exact type 1 error control based on the conditional distribution of the singular values of a Gaussian matrix, using a post-selection inference framework and extending the approach of [Taylor, Loftus and Tibshirani (2013)] to a PCA setting. In simulation studies, we find that our proposed methods compare well to existing approaches.
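To make the setting concrete, the following is a minimal sketch of the problem the abstract describes: a low-rank signal matrix observed with additive Gaussian noise, whose singular values are then inspected to judge how many components carry signal. It only illustrates the setup and is not the authors' testing procedure. The function name `simulate_noisy_low_rank` and all parameter values (dimensions, rank, signal strength, noise level) are assumptions chosen for illustration.

```python
import numpy as np


def simulate_noisy_low_rank(n=50, p=10, rank=3, signal=10.0, sigma=1.0, seed=0):
    """Return (Y, B): a noisy observation Y = B + noise and its rank-`rank` signal B.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Orthonormal left and right factors obtained from QR decompositions.
    U, _ = np.linalg.qr(rng.standard_normal((n, rank)))
    V, _ = np.linalg.qr(rng.standard_normal((p, rank)))
    # Signal matrix with `rank` nonzero singular values, each equal to `signal`.
    B = signal * U @ V.T
    # Independent Gaussian noise with standard deviation `sigma`.
    noise = sigma * rng.standard_normal((n, p))
    return B + noise, B


Y, B = simulate_noisy_low_rank()

# Singular values of the observed matrix, returned in descending order.
singular_values = np.linalg.svd(Y, compute_uv=False)

# Rank of the noiseless signal, which is unknown in practice.
true_rank = np.linalg.matrix_rank(B)

print("singular values of Y:", np.round(singular_values, 2))
print("true rank of B:", true_rank)
```

In this sketch the true rank is known by construction. In practice only `Y` is observed, and deciding which of its singular values reflect signal rather than noise is the question the proposed tests and confidence intervals address.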


Yunjin Choi. Jonathan Taylor. Robert Tibshirani. "Selecting the number of principal components: Estimation of the true rank of a noisy matrix." Ann. Statist. 45 (6) 2590 - 2617, December 2017.


Received: 1 May 2015; Revised: 1 November 2016; Published: December 2017
First available in Project Euclid: 15 December 2017

zbMATH: 06838144
MathSciNet: MR3737903
Digital Object Identifier: 10.1214/16-AOS1536

Primary: 62F03, 62J05, 62J07

Keywords: exact $p$-value, hypothesis test, principal components

Rights: Copyright © 2017 Institute of Mathematical Statistics
