Sparse principal component analysis and iterative thresholding

Zongming Ma

doi:10.1214/13-AOS1097

Ann. Statist. 41(2): 772-801 (April 2013). DOI: 10.1214/13-AOS1097

Abstract

Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features $p$ is comparable to, or even much larger than, the sample size $n$. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
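The abstract does not spell out the steps of the iterative thresholding approach, so the following is only a minimal sketch of one way such a scheme can look: orthogonal (power-type) iteration on the sample covariance matrix with an entrywise soft-thresholding step before each re-orthonormalization. The function names, the plain-PCA initialization, the fixed threshold level, and the stopping rule are all assumptions made for illustration and are not taken from the paper. The helper `make_spiked_sample` is likewise hypothetical; it draws data from a rank-one spiked covariance model $I + \lambda v v^{\top}$ with a sparse $v$.

```python
import numpy as np


def soft_threshold(x, t):
    """Entrywise soft thresholding: shrink toward zero by t, zero out |x| <= t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def iterative_thresholding_pca(S, k, threshold, n_iter=100, tol=1e-8):
    """Sketch: orthogonal iteration with a thresholding step (illustrative only).

    S         : (p, p) symmetric sample covariance matrix
    k         : dimension of the principal subspace to estimate
    threshold : soft-threshold level (assumed fixed here; a tuning parameter)
    Returns a (p, k) matrix with orthonormal columns.
    """
    # Assumption: initialize with the ordinary leading eigenvectors of S.
    # eigh returns eigenvalues in ascending order, so take the last k columns.
    _, eigvecs = np.linalg.eigh(S)
    Q = eigvecs[:, -k:]

    for _ in range(n_iter):
        T = S @ Q                          # multiplication (power) step
        T = soft_threshold(T, threshold)   # thresholding step promotes sparsity
        if not np.any(T):
            raise ValueError("threshold removed every entry; use a smaller level")
        Q_new, _ = np.linalg.qr(T)         # re-orthonormalize the columns
        # Compare projection matrices so that sign flips do not matter.
        change = np.linalg.norm(Q_new @ Q_new.T - Q @ Q.T)
        Q = Q_new
        if change < tol:
            break
    return Q


def make_spiked_sample(n, p, s, spike, seed=0):
    """Hypothetical helper: n draws from N(0, I + spike * v v^T), v s-sparse."""
    rng = np.random.default_rng(seed)
    v = np.zeros(p)
    v[:s] = 1.0 / np.sqrt(s)               # unit-norm sparse leading eigenvector
    u = rng.standard_normal(n)
    Z = rng.standard_normal((n, p))
    X = np.sqrt(spike) * np.outer(u, v) + Z
    return v, X


if __name__ == "__main__":
    v, X = make_spiked_sample(n=200, p=100, s=10, spike=20.0, seed=0)
    S = X.T @ X / X.shape[0]
    Q = iterative_thresholding_pca(S, k=1, threshold=2.0)
    print("alignment with true eigenvector:", abs(v @ Q[:, 0]))
    print("nonzero coordinates in estimate:", np.count_nonzero(Q))
```

In this sketch the threshold plays the role of a noise filter: coordinates whose entries in $SQ$ stay below the level are set to zero, so the iterate concentrates on the coordinates carrying the sparse signal. How the level should scale with $n$ and $p$ is a question the paper addresses and this sketch does not.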

Citation

Zongming Ma. "Sparse principal component analysis and iterative thresholding." Ann. Statist. 41 (2) 772 - 801, April 2013. https://doi.org/10.1214/13-AOS1097

Information

Published: April 2013

First available in Project Euclid: 8 May 2013

zbMATH: 1267.62074

MathSciNet: MR3099121

Digital Object Identifier: 10.1214/13-AOS1097

Subjects:

Primary: 62H12

Secondary: 62G20, 62H25

Keywords: Dimension reduction, High-dimensional statistics, Principal Component Analysis, principal subspace, Sparsity, spiked covariance model, thresholding

JOURNAL ARTICLE
30 PAGES
