## Brazilian Journal of Probability and Statistics

- Volume 28, Number 2 (2014), 255-274.

### PCA and eigen-inference for a spiked covariance model with largest eigenvalues of same asymptotic order

Addy Bolivar-Cime and Victor Perez-Abreu

#### Abstract

In this paper, we work in the high-dimension, low-sample-size (HDLSS) setting, where the dimension $d$ of the data is greater than the sample size $n$. We study the asymptotics of the first $p\geq2$ sample eigenvalues and their corresponding eigenvectors under a spiked covariance model in which the $p$ largest population eigenvalues share the same asymptotic order of magnitude as $d$ tends to infinity and the remaining eigenvalues are constant. We obtain the asymptotic joint distribution of the nonzero sample eigenvalues when $d\rightarrow\infty$ and the sample size $n$ is fixed. We then prove that the $p$ largest sample eigenvalues increase jointly at the same speed as their population counterparts, in the sense that the vector of ratios of the sample and population eigenvalues converges to a multivariate distribution when $d\rightarrow\infty$ and $n$ is fixed, and to the vector of ones when both $d,n\rightarrow\infty$ and $d\gg n$. We also show the subspace consistency of the corresponding sample eigenvectors when $d$ goes to infinity and $n$ is fixed. Furthermore, using the asymptotic joint distribution of the sample eigenvalues, we study some inference problems for the spiked covariance model and propose hypothesis tests for a particular case of this model, as well as confidence intervals for the $p$ largest eigenvalues. A simulation study is performed to assess the behavior of the proposed statistical methodologies.
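The behavior of the eigenvalue ratios described in the abstract can be explored numerically. The following is a minimal sketch, not the authors' simulation design: it assumes Gaussian data with known zero mean, a diagonal population covariance whose $p$ largest eigenvalues are $c_i d^{\alpha}$ and whose remaining eigenvalues equal 1, and it uses a hypothetical helper name `sample_eigen_ratios` with illustrative defaults for `c` and `alpha`. None of these specifics come from the abstract.

```python
import numpy as np


def sample_eigen_ratios(d, n, c=(3.0, 1.0), alpha=1.5, seed=0):
    """Ratios of the p largest sample eigenvalues to the population ones.

    Assumed (illustrative) model: zero-mean Gaussian data with diagonal
    covariance, lambda_i = c[i] * d**alpha for i < p and lambda_i = 1
    otherwise. The constants in c must be given in decreasing order.
    """
    rng = np.random.default_rng(seed)
    p = len(c)

    # Population eigenvalues: p spikes of the same order d**alpha, rest constant.
    lam = np.ones(d)
    lam[:p] = np.asarray(c) * d**alpha

    # n observations in dimension d; column j has variance lam[j].
    X = rng.standard_normal((n, d)) * np.sqrt(lam)

    # The n x n dual matrix X X^T / n has the same nonzero eigenvalues as
    # the d x d sample covariance X^T X / n (mean assumed known and zero).
    dual = X @ X.T / n
    sample_eigs = np.linalg.eigvalsh(dual)[::-1]  # sort in decreasing order

    return sample_eigs[:p] / lam[:p]


if __name__ == "__main__":
    # Small fixed n: the ratios stay random even for large d.
    print(sample_eigen_ratios(d=2000, n=5))
    # Larger n with d >> n: the ratios are expected to be near one.
    print(sample_eigen_ratios(d=2000, n=200))
```

Working with the $n \times n$ dual matrix keeps the eigendecomposition cheap when $d \gg n$. Repeating the call over many seeds with $n$ fixed gives an empirical picture of the limiting distribution of the ratio vector under these assumptions.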

Braz. J. Probab. Stat., Volume 28, Number 2 (2014), 255-274.

First available in Project Euclid: 4 April 2014

https://projecteuclid.org/euclid.bjps/1396615440

doi:10.1214/12-BJPS205

**Keywords**

Principal Component Analysis; spiked covariance model; eigen-inference; hypothesis test; confidence interval; high dimensional data; HDLSS

