The Annals of Statistics
- Ann. Statist.
- Volume 46, Number 6A (2018), 2871-2903.
Sub-Gaussian estimators of the mean of a random matrix with heavy-tailed entries
Estimation of the covariance matrix has attracted considerable attention from the statistical research community over the years, partly due to important applications such as principal component analysis. However, the frequently used empirical covariance estimator, along with its modifications, is very sensitive to the presence of outliers in the data. As P. Huber wrote [Ann. Math. Stat. 35 (1964) 73–101], “…This raises a question which could have been asked already by Gauss, but which was, as far as I know, only raised a few years ago (notably by Tukey): what happens if the true distribution deviates slightly from the assumed normal one? As is now well known, the sample mean then may have a catastrophically bad performance….” Motivated by Tukey’s question, we develop a new estimator of the (element-wise) mean of a random matrix, which includes the covariance estimation problem as a special case. Assuming that the entries of a matrix possess only finite second moments, this new estimator admits sub-Gaussian or sub-exponential concentration around the unknown mean in the operator norm. We explain the key ideas behind our construction, and discuss applications to covariance estimation and matrix completion problems.
Received: September 2016
Revised: August 2017
First available in Project Euclid: 7 September 2018
Primary: 60B20: Random matrices (probabilistic aspects; for algebraic aspects see 15B52); 62G35: Robustness
Secondary: 62H12: Estimation
Minsker, Stanislav. Sub-Gaussian estimators of the mean of a random matrix with heavy-tailed entries. Ann. Statist. 46 (2018), no. 6A, 2871--2903. doi:10.1214/17-AOS1642. https://projecteuclid.org/euclid.aos/1536307236
- Supplementary material for the paper: Sub-Gaussian estimators of the mean of a random matrix with heavy-tailed entries. The supplement contains technical details and proofs not included in the main text of the paper.