Electronic Journal of Statistics

Two proposals for robust PCA using semidefinite programming

Michael McCoy and Joel A. Tropp


Abstract

The performance of principal component analysis suffers badly in the presence of outliers. This paper proposes two novel approaches to robust principal component analysis based on semidefinite programming. The first method, maximum mean absolute deviation rounding, seeks directions of large spread in the data while damping the effect of outliers. The second method produces a low-leverage decomposition that attempts to form a low-rank model of the data by separating out corrupted observations. This paper also presents efficient computational methods for solving these semidefinite programs. Numerical experiments confirm the value of the new techniques.
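
The abstract does not spell out the optimization problems, but the low-leverage decomposition belongs to the same family as the outlier-pursuit program of references [56, 57]: split the data matrix X into a low-rank model L plus a column-sparse correction C by solving a convex program of the form minimize ‖L‖* + γ Σ_j ‖C_j‖₂ subject to L + C = X. The following is a minimal CVXPY sketch of that relaxation, not the paper's own implementation; the function name and the weight gamma are illustrative choices.

    import cvxpy as cp
    import numpy as np

    def low_rank_plus_column_sparse(X, gamma=0.5):
        """Split X into low-rank L and column-sparse C (outlier-pursuit-style sketch).

        Solves: minimize ||L||_* + gamma * sum_j ||C[:, j]||_2
                subject to L + C == X
        Columns of C with large norm flag the corrupted observations.
        The weight gamma is an illustrative choice, not a value from the paper.
        """
        L = cp.Variable(X.shape)
        C = cp.Variable(X.shape)
        column_norms = cp.norm(C, 2, axis=0)           # one 2-norm per column of C
        objective = cp.Minimize(cp.normNuc(L) + gamma * cp.sum(column_norms))
        problem = cp.Problem(objective, [L + C == X])
        problem.solve(solver=cp.SCS)                   # SCS handles the semidefinite cone
        return L.value, C.value

    # Toy usage: a rank-1 data set with two grossly corrupted columns.
    rng = np.random.default_rng(0)
    X = np.outer(rng.standard_normal(20), rng.standard_normal(50))
    X[:, [3, 7]] += 10 * rng.standard_normal((20, 2))  # plant two outlier columns
    L_hat, C_hat = low_rank_plus_column_sparse(X)
    print(np.linalg.norm(C_hat, axis=0).round(2))      # norms should spike at columns 3 and 7
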

Article information

Source
Electron. J. Statist., Volume 5 (2011), 1123–1160.

Dates
First available in Project Euclid: 15 September 2011

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1316092870

Digital Object Identifier
doi:10.1214/11-EJS636

Mathematical Reviews number (MathSciNet)
MR2836771

Zentralblatt MATH identifier
1329.62276

Subjects
Primary: 60H25: Random operators and equations [See also 47B80]; 62G35: Robustness
Secondary: 90C22: Semidefinite programming

Keywords
Robustness; principal component analysis; semidefinite relaxation; leverage; duality

Citation

McCoy, Michael; Tropp, Joel A. Two proposals for robust PCA using semidefinite programming. Electron. J. Statist. 5 (2011), 1123–1160. doi:10.1214/11-EJS636. https://projecteuclid.org/euclid.ejs/1316092870



References

  • [1] Aldrin, M. (2004). NO2.dat. http://lib.stat.cmu.edu/datasets/NO2.dat.
  • [2] Alon, N. and Naor, A. (2006). Approximating the cut-norm via Grothendieck's inequality. SIAM J. Comput. 35 787–803.
  • [3] Bhatia, R. (1997). Matrix Analysis. Graduate Texts in Mathematics 169. Springer-Verlag, New York.
  • [4] Brubaker, S. C. (2009). Robust PCA and clustering in noisy mixtures. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms 1078–1087. SIAM.
  • [5] Buckheit, J. and Donoho, D. (1995). WaveLab and reproducible research. In Wavelets and Statistics 55–81. Springer, New York.
  • [6] Burer, S. and Monteiro, R. D. C. (2003). A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Math. Program. 95 329–357.
  • [7] Burer, S. and Monteiro, R. D. C. (2004). Local minima and convergence in low-rank semidefinite programming. Math. Program. 103 427–444.
  • [8] Candès, E. J., Li, X., Ma, Y. and Wright, J. (2009). Robust principal component analysis? J. Assoc. Comput. Mach. 58. arXiv:0912.3599.
  • [9] Chandrasekaran, V., Sanghavi, S., Parrilo, P. A. and Willsky, A. S. (2011). Rank-sparsity incoherence for matrix decomposition. SIAM J. Optim. 21 572–596.
  • [10] Combettes, P. L. and Wajs, V. R. (2006). Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4 1168–1200.
  • [11] Croux, C., Filzmoser, P. and Oliveira, M. R. (2007). Algorithms for projection-pursuit robust principal component analysis. Chemom. Intell. Lab. Syst. 87 218–225.
  • [12] Croux, C. and Haesbroeck, G. (2000). Principal component analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies. Biometrika 87 603–618.
  • [13] Croux, C. and Ruiz-Gazen, A. (2005). High breakdown estimators for principal components: the projection-pursuit approach revisited. J. Multivariate Anal. 95 206–226.
  • [14] Cui, H. (2003). Asymptotic distributions of principal components based on robust dispersions. Biometrika 90 953–966.
  • [15] Dasgupta, S. (2003). Subspace detection: a robust statistics formulation. In Learning Theory and Kernel Machines (B. Schölkopf and M. Warmuth, eds.). Lecture Notes in Computer Science 2777 734. Springer.
  • [16] De la Torre, F. and Black, M. J. (2003). A framework for robust subspace learning. Int. J. Comput. Vision 54 117–142.
  • [17] Devlin, S. J., Gnanadesikan, R. and Kettenring, J. R. (1981). Robust estimation of dispersion matrices and principal components. J. Am. Stat. Assoc. 76 354–362.
  • [18] Dunagan, J. and Vempala, S. (2001). Optimal outlier removal in high-dimensional spaces. In Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing (STOC '01) 627–636. ACM, New York.
  • [19] Eckstein, J. and Bertsekas, D. P. (1992). On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55 293–318.
  • [20] Fazel, M. (2002). Matrix rank minimization with applications. Ph.D. dissertation, Stanford University, Stanford, CA.
  • [21] Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Ann. Eugenics 7 179–188.
  • [22] Frank, A. and Asuncion, A. (2010). UCI Machine Learning Repository. http://archive.ics.uci.edu/ml/datasets/Iris.
  • [23] Gabay, D. and Mercier, B. (1976). A dual algorithm for the solution of nonlinear variational problems via finite element approximations. Comput. Math. Appl. 2 17–40.
  • [24] Glowinski, R. and Marrocco, A. (1975). Sur l'approximation, par éléments finis d'ordre un, et la résolution, par pénalisation-dualité, d'une classe de problèmes de Dirichlet non linéaires. Revue Française Automat. Informat. Recherche Opérationnelle Sér. Rouge Anal. Numér. 9 41–76.
  • [25] Gnanadesikan, R. and Kettenring, J. R. (1972). Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28 81–124.
  • [26] Grant, M. and Boyd, S. (2008). Graph implementations for nonsmooth convex programs. In Recent Advances in Learning and Control (V. Blondel, S. Boyd and H. Kimura, eds.). Lecture Notes in Control and Information Sciences 95–110. Springer-Verlag Limited, London. http://stanford.edu/~boyd/graph_dcp.html.
  • [27] Grant, M. and Boyd, S. (2010). CVX: MATLAB Software for Disciplined Convex Programming, version 1.21. http://cvxr.com/cvx.
  • [28] Grothendieck, A. (1953). Résumé de la théorie métrique des produits tensoriels topologiques (French). Bol. Soc. Mat. São Paulo 8 1–79.
  • [29] Hager, W. W. and Zhang, H. (2006). Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent. ACM Trans. Math. Software 32 113–137.
  • [30] Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 24 417–441.
  • [31] Huber, P. J. (1981). Robust Statistics, 1st ed. Wiley, Hoboken, NJ.
  • [32] Huber, P. J. and Ronchetti, E. (2009). Robust Statistics, 2nd ed. Wiley, Hoboken, NJ.
  • [33] Kreyszig, E. (1989). Introductory Functional Analysis with Applications. Wiley Classics Library. Wiley.
  • [34] Kwak, N. (2008). Principal component analysis based on L1-norm maximization. IEEE Trans. Pattern Anal. Mach. Intell. 30 1672–1680.
  • [35] Li, G. and Chen, Z. (1985). Projection-pursuit approach to robust dispersion matrices and principal components: primary theory and Monte Carlo. J. Am. Stat. Assoc. 80 759–766.
  • [36] Lin, Z., Chen, M., Wu, L. and Ma, Y. (2009). The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. Technical Report UILU-ENG-09-2215, UIUC. arXiv:1009.5055.
  • [37] Liu, G., Lin, Z. and Yu, Y. (2010). Robust subspace segmentation by low-rank representation. In Proceedings of the 27th International Conference on Machine Learning (ICML).
  • [38] Locantore, N., Marron, J. S., Simpson, D. G., Tripoli, N., Zhang, J. T. and Cohen, K. L. (1999). Robust principal component analysis for functional data. Test 8 1–73.
  • [39] Maronna, R. A. (2005). Principal components and orthogonal regression based on robust scales. Technometrics 47 264–273.
  • [40] Maronna, R. A., Martin, R. D. and Yohai, V. J. (2006). Robust Statistics: Theory and Methods. Wiley Series in Probability and Statistics. Wiley, Hoboken, NJ.
  • [41] McCoy, M. and Tropp, J. A. (2010). Online code. http://users.cms.caltech.edu/~mccoy/code/robustPCA_code2.tar.gz.
  • [42] Montgomery, D. C., Peck, E. A. and Vining, G. G. (2006). Introduction to Linear Regression Analysis. Wiley Series in Probability and Statistics. Wiley, Hoboken, NJ.
  • [43] Nemirovski, A. (2007). Sums of random symmetric matrices and quadratic optimization under orthogonality constraints. Math. Program. 109 283–317.
  • [44] Nesterov, Y. E. (1998). Semidefinite relaxation and nonconvex quadratic optimization. Optim. Methods Softw. 9 141–160.
  • [45] Paley, R. E. A. C. and Zygmund, A. (1932). A note on analytic functions in the unit circle. Math. Proc. Cambridge Philos. Soc. 28 266–272.
  • [46] Pisier, G. (1986). Factorization of Linear Operators and Geometry of Banach Spaces. Regional Conference Series in Mathematics. American Mathematical Society, Providence, RI.
  • [47] Rao, B. D. and Kreutz-Delgado, K. (1998). Sparse solutions to linear inverse problems with multiple measurement vectors. In Proceedings of the 8th IEEE Digital Signal Processing Workshop.
  • [48] Rockafellar, R. T. (1970). Convex Analysis. Princeton Mathematical Series. Princeton University Press, Princeton, NJ.
  • [49] Rohn, J. (2000). Computing the norm ‖A‖∞,1 is NP-hard. Linear and Multilinear Algebra 47 195–204.
  • [50] Siebert, J. P. (1987). Vehicle recognition using rule based methods. Turing Institute Research Memorandum TIRM-87-018.
  • [51] So, A. M.-C. (2009). Improved approximation bound for quadratic optimization problems with orthogonality constraints. In Symposium on Discrete Algorithms (SODA).
  • [52] Stoer, J. and Bulirsch, R. (2002). Introduction to Numerical Analysis. Springer, New York.
  • [53] Tukey, J. W. (1960). A survey of sampling from contaminated distributions. In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling (I. Olkin, ed.) 448–474. Stanford University Press, Stanford, CA.
  • [54] Watson, G. A. (1992). Characterization of the subdifferential of some matrix norms. Linear Algebra Appl. 170 33–45.
  • [55] Xu, H., Caramanis, C. and Mannor, S. (2010). Principal component analysis with contaminated data: the high dimensional case. In COLT 2010 1–37.
  • [56] Xu, H., Caramanis, C. and Sanghavi, S. (2010). Robust PCA via outlier pursuit. IEEE Trans. Inform. Theory, to appear. arXiv:1010.4237.
  • [57] Xu, H., Caramanis, C. and Sanghavi, S. (2010). Robust PCA via outlier pursuit. In NIPS 23 (J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel and A. Culotta, eds.) 2496–2504.