The Annals of Statistics

Optimal rank-based tests for homogeneity of scatter

Marc Hallin and Davy Paindaveine

Abstract

We propose a class of locally and asymptotically optimal tests, based on multivariate ranks and signs, for the homogeneity of scatter matrices in m elliptical populations. Contrary to the existing parametric procedures, these tests remain valid without any moment assumptions, and thus are perfectly robust against heavy-tailed distributions (validity robustness). Nevertheless, they reach semiparametric efficiency bounds at correctly specified elliptical densities and maintain high power under all others (efficiency robustness). In particular, their normal-score version outperforms traditional Gaussian likelihood ratio tests and their pseudo-Gaussian robustifications under a very broad range of non-Gaussian densities including, for instance, all multivariate Student and power-exponential distributions.

Article information

Source
Ann. Statist., Volume 36, Number 3 (2008), 1261-1298.

Dates
First available in Project Euclid: 26 May 2008

Permanent link to this document
https://projecteuclid.org/euclid.aos/1211819564

Digital Object Identifier
doi:10.1214/07-AOS508

Mathematical Reviews number (MathSciNet)
MR2418657

Zentralblatt MATH identifier
1360.62288

Subjects
Primary: 62M15: Spectral analysis; 62G35: Robustness

Keywords
Elliptic densities; scatter matrix; shape matrix; local asymptotic normality; semiparametric efficiency; adaptivity

Citation

Hallin, Marc; Paindaveine, Davy. Optimal rank-based tests for homogeneity of scatter. Ann. Statist. 36 (2008), no. 3, 1261-1298. doi:10.1214/07-AOS508. https://projecteuclid.org/euclid.aos/1211819564

