Abstract
For the problem of robust estimation of multivariate location and shape, defining S-estimators using scale transformations of a fixed $\rho$ function regardless of the dimension, as is usually done, leads to a perverse outcome: estimators in high dimension can have a breakdown point approaching 50%, but still fail to reject as outliers points that are large distances from the main mass of points. This leads to a form of nonrobustness that has important practical consequences. In this paper, estimators are defined that improve on known S-estimators in having all of the following properties: (1) maximal breakdown for the given sample size and dimension; (2) ability completely to reject as outliers points that are far from the main mass of points; (3) convergence to good solutions with a modest amount of computation from a nonrobust starting point for large (though not near 50%) contamination. However, to attain maximal breakdown, these estimates, like other known maximal breakdown estimators, require large amounts of computational effort. This greater ability of the new estimators to reject outliers comes at a modest cost in efficiency and gross error sensitivity and at a greater, but finite, cost in local shift sensitivity.
Citation
David M. Rocke. "Robustness properties of S-estimators of multivariate location and shape in high dimension." Ann. Statist. 24 (3) 1327 - 1345, June 1996. https://doi.org/10.1214/aos/1032526972
Information