## Bernoulli

### Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm

#### Abstract

With the progress of measurement apparatus and the development of automatic sensors, it is no longer unusual to obtain large samples of observations taking values in high-dimensional spaces, such as functional spaces. In such large samples of high-dimensional data, outlying curves may not be uncommon, and even a few individuals may corrupt simple statistical indicators, such as the mean trajectory. We focus here on the estimation of the geometric median, which is a direct generalization of the real median to metric spaces and has nice robustness properties. Because the geometric median is defined as the minimizer of a simple convex functional that is differentiable everywhere when the distribution has no atoms, it can be estimated with online gradient algorithms. Such algorithms are very fast and can deal with large samples. Furthermore, they can be updated simply when the data arrive sequentially. We state the almost sure consistency and the $L^{2}$ rates of convergence of the stochastic gradient estimator, as well as the asymptotic normality of its averaged version. We find that the asymptotic distribution of the averaged version of the algorithm is the same as that of the classic estimators, which are based on the minimization of the empirical loss function. The performance of our averaged sequential estimator, in terms of both computation speed and estimation accuracy, is evaluated in a small simulation study. Our approach is also illustrated on a sample of more than 5000 individual television audiences measured every second over a period of 24 hours.
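The abstract does not state the recursion itself, so the following is only a minimal sketch of what an averaged stochastic gradient estimator of the geometric median could look like in finite dimension. It assumes a Robbins–Monro update $Z_{n+1} = Z_n + \gamma_n (X_{n+1} - Z_n)/\|X_{n+1} - Z_n\|$ followed by a running average of the iterates. The function name `averaged_sgd_median`, the step-size form $\gamma_n = c_\gamma n^{-\alpha}$ and the default values of `c_gamma` and `alpha` are illustrative assumptions, not taken from the article.

```python
import math
import random


def averaged_sgd_median(samples, c_gamma=1.0, alpha=0.75):
    """One-pass averaged stochastic gradient estimate of the geometric median.

    samples: iterable of equal-length sequences of floats (points in R^d).
    c_gamma, alpha: hypothetical step-size parameters, gamma_n = c_gamma * n**(-alpha).
    """
    it = iter(samples)
    z = list(next(it))   # gradient iterate, started at the first observation
    z_bar = list(z)      # running average of the iterates
    for n, x in enumerate(it, start=1):
        diff = [xi - zi for xi, zi in zip(x, z)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:   # the gradient is undefined when x == z, so skip the move
            gamma = c_gamma * n ** (-alpha)
            # move a distance gamma toward x; the step length ignores how far x is
            z = [zi + gamma * d / norm for zi, d in zip(z, diff)]
        # running mean of all iterates seen so far
        z_bar = [bi + (zi - bi) / (n + 1) for bi, zi in zip(z_bar, z)]
    return z_bar


if __name__ == "__main__":
    # Illustrative data: Gaussian cloud around (1, -2) with a few distant outliers.
    rng = random.Random(0)
    points = [(1.0 + rng.gauss(0, 1), -2.0 + rng.gauss(0, 1)) for _ in range(5000)]
    points[100::50] = [(50.0, 50.0)] * len(points[100::50])
    print(averaged_sgd_median(points))
```

Each observation is used once and then discarded, which is why such an estimator can be updated as data arrive sequentially. Because every step has length $\gamma_n$ regardless of how far the new observation lies, a distant outlier moves the iterate no more than any other point.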

#### Article information

Source
Bernoulli, Volume 19, Number 1 (2013), 18-43.

Dates
First available in Project Euclid: 18 January 2013

Permanent link to this document
https://projecteuclid.org/euclid.bj/1358531739

Digital Object Identifier
doi:10.3150/11-BEJ390

Mathematical Reviews number (MathSciNet)
MR3019484

Zentralblatt MATH identifier
1259.62068

#### Citation

Cardot, Hervé; Cénac, Peggy; Zitt, Pierre-André. Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm. Bernoulli 19 (2013), no. 1, 18–43. doi:10.3150/11-BEJ390. https://projecteuclid.org/euclid.bj/1358531739
