## Bernoulli

• Volume 25, Number 2 (2019), 932-976.

### Fréchet means and Procrustes analysis in Wasserstein space

#### Abstract

We consider two statistical problems at the intersection of functional and non-Euclidean data analysis: the determination of a Fréchet mean in the Wasserstein space of multivariate distributions; and the optimal registration of deformed random measures and point processes. We elucidate how the two problems are linked, each being in a sense dual to the other. We first study the finite sample version of the problem in the continuum. Exploiting the tangent bundle structure of Wasserstein space, we deduce the Fréchet mean via gradient descent. We show that this is equivalent to a Procrustes analysis for the registration maps, thus only requiring successive solutions to pairwise optimal coupling problems. We then study the population version of the problem, focussing on inference and stability: in practice, the data are i.i.d. realisations from a law on Wasserstein space, and indeed their observation is discrete, where one observes a proxy finite sample or point process. We construct regularised nonparametric estimators, and prove their consistency for the population mean, and uniform consistency for the population Procrustes registration maps.
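The gradient-descent and Procrustes equivalence described in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation. It assumes every measure is uniform on the same number of real-valued atoms, so that each pairwise optimal coupling is the monotone (sorted) matching, and it assumes the Fréchet functional is normalised as $F(\gamma) = \frac{1}{2N}\sum_{i} W_2^2(\gamma, \mu_i)$ with unit step size. The function names `procrustes_step`, `w2_squared` and `frechet_functional` are hypothetical.

```python
from typing import List


def w2_squared(a: List[float], b: List[float]) -> float:
    """Squared 2-Wasserstein distance between two uniform measures on
    equally many real atoms: match sorted atoms and average squared gaps."""
    return sum((x - y) ** 2 for x, y in zip(sorted(a), sorted(b))) / len(a)


def frechet_functional(template: List[float], samples: List[List[float]]) -> float:
    """Assumed normalisation: F(gamma) = 1/(2N) * sum_i W2^2(gamma, mu_i)."""
    return sum(w2_squared(template, s) for s in samples) / (2 * len(samples))


def procrustes_step(template: List[float], samples: List[List[float]]) -> List[float]:
    """One step with unit step size: register the template to every sample
    by the pairwise optimal map, average the maps, push the template forward."""
    n = len(template)
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch assumes equally many atoms per measure")
    # Rank the template atoms; in 1D the optimal map sends the k-th smallest
    # template atom to the k-th smallest atom of the target sample.
    order = sorted(range(n), key=lambda j: template[j])
    sorted_samples = [sorted(s) for s in samples]
    new_atoms = [0.0] * n
    for rank, j in enumerate(order):
        # Average of the registration maps evaluated at template atom j.
        new_atoms[j] = sum(s[rank] for s in sorted_samples) / len(samples)
    return new_atoms


if __name__ == "__main__":
    samples = [[0.0, 2.0, 4.0], [6.0, 2.0, 4.0]]
    template = [4.0, 0.0, 2.0]
    for _ in range(2):
        print(template, frechet_functional(template, samples))
        template = procrustes_step(template, samples)
```

In this one-dimensional setting the iteration reaches a fixed point after a single step, because the averaged maps send the template to the measure whose atoms are the averaged order statistics. In higher dimensions each step would instead require a numerical solver for the pairwise optimal maps.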

#### Article information

Source
Bernoulli, Volume 25, Number 2 (2019), 932-976.

Dates
Revised: November 2017
First available in Project Euclid: 6 March 2019

https://projecteuclid.org/euclid.bj/1551862840

Digital Object Identifier
doi:10.3150/17-BEJ1009

Mathematical Reviews number (MathSciNet)
MR3920362

Zentralblatt MATH identifier
07049396

#### Citation

Zemel, Yoav; Panaretos, Victor M. Fréchet means and Procrustes analysis in Wasserstein space. Bernoulli 25 (2019), no. 2, 932–976. doi:10.3150/17-BEJ1009. https://projecteuclid.org/euclid.bj/1551862840

#### Supplemental materials

• Fréchet means and Procrustes analysis in Wasserstein space. The online supplement contains more details on the examples, additional technical material, as well as those proofs that were omitted from the main paper.