## The Annals of Statistics

### Fréchet regression for random objects with Euclidean predictors

#### Abstract

Increasingly, statisticians are faced with the task of analyzing complex data that are non-Euclidean and specifically do not lie in a vector space. To address the need for statistical methods for such data, we introduce the concept of Fréchet regression. This is a general approach to regression when responses are complex random objects in a metric space and predictors are in $\mathbb{R}^{p}$, achieved by extending the classical concept of a Fréchet mean to the notion of a conditional Fréchet mean. We develop generalized versions of both global least squares regression and local weighted least squares smoothing. The target quantities are appropriately defined population versions of global and local regression for response objects in a metric space. Applying empirical process methods, we derive asymptotic rates of convergence of the fitted regressions, computed from observed data, to these population targets under suitable regularity conditions. For the special case of random objects that reside in a Hilbert space, such as regression models with vector predictors and functional responses, we obtain a limit distribution. The proposed methods have broad applicability. Illustrative examples include responses that consist of probability distributions and correlation matrices, and we demonstrate both global and local Fréchet regression for demographic and brain imaging data. Local Fréchet regression is also illustrated via a simulation with response data that lie on the sphere.
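To make the central notion concrete: a Fréchet mean replaces averaging in a vector space with minimization of expected squared metric distance, and a (weighted) sample version minimizes a weighted sum of squared distances. The sketch below is purely illustrative and is not the paper's estimator: it computes a weighted sample Fréchet mean for points on the unit circle under the geodesic (arc-length) metric by grid search, where the grid-search minimizer, the grid resolution, and the circle example are all assumptions made for the illustration.

```python
import numpy as np

def frechet_mean_circle(angles, weights=None, grid_size=3600):
    """Weighted sample Frechet mean of points on the unit circle.

    Minimizes the weighted sum of squared geodesic (arc-length)
    distances over a fine grid of candidate angles; a grid search is
    used only to keep the illustration dependency-free.
    """
    angles = np.asarray(angles, dtype=float)
    weights = np.ones_like(angles) if weights is None else np.asarray(weights, dtype=float)
    candidates = np.linspace(0.0, 2 * np.pi, grid_size, endpoint=False)
    # Geodesic distance on the circle: wrap the angular difference into [0, pi].
    diff = np.abs(candidates[:, None] - angles[None, :]) % (2 * np.pi)
    d = np.minimum(diff, 2 * np.pi - diff)
    # Frechet functional: weighted sum of squared distances for each candidate.
    objective = (weights[None, :] * d**2).sum(axis=1)
    return candidates[np.argmin(objective)]

# Two points symmetric about angle 0: the Frechet mean is angle 0 itself,
# whereas the naive arithmetic mean of the raw angles (~pi) would be wrong.
m = frechet_mean_circle([0.3, 2 * np.pi - 0.3])
```

In local Fréchet regression the uniform weights above would be replaced by kernel weights that depend on the predictor values, so the conditional Fréchet mean is estimated by the same kind of weighted minimization.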

#### Article information

Source
Ann. Statist., Volume 47, Number 2 (2019), 691–719.

Dates
Received: July 2016
Revised: June 2017
First available in Project Euclid: 11 January 2019

Permanent link to this document
https://projecteuclid.org/euclid.aos/1547197235

Digital Object Identifier
doi:10.1214/17-AOS1624

Mathematical Reviews number (MathSciNet)
MR3909947

Zentralblatt MATH identifier
07033148

Subjects
Primary: 62G05: Estimation
Secondary: 62J99: None of the above, but in this section; 62G08: Nonparametric regression

#### Citation

Petersen, Alexander; Müller, Hans-Georg. Fréchet regression for random objects with Euclidean predictors. Ann. Statist. 47 (2019), no. 2, 691--719. doi:10.1214/17-AOS1624. https://projecteuclid.org/euclid.aos/1547197235

#### References

• Afsari, B. (2011). Riemannian $L^{p}$ center of mass: Existence, uniqueness, and convexity. Proc. Amer. Math. Soc. 139 655–673.
• Allen, E. A., Damaraju, E., Plis, S. M., Erhardt, E. B., Eichele, T. and Calhoun, V. D. (2014). Tracking whole-brain connectivity dynamics in the resting state. Cereb. Cortex 24 663–676.
• Arsigny, V., Fillard, P., Pennec, X. and Ayache, N. (2007). Geometric means in a novel vector space structure on symmetric positive-definite matrices. SIAM J. Matrix Anal. Appl. 29 328–347.
• Barden, D., Le, H. and Owen, M. (2013). Central limit theorems for Fréchet means in the space of phylogenetic trees. Electron. J. Probab. 18 no. 25.
• Bhattacharya, R. and Patrangenaru, V. (2003). Large sample theory of intrinsic and extrinsic sample means on manifolds. I. Ann. Statist. 31 1–29.
• Bhattacharya, R. N., Ellingson, L., Liu, X., Patrangenaru, V. and Crane, M. (2012). Extrinsic analysis on manifolds is computationally faster than intrinsic analysis with applications to quality control by machine vision. Appl. Stoch. Models Bus. Ind. 28 222–235.
• Borsdorf, R. and Higham, N. J. (2010). A preconditioned Newton algorithm for the nearest correlation matrix. IMA J. Numer. Anal. 30 94–107.
• Boumal, N., Mishra, B., Absil, P.-A., Sepulchre, R. et al. (2014). Manopt, a Matlab toolbox for optimization on manifolds. J. Mach. Learn. Res. 15 1455–1459.
• Bradley, J. V. (1968). Distribution-Free Statistical Tests. Prentice Hall, Englewood Cliffs, NJ.
• Chang, T. (1989). Spherical regression with errors in variables. Ann. Statist. 17 293–306.
• Cornea, E., Zhu, H., Kim, P. and Ibrahim, J. G. (2017). Regression models on Riemannian symmetric spaces. J. R. Stat. Soc. Ser. B. Stat. Methodol. 79 463–482.
• Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions. Numer. Math. 31 377–403.
• Davis, B. C., Fletcher, P. T., Bullitt, E. and Joshi, S. (2007). Population shape regression from random design data. In IEEE 11th International Conference on Computer Vision, ICCV 2007 1–7.
• Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with $B$-splines and penalties. Statist. Sci. 11 89–121.
• Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman & Hall, London.
• Faraway, J. J. (1997). Regression analysis for a functional response. Technometrics 39 254–261.
• Faraway, J. J. (2014). Regression for non-Euclidean data using distance matrices. J. Appl. Stat. 41 2342–2357.
• Ferreira, L. K. and Busatto, G. F. (2013). Resting-state functional connectivity in normal brain aging. Neurosci. Biobehav. Rev. 37 384–400.
• Ferreira, R., Xavier, J., Costeira, J. P. and Barroso, V. (2013). Newton algorithms for Riemannian distance related problems on connected locally symmetric manifolds. IEEE J. Sel. Top. Signal Process. 7 634–645.
• Fisher, N. I. (1995). Statistical Analysis of Circular Data. Cambridge Univ. Press, Cambridge.
• Fisher, N. I., Lewis, T. and Embleton, B. J. (1987). Statistical Analysis of Spherical Data. Cambridge Univ. Press, Cambridge.
• Fletcher, P. T. (2013). Geodesic regression and the theory of least squares on Riemannian manifolds. Int. J. Comput. Vis. 105 171–185.
• Fréchet, M. (1948). Les éléments aléatoires de nature quelconque dans un espace distancié. Ann. Inst. Henri Poincaré 10 215–310.
• Hein, M. (2009). Robust nonparametric regression with metric-space valued output. In Advances in Neural Information Processing Systems 22 718–726.
• Higgins, J. J. (2004). An Introduction to Modern Nonparametric Statistics. Brooks/Cole, Pacific Grove, CA.
• Higham, N. J. (2002). Computing the nearest correlation matrix—A problem from finance. IMA J. Numer. Anal. 22 329–343.
• Hinkle, J., Muralidharan, P., Fletcher, P. T. and Joshi, S. (2012). Polynomial regression on Riemannian manifolds. In Computer Vision—ECCV 2012 1–14. Springer, Heidelberg.
• Le, H. and Barden, D. (2014). On the measure of the cut locus of a Fréchet mean. Bull. Lond. Math. Soc. 46 698–708.
• Lee, M., Smyser, C. and Shimony, J. (2013). Resting-state fMRI: A review of methods and clinical applications. Am. J. Neuroradiol. 34 1866–1872.
• Lehmann, E. L. and D’Abrera, H. J. (2006). Nonparametrics: Statistical Methods Based on Ranks. Springer, New York.
• Lin, L., Thomas, B. S., Zhu, H. and Dunson, D. B. (2015). Extrinsic local regression on manifold-valued data. Available at arXiv:1508.02201.
• Marron, J. S. and Alonso, A. M. (2014). Overview of object oriented data analysis. Biom. J. 56 732–753.
• Mevel, K., Landeau, B., Fouquet, M., La Joie, R., Villain, N., Mézenge, F., Perrotin, A., Eustache, F., Desgranges, B. and Chételat, G. (2013). Age effect on the default mode network, inner thoughts, and cognitive abilities. Neurobiol. Aging 34 1292–1301.
• Niethammer, M., Huang, Y. and Vialard, F.-X. (2011). Geodesic regression for image time-series. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2011 655–662. Springer, Berlin.
• Onoda, K., Ishihara, M. and Yamaguchi, S. (2012). Decreased functional connectivity by aging is associated with cognitive decline. J. Cogn. Neurosci. 24 2186–2198.
• Panaretos, V. M. and Zemel, Y. (2016). Amplitude and phase variation of point processes. Ann. Statist. 44 771–812.
• Patrangenaru, V. and Ellingson, L. (2015). Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis. CRC Press, Boca Raton, FL.
• Pelletier, B. (2006). Non-parametric regression estimation on closed Riemannian manifolds. J. Nonparametr. Stat. 18 57–67.
• Petersen, A. and Müller, H.-G. (2019). Supplement to “Fréchet regression for random objects with Euclidean predictors.” DOI:10.1214/17-AOS1624SUPP.
• Pigoli, D., Aston, J. A., Dryden, I. L. and Secchi, P. (2014). Distances and inference for covariance operators. Biometrika 101 409–422.
• Prentice, M. J. (1989). Spherical regression on matched pairs of orientation statistics. J. Roy. Statist. Soc. Ser. B 51 241–248.
• Qi, H. and Sun, D. (2006). A quadratically convergent Newton method for computing the nearest correlation matrix. SIAM J. Matrix Anal. Appl. 28 360–385.
• Sheline, Y. I. and Raichle, M. E. (2013). Resting state functional connectivity in preclinical Alzheimer’s disease. Biological Psychiatry 74 340–347.
• Shi, X., Styner, M., Lieberman, J., Ibrahim, J. G., Lin, W. and Zhu, H. (2009). Intrinsic regression models for manifold-valued data. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2009 192–199. Springer, Berlin.
• Steinke, F. and Hein, M. (2009). Non-parametric regression between manifolds. In Advances in Neural Information Processing Systems 1561–1568.
• Steinke, F., Hein, M. and Schölkopf, B. (2010). Nonparametric regression between general Riemannian manifolds. SIAM J. Imaging Sci. 3 527–563.
• Su, J., Dryden, I. L., Klassen, E., Le, H. and Srivastava, A. (2012). Fitting smoothing splines to time-indexed, noisy points on nonlinear manifolds. Image Vis. Comput. 30 428–442.
• Takatsu, A. (2011). Wasserstein geometry of Gaussian measures. Osaka J. Math. 48 1005–1026.
• Van der Vaart, A. and Wellner, J. (1996). Weak Convergence and Empirical Processes. Springer, New York.
• Wang, H. and Marron, J. S. (2007). Object oriented data analysis: Sets of trees. Ann. Statist. 35 1849–1873.
• Yuan, Y., Zhu, H., Lin, W. and Marron, J. S. (2012). Local polynomial regression for symmetric positive definite matrices. J. R. Stat. Soc. Ser. B. Stat. Methodol. 74 697–719.
• Ziezold, H. (1977). On expected figures and a strong law of large numbers for random elements in quasi-metric spaces. In Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the 1974 European Meeting of Statisticians 591–602. Springer, Berlin.

#### Supplemental materials

• Proofs of theoretical results. The supplement consists of four sections of proofs. The first contains proofs of propositions verifying that our theoretical assumptions hold for the examples included in Section 3; the remaining three contain the proofs for Sections 3–5, respectively.