## The Annals of Statistics

### Mixture inner product spaces and their application to functional data analysis

#### Abstract

We introduce the concept of mixture inner product spaces associated with a given separable Hilbert space, which feature an infinite-dimensional mixture of finite-dimensional vector spaces and are dense in the underlying Hilbert space. Any Hilbert space valued random element can be arbitrarily closely approximated by mixture inner product space valued random elements. While this concept can be applied to data in any infinite-dimensional Hilbert space, the case of functional data that are random elements in the $L^{2}$ space of square integrable functions is of special interest. For functional data, mixture inner product spaces provide a new perspective, where each realization of the underlying stochastic process falls into one of the component spaces and is represented by a finite number of basis functions, the number of which corresponds to the dimension of the component space. In the mixture representation of functional data, the number of mixture components used to represent a given random element in $L^{2}$ is specifically adapted to each random trajectory and may be arbitrarily large. Key benefits of this novel approach are, first, that it provides a new perspective on the construction of a probability density in function space under mild regularity conditions, and second, that individual trajectories possess a trajectory-specific dimension that corresponds to a latent random variable, making it possible to use a larger number of components for less smooth trajectories and a smaller number for smoother ones. This enables flexible and parsimonious modeling of heterogeneous trajectory shapes. We establish estimation consistency of the functional mixture density and introduce an algorithm for fitting the functional mixture model based on a modified expectation-maximization algorithm. Simulations confirm that, in comparison to traditional functional principal component analysis, the proposed method achieves similar or better data recovery while using fewer components on average. Its practical merits are also demonstrated in an analysis of egg-laying trajectories for medflies.
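The idea of a trajectory-specific dimension can be illustrated with a small simulation. The sketch below is not the authors' fitting algorithm; it only generates curves whose number of active basis functions is a latent random variable. The Fourier basis, the uniform distribution of the latent dimension, the decaying score variances, and the function names `fourier_basis` and `simulate_mixture_trajectories` are all assumptions made for illustration.

```python
import numpy as np


def fourier_basis(n_basis, grid):
    """First n_basis orthonormal Fourier functions on [0, 1], evaluated on grid."""
    basis = [np.ones_like(grid)]
    j = 1
    while len(basis) < n_basis:
        basis.append(np.sqrt(2) * np.sin(2 * np.pi * j * grid))
        if len(basis) < n_basis:
            basis.append(np.sqrt(2) * np.cos(2 * np.pi * j * grid))
        j += 1
    return np.array(basis[:n_basis])


def simulate_mixture_trajectories(n, max_dim, grid, rng):
    """Simulate n curves, each living in a component space of random dimension.

    Assumptions (illustrative only): the latent dimension is uniform on
    1..max_dim, and the scores are Gaussian with standard deviation 1/k.
    """
    phi = fourier_basis(max_dim, grid)              # shape (max_dim, len(grid))
    dims = rng.integers(1, max_dim + 1, size=n)     # latent dimension per curve
    scores = rng.normal(size=(n, max_dim)) / np.arange(1, max_dim + 1)
    # Zero out every score beyond the curve's own dimension.
    mask = np.arange(max_dim) < dims[:, None]
    scores = scores * mask
    curves = scores @ phi                           # finite basis expansion
    return curves, dims, scores, phi


rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 101)
curves, dims, scores, phi = simulate_mixture_trajectories(50, 6, grid, rng)
```

Curves with a small latent dimension use only low-frequency basis functions and are smoother, while curves with a larger dimension include higher-frequency terms, which mirrors the heterogeneity the abstract describes.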

#### Article information

Source
Ann. Statist., Volume 46, Number 1 (2018), 370-400.

Dates
Revised: January 2017
First available in Project Euclid: 22 February 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1519268434

Digital Object Identifier
doi:10.1214/17-AOS1553

Mathematical Reviews number (MathSciNet)
MR3766956

Zentralblatt MATH identifier
06865115

Subjects
Primary: 62G05: Estimation; 62G08: Nonparametric regression

#### Citation

Lin, Zhenhua; Müller, Hans-Georg; Yao, Fang. Mixture inner product spaces and their application to functional data analysis. Ann. Statist. 46 (2018), no. 1, 370–400. doi:10.1214/17-AOS1553. https://projecteuclid.org/euclid.aos/1519268434
