The Annals of Statistics

Pareto quantiles of unlabeled tree objects

Ela Sienkiewicz and Haonan Wang


In this paper, we consider a set of unlabeled tree objects with topological and geometric properties. For each data object, two curve representations are developed to characterize its topological and geometric aspects. We further define the notions of topological and geometric medians, as well as quantiles, based on both representations. In addition, we take a novel approach and define Pareto medians and quantiles through a multi-objective optimization problem with two objective functions, which measure topological variation and geometric variation, respectively. Analytical solutions are provided for the topological and geometric medians and quantiles; for Pareto medians and quantiles in general, a genetic algorithm is implemented. The proposed methods are applied to analyze a data set of pyramidal neurons.
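To make the multi-objective idea concrete, the following is a minimal sketch of Pareto dominance with two objectives that are both minimized. It is not the authors' procedure: the function name `pareto_front` and the example scores are hypothetical, and the abstract does not say how the topological and geometric variation are computed. The sketch only shows how candidates that no other candidate beats on both objectives are singled out.

```python
from typing import List, Sequence, Tuple


def pareto_front(points: Sequence[Tuple[float, float]]) -> List[int]:
    """Return indices of points that no other point dominates.

    Both objectives are minimized. Point b dominates point a when b is
    no worse than a in both objectives and strictly better in at least one.
    """
    front = []
    for i, (a1, a2) in enumerate(points):
        dominated = any(
            (b1 <= a1 and b2 <= a2) and (b1 < a1 or b2 < a2)
            for j, (b1, b2) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical (topological variation, geometric variation) scores
# for five candidate objects; these numbers are illustrative only.
scores = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 3.5)]
front = pareto_front(scores)
print(front)  # indices of the non-dominated candidates
```

This brute-force check compares every pair of candidates, which is adequate for a small finite set. A genetic algorithm, as mentioned in the abstract, is a search heuristic typically used when the candidate space is too large to enumerate.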

Article information

Ann. Statist., Volume 46, Number 4 (2018), 1513-1540.

Received: November 2016
Revised: March 2017
First available in Project Euclid: 27 June 2018

Primary: 62G99: None of the above, but in this section
Secondary: 62P10: Applications to biology and medical sciences

Data object; genetic algorithm; multi-objective optimization; object oriented data; tree-structured data


Sienkiewicz, Ela; Wang, Haonan. Pareto quantiles of unlabeled tree objects. Ann. Statist. 46 (2018), no. 4, 1513–1540. doi:10.1214/17-AOS1593.



Supplemental materials

  • Supplement to “Pareto quantiles of unlabeled tree objects”. This document includes the description of the data object construction, proofs, and additional details regarding simulation and data analysis.