The Annals of Statistics

Theoretical analysis of nonparametric filament estimation

Wanli Qiao and Wolfgang Polonik

Abstract

This paper provides a rigorous study of the nonparametric estimation of filaments or ridge lines of a probability density $f$. Points on the filament are considered as local extrema of the density when traversing the support of $f$ along the integral curve driven by the vector field of second eigenvectors of the Hessian of $f$. We “parametrize” points on the filaments by such integral curves, and thus both the estimation of integral curves and of filaments will be considered via a plug-in method using kernel density estimation. We establish rates of convergence and asymptotic distribution results for the estimation of both the integral curves and the filaments. The main theoretical result establishes the asymptotic distribution of the uniform deviation of the estimated filament from its theoretical counterpart. This result utilizes the extreme value behavior of nonstationary Gaussian processes indexed by manifolds $M_{h},h\in(0,1]$ as $h\to0$.
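
The plug-in idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator or code. It assumes a two-dimensional sample, a Gaussian kernel, an arbitrarily chosen bandwidth `h`, and a plain Euler scheme for the integral curve. It also takes the "second eigenvector" to be the eigenvector of the estimated Hessian with the smaller eigenvalue. The function names `kde_grad_hess` and `trace_to_filament` and the simulated data are hypothetical and serve only as illustration.

```python
import numpy as np


def kde_grad_hess(x, data, h):
    """Gradient and Hessian of a Gaussian kernel density estimate at x."""
    n, d = data.shape
    u = (x - data) / h  # scaled differences, shape (n, d)
    # Kernel weights K(u_i) / (n h^d) for the standard Gaussian kernel.
    w = np.exp(-0.5 * np.sum(u ** 2, axis=1)) / ((2 * np.pi) ** (d / 2) * n * h ** d)
    grad = -(w[:, None] * u).sum(axis=0) / h
    hess = (np.einsum("i,ij,ik->jk", w, u, u) - w.sum() * np.eye(d)) / h ** 2
    return grad, hess


def trace_to_filament(x0, data, h, step=0.01, max_steps=1000):
    """Follow the estimated second-eigenvector field until the directional
    derivative of the density estimate along it changes sign."""
    x = np.asarray(x0, dtype=float)
    prev_dir = None
    for _ in range(max_steps):
        grad, hess = kde_grad_hess(x, data, h)
        _, eigvecs = np.linalg.eigh(hess)  # eigenvalues in ascending order
        v = eigvecs[:, 0]  # assumption: eigenvector of the smaller eigenvalue
        # Eigenvectors are defined up to sign; orient v uphill (an illustrative choice).
        if np.dot(grad, v) < 0:
            v = -v
        # A reversal of the uphill direction means the local maximum along
        # the curve lies within one step of the current point.
        if prev_dir is not None and np.dot(v, prev_dir) < 0:
            return x
        x = x + step * v
        prev_dir = v
    return x


# Simulated sample concentrated around the line y = 0 (illustrative data only).
rng = np.random.default_rng(0)
n = 2000
data = np.column_stack([rng.uniform(-2.0, 2.0, n), rng.normal(0.0, 0.2, n)])

point = trace_to_filament([0.0, 0.25], data, h=0.3)
print(point)
```

In this toy setting the traced point should land close to the line $y=0$, around which the sample is concentrated. The starting point is chosen inside the region where the smaller Hessian eigenvalue is negative; starting farther from the ridge, the eigenvector field need not point toward it.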

Article information

Source
Ann. Statist., Volume 44, Number 3 (2016), 1269-1297.

Dates
Received: May 2014
Revised: October 2015
First available in Project Euclid: 11 April 2016

Permanent link to this document
https://projecteuclid.org/euclid.aos/1460381693

Digital Object Identifier
doi:10.1214/15-AOS1405

Mathematical Reviews number (MathSciNet)
MR3485960

Zentralblatt MATH identifier
1338.62139

Subjects
Primary: 62G20: Asymptotic properties
Secondary: 62G05: Estimation

Keywords
Extreme value distribution; nonparametric curve estimation; integral curves; kernel density estimation

Citation

Qiao, Wanli; Polonik, Wolfgang. Theoretical analysis of nonparametric filament estimation. Ann. Statist. 44 (2016), no. 3, 1269–1297. doi:10.1214/15-AOS1405. https://projecteuclid.org/euclid.aos/1460381693


References

  • Arias-Castro, E., Donoho, D. L. and Huo, X. (2006). Adaptive multiscale detection of filamentary structures in a background of uniform random points. Ann. Statist. 34 326–349.
  • Barrow, J. D., Sonoda, D. H. and Bhavsar, S. P. (1985). Minimal spanning tree, filaments and galaxy clustering. Mon. Not. R. Astron. Soc. 216 17–35.
  • Bharadwaj, S., Bhavsar, S. P. and Sheth, J. V. (2004). The size of the longest filaments in the universe. Astrophys. J. 606 25–31.
  • Bickel, P. J. and Rosenblatt, M. (1973). On some global measures of the deviations of density function estimates. Ann. Statist. 1 1071–1095.
  • Chen, Y.-C., Genovese, C. R. and Wasserman, L. (2013). Uncertainty measures and limiting distributions for filament estimation. Preprint. Available at arXiv:1312.2098v1.
  • Chen, Y.-C., Genovese, C. R. and Wasserman, L. (2014). Generalized mode and ridge estimation. Preprint. Available at arXiv:1406.1803.
  • Chen, Y.-C., Genovese, C. R. and Wasserman, L. (2015). Asymptotic theory for density ridges. Ann. Statist. 43 1896–1928.
  • Cheng, Y. (1995). Mean shift, mode seeking, and clustering. IEEE Trans. Pattern Anal. Mach. Intell. 17 790–799.
  • Comaniciu, D. and Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24 603–619.
  • Dietrich, J. P., Werner, N., Clowe, D., Finoguenov, A., Kitching, T., Miller, L. and Simionescu, A. (2012). A filament of dark matter between two clusters of galaxies. Nature 487 202–204.
  • Eberly, D. (1996). Ridges in Image and Data Analysis. Kluwer, Boston, MA.
  • Einmahl, U. and Mason, D. M. (2005). Uniform in bandwidth consistency of kernel-type function estimators. Ann. Statist. 33 1380–1403.
  • Fukunaga, K. and Hostetler, L. D. (1975). The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans. Inform. Theory 21 32–40.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2009). On the path density of a gradient field. Ann. Statist. 37 3236–3271.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2012a). The geometry of nonparametric filament estimation. J. Amer. Statist. Assoc. 107 788–799.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2012b). Manifold estimation and singular deconvolution under Hausdorff loss. Ann. Statist. 40 941–963.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2014). Nonparametric ridge estimation. Ann. Statist. 42 1511–1545.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2016). Non-parametric inference for density modes. J. R. Stat. Soc. Ser. B. 78 99–126.
  • Giné, E. and Guillou, A. (2002). Rates of strong uniform consistency for multivariate kernel density estimators. Ann. Inst. Henri Poincaré Probab. Stat. 38 907–921.
  • Gronwall, T. H. (1919). Note on the derivatives with respect to a parameter of the solutions of a system of differential equations. Ann. of Math. (2) 20 292–296.
  • Hall, P., Qian, W. and Titterington, D. M. (1992). Ridge finding from noisy data. J. Comput. Graph. Statist. 1 197–211.
  • Hastie, T. and Stuetzle, W. (1989). Principal curves. J. Amer. Statist. Assoc. 84 502–516.
  • Koltchinskii, V., Sakhanenko, L. and Cai, S. (2007). Integral curves of noisy vector fields and statistical problems in diffusion tensor imaging: Nonparametric kernel estimation and hypotheses testing. Ann. Statist. 35 1576–1607.
  • Mikhaleva, T. L. and Piterbarg, V. I. (1996). On the distribution of the maximum of a Gaussian field with constant variance on a smooth manifold. Theory Probab. Appl. 41 367–379.
  • Novikov, D., Colombi, S. and Doré, O. (2006). Skeleton as a probe of the cosmic web: Two-dimensional case. Mon. Not. R. Astron. Soc. 366 1201–1216.
  • Ozertem, U. and Erdogmus, D. (2011). Locally defined principal curves and surfaces. J. Mach. Learn. Res. 12 1249–1286.
  • Pimbblet, K. A., Drinkwater, M. J. and Hawkrigg, M. C. (2004). Inter-cluster filaments of galaxies programme: Abundance and distribution of filaments in the 2dFGRS catalogue. Mon. Not. R. Astron. Soc. 354 L61–L65.
  • Piterbarg, V. and Stamatovich, S. (2001). On maximum of Gaussian non-centered fields indexed on smooth manifolds. In Asymptotic Methods in Probability and Statistics with Applications (St. Petersburg, 1998) (N. Balakrishnan, I. A. Ibragimov, and V. B. Nevzorov, eds.) 189–203. Birkhäuser, Boston, MA.
  • Qiao, W. (2013). On estimation of filamentary structures. Ph.D. thesis, Univ. California, Davis.
  • Qiao, W. and Polonik, W. (2015). Extrema of locally stationary Gaussian fields on growing manifolds. Preprint. Available at arXiv:1510.06833.
  • Qiao, W. and Polonik, W. (2016). Supplement to “Theoretical analysis of nonparametric filament estimation.” DOI:10.1214/15-AOS1405SUPP.
  • Rosenblatt, M. (1976). On the maximal deviation of $k$-dimensional density estimates. Ann. Probab. 4 1009–1015.
  • van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes. Springer, New York.

Supplemental materials

  • Supplement to “Theoretical analysis of nonparametric filament estimation”. Due to page constraints on the main article, this supplement presents the proofs of some technical results in this paper as well as some miscellaneous results (Appendix B) that are used in the proofs.