Open Access
October 2015
Fully adaptive density-based clustering
Ingo Steinwart
Ann. Statist. 43(5): 2132-2167 (October 2015). DOI: 10.1214/15-AOS1331

Abstract

The clusters of a distribution are often defined by the connected components of a density level set. However, this definition depends on the user-specified level. We address this issue by proposing a simple, generic algorithm, which uses an almost arbitrary level set estimator to estimate the smallest level at which there is more than one connected component. In the case where this algorithm is fed with histogram-based level set estimates, we provide a finite sample analysis, which is then used to show that the algorithm consistently estimates both the smallest level and the corresponding connected components. We further establish rates of convergence for the two estimation problems, and last but not least, we present a simple, yet adaptive strategy for determining the width-parameter of the involved density estimator in a data-dependent way.
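
To illustrate the idea of a smallest splitting level, the following is a minimal one-dimensional sketch. It is not the algorithm analyzed in the paper: it omits the safeguards against spurious components that a finite sample analysis would need, and it assumes a fixed bin width instead of a data-dependent choice. The function names (`histogram_density`, `count_components`, `smallest_split_level`) and the grid of candidate levels are hypothetical choices made for illustration only.

```python
import numpy as np


def histogram_density(samples, low, high, n_bins):
    """Histogram density estimate on [low, high] with n_bins equal-width cells."""
    counts, edges = np.histogram(samples, bins=n_bins, range=(low, high))
    width = (high - low) / n_bins
    return counts / (len(samples) * width), edges


def count_components(mask):
    """Count maximal runs of adjacent True cells (connected components in 1D)."""
    padded = np.concatenate(([False], mask))
    # A component starts wherever a False cell is followed by a True cell.
    return int(np.sum(~padded[:-1] & padded[1:]))


def smallest_split_level(density, levels):
    """Return the smallest candidate level whose level set
    {cells with density >= level} has more than one component,
    or None if no candidate level splits the estimate."""
    for level in np.sort(levels):
        if count_components(density >= level) > 1:
            return float(level)
    return None


# Illustrative per-cell density values with two modes separated by a dip.
density = np.array([0.0, 0.5, 0.9, 0.25, 0.8, 0.45, 0.0])
split = smallest_split_level(density, [0.1, 0.2, 0.3, 0.4])
```

In practice `density` would come from `histogram_density` applied to a sample, and the components of the level set at the returned level would serve as the cluster estimates. How finely to space the candidate levels and how wide to make the cells are exactly the tuning questions the abstract refers to.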

Citation

Ingo Steinwart. "Fully adaptive density-based clustering." Ann. Statist. 43 (5) 2132 - 2167, October 2015. https://doi.org/10.1214/15-AOS1331

Information

Received: 1 March 2015; Revised: 1 March 2015; Published: October 2015
First available in Project Euclid: 16 September 2015

zbMATH: 1327.62382
MathSciNet: MR3396981
Digital Object Identifier: 10.1214/15-AOS1331

Subjects:
Primary: 62H30, 91C20
Secondary: 62G07

Keywords: Adaptivity, cluster analysis, consistency, rates

Rights: Copyright © 2015 Institute of Mathematical Statistics
