Open Access
2019 Quantile-based clustering
Christian Hennig, Cinzia Viroli, Laura Anderlucci
Electron. J. Statist. 13(2): 4849-4883 (2019). DOI: 10.1214/19-EJS1640

Abstract

A new cluster analysis method, $K$-quantiles clustering, is introduced. $K$-quantiles clustering can be computed by a simple greedy algorithm in the style of the classical Lloyd’s algorithm for $K$-means. It can be applied to large and high-dimensional datasets. It allows for within-cluster skewness and internal variable scaling based on within-cluster variation. Different versions allow for different levels of parsimony and computational efficiency. Although $K$-quantiles clustering is conceived as nonparametric, it can be connected to a fixed partition model of generalized asymmetric Laplace distributions. The consistency of $K$-quantiles clustering is proved, and it is shown that $K$-quantiles clusters correspond to well-separated mixture components in a nonparametric mixture. In a simulation study, $K$-quantiles clustering compares well with a number of popular clustering methods. A high-dimensional microarray dataset is clustered by $K$-quantiles.
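
The Lloyd-style iteration mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single fixed quantile level `theta` shared by all variables, uses the standard check (pinball) loss from quantile regression as a stand-in for the quantile discrepancy, and omits the variable scaling and the different versions that the abstract refers to. The names `check_loss` and `k_quantiles_sketch`, the random initialization, and the stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np

def check_loss(x, q, theta):
    """Check (pinball) loss between x and q, summed over the last axis.

    Assumption: used here as a simple stand-in for a quantile discrepancy.
    """
    d = x - q
    return np.sum(np.where(d >= 0, theta * d, (theta - 1.0) * d), axis=-1)

def k_quantiles_sketch(X, k, theta=0.5, n_iter=50, seed=0):
    """Lloyd-style alternation with a fixed quantile level (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: initialize centers with k distinct observations.
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: each point goes to the center with the smallest loss.
        loss = check_loss(X[:, None, :], centers[None, :, :], theta)  # shape (n, k)
        new_labels = loss.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition unchanged, so the iteration has converged
        labels = new_labels
        # Update step: per-variable empirical theta-quantile within each cluster.
        for c in range(k):
            members = X[labels == c]
            m = len(members)
            if m > 0:
                # Order statistic ceil(m * theta) minimizes the summed check loss.
                idx = max(int(np.ceil(theta * m)) - 1, 0)
                centers[c] = np.sort(members, axis=0)[idx]
    return labels, centers

# Small usage example on two clearly separated groups of points.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
labels_demo, centers_demo = k_quantiles_sketch(X_demo, k=2, theta=0.5)
```

With `theta=0.5` the loss is half the L1 distance and the update is a within-cluster median, so the sketch reduces to a $K$-medians-type procedure; other values of `theta` move each center toward a lower or upper within-cluster quantile, which is one way to accommodate skewed clusters.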

Citation

Christian Hennig, Cinzia Viroli, Laura Anderlucci. "Quantile-based clustering." Electron. J. Statist. 13 (2) 4849-4883, 2019. https://doi.org/10.1214/19-EJS1640

Information

Received: 1 April 2019; Published: 2019
First available in Project Euclid: 4 December 2019

zbMATH: 07147366
MathSciNet: MR4038727
Digital Object Identifier: 10.1214/19-EJS1640

Keywords: fixed partition model, high dimensional clustering, nonparametric mixture, quantile discrepancy

Vol. 13 • No. 2 • 2019