Open Access
An Optimal Variable Cell Histogram Based on the Sample Spacings
Yuichiro Kanazawa
Ann. Statist. 20(1): 291-304 (March, 1992). DOI: 10.1214/aos/1176348523


Suppose we wish to construct a variable $k$-cell histogram based on an independent, identically distributed sample of size $n - 1$ from an unknown density $f$ on an interval of finite length. A variable cell histogram requires the cutpoints and heights of all of its cells to be specified. We propose the following procedure: (i) choose from the order statistics corresponding to the sample a set of $k + 1$ cutpoints that maximizes a criterion, a function of the sample spacings; (ii) compute the heights of the $k$ cells according to a formula. The resulting histogram estimates a $k$-cell theoretical histogram that stays constant within each cell and that minimizes the Hellinger distance to the density $f$. The histogram tends to estimate low-density regions accurately and is easy to compute. We find that a number of cells of order $n^{1/3}$ minimizes the mean Hellinger distance between the density $f$ and a class of histograms whose cutpoints are chosen from the order statistics.
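
The two-step procedure can be illustrated with a short Python sketch. The abstract states neither the criterion nor the height formula, so the ones used here are stand-ins and not the paper's. They come from an assumed plug-in argument: for a fixed cell of width $w$, the constant height minimizing $\int (\sqrt{f} - \sqrt{g})^2$ is $(a/w)^2$ with $a = \int_{\text{cell}} \sqrt{f}$, and the best cutpoints maximize $\sum a^2 / w$; replacing $f$ by $1/(nD)$ on each spacing $D$ gives $a \approx \sum \sqrt{D} / \sqrt{n}$. The function name `variable_cell_histogram`, the inclusion of the interval endpoints among the candidate cutpoints, and the dynamic-programming search are likewise assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def variable_cell_histogram(points: Sequence[float], k: int) -> Tuple[List[float], List[float]]:
    """Hypothetical sketch: pick k+1 cutpoints among `points` by dynamic programming.

    `points` holds the candidate cutpoints: the two interval endpoints plus the
    n - 1 order statistics, so there are n spacings (an assumption).
    The score and height formulas are plug-in stand-ins, not the paper's.
    """
    xs = sorted(points)
    n = len(xs) - 1  # number of spacings
    if k < 1 or k > n:
        raise ValueError("k must satisfy 1 <= k <= number of spacings")
    if any(xs[i + 1] <= xs[i] for i in range(n)):
        raise ValueError("candidate cutpoints must be distinct")

    # root[j] = sum of sqrt(spacing) over the first j spacings.
    root = [0.0]
    for i in range(n):
        root.append(root[-1] + math.sqrt(xs[i + 1] - xs[i]))

    def score(i: int, j: int) -> float:
        # Assumed criterion for the cell [xs[i], xs[j]]: (sum sqrt(D))^2 / (n * width).
        a = root[j] - root[i]
        return a * a / (n * (xs[j] - xs[i]))

    neg_inf = float("-inf")
    # best[c][j]: largest total score using c cells to cover [xs[0], xs[j]].
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                if best[c - 1][i] == neg_inf:
                    continue
                value = best[c - 1][i] + score(i, j)
                if value > best[c][j]:
                    best[c][j] = value
                    back[c][j] = i

    # Step (i): recover the indices of the chosen cutpoints.
    idx = [n]
    for c in range(k, 0, -1):
        idx.append(back[c][idx[-1]])
    idx.reverse()
    cuts = [xs[i] for i in idx]

    # Step (ii): assumed plug-in height (sum sqrt(D))^2 / (n * width^2) per cell.
    heights = [
        (root[j] - root[i]) ** 2 / (n * (xs[j] - xs[i]) ** 2)
        for i, j in zip(idx, idx[1:])
    ]
    return cuts, heights
```

For example, `variable_cell_histogram([0.0, 0.1, 0.2, 0.3, 0.4, 1.0], 2)` places the interior cutpoint at 0.4, separating the four short spacings from the single long one. The search costs on the order of $k n^2$ score evaluations, and under these assumed formulas the heights are not forced to integrate to one.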



Yuichiro Kanazawa. "An Optimal Variable Cell Histogram Based on the Sample Spacings." Ann. Statist. 20 (1) 291 - 304, March, 1992.


Published: March, 1992
First available in Project Euclid: 12 April 2007

zbMATH: 0745.62034
MathSciNet: MR1150345
Digital Object Identifier: 10.1214/aos/1176348523

Primary: 62G05
Secondary: 62E20

Keywords: Density estimation, Hellinger distance, Histogram, order statistics, spacing

Rights: Copyright © 1992 Institute of Mathematical Statistics

Vol. 20 • No. 1 • March, 1992