Open Access
December 2009
Using the bootstrap to quantify the authority of an empirical ranking
Peter Hall, Hugh Miller
Ann. Statist. 37(6B): 3929-3959 (December 2009). DOI: 10.1214/09-AOS699


The bootstrap is a popular and convenient method for quantifying the authority of an empirical ordering of attributes, for example of a ranking of the performance of institutions or of the influence of genes on a response variable. In the first of these examples, the number, p, of quantities being ordered is sometimes only moderate in size; in the second it can be very large, often much greater than sample size. However, we show that in both types of problem the conventional bootstrap can produce inconsistency. Moreover, the standard n-out-of-n bootstrap estimator of the distribution of an empirical rank may not converge in the usual sense; the estimator may converge in distribution, but not in probability. Nevertheless, in many cases the bootstrap correctly identifies the support of the asymptotic distribution of ranks. In some contemporary problems, bootstrap prediction intervals for ranks are particularly long, and in this context, we also quantify the accuracy of bootstrap methods, showing that the standard bootstrap gets the order of magnitude of the interval right, but not the constant multiplier of interval length. The m-out-of-n bootstrap can improve performance and produce statistical consistency, but it requires empirical choice of m; we suggest a tuning solution to this problem. We show that in genomic examples, where it might be expected that the standard, “synchronous” bootstrap will successfully accommodate nonindependence of vector components, that approach can produce misleading results. An “independent component” bootstrap can overcome these difficulties, even in cases where components are not strictly independent.
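The three resampling schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes that attributes are ranked by their sample means (the abstract does not fix a statistic), that `m` is supplied by hand rather than chosen by the tuning method the paper proposes, and that a simple percentile rule gives the interval for a rank. All function names (`ranks_of_means`, `bootstrap_ranks`, `rank_interval`) and the simulated data are hypothetical.

```python
import math
import random
from statistics import mean


def ranks_of_means(rows):
    """Rank the p columns by sample mean (rank 1 = largest mean)."""
    p = len(rows[0])
    means = [mean(row[j] for row in rows) for j in range(p)]
    # Ties are broken by column index; a real analysis may need more care.
    order = sorted(range(p), key=lambda j: (-means[j], j))
    ranks = [0] * p
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def bootstrap_ranks(rows, m=None, independent=False, n_boot=500, seed=0):
    """Return n_boot bootstrap rank vectors.

    m=None             -> standard n-out-of-n resampling.
    m < n              -> m-out-of-n resampling (m is an assumed, fixed choice).
    independent=False  -> "synchronous": whole data vectors are resampled.
    independent=True   -> "independent component": each column is resampled
                          on its own, ignoring dependence between columns.
    """
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    m = n if m is None else m
    draws = []
    for _ in range(n_boot):
        if independent:
            # A separate random row index is drawn for every cell.
            sample = [[rows[rng.randrange(n)][j] for j in range(p)]
                      for _i in range(m)]
        else:
            # One random row index per resampled vector.
            sample = [rows[rng.randrange(n)] for _i in range(m)]
        draws.append(ranks_of_means(sample))
    return draws


def rank_interval(draws, j, level=0.9):
    """Percentile-style interval for the rank of column j."""
    values = sorted(d[j] for d in draws)
    last = len(values) - 1
    lo = values[math.floor((1 - level) / 2 * last)]
    hi = values[math.ceil((1 + level) / 2 * last)]
    return lo, hi


# Simulated example: three nearly tied columns and one clear leader.
rng = random.Random(1)
true_means = [0.0, 0.1, 0.2, 5.0]
data = [[rng.gauss(mu, 1.0) for mu in true_means] for _ in range(40)]

schemes = [
    ("n-out-of-n, synchronous", {}),
    ("m-out-of-n, synchronous", {"m": 15}),
    ("n-out-of-n, independent component", {"independent": True}),
]
for label, kwargs in schemes:
    draws = bootstrap_ranks(data, n_boot=200, **kwargs)
    print(label, [rank_interval(draws, j) for j in range(len(true_means))])
```

In a toy setting like this one, the intervals for the nearly tied columns would be expected to be wide while the clear leader keeps rank 1. The sketch only shows the mechanics of the three schemes; it does not reproduce the consistency results or the choice of `m` discussed in the paper.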



Peter Hall, Hugh Miller. "Using the bootstrap to quantify the authority of an empirical ranking." Ann. Statist. 37(6B): 3929-3959, December 2009.


Published: December 2009
First available in Project Euclid: 23 October 2009

zbMATH: 1191.62080
MathSciNet: MR2572448
Digital Object Identifier: 10.1214/09-AOS699

Primary: 62G09
Secondary: 62G30

Keywords: Confidence interval, genomic data, high dimension, independent-component bootstrap, m-out-of-n bootstrap, ordering, overlap interval, prediction interval, synchronous bootstrap

Rights: Copyright © 2009 Institute of Mathematical Statistics
