Open Access
September 2012
Bootstrapping data arrays of arbitrary order
Art B. Owen, Dean Eckles
Ann. Appl. Stat. 6(3): 895-927 (September 2012). DOI: 10.1214/12-AOAS547


In this paper we study a bootstrap strategy for estimating the variance of a mean taken over large multifactor crossed random effects data sets. We apply bootstrap reweighting independently to the levels of each factor, giving each observation the product of independently sampled factor weights. No exact bootstrap exists for this problem [McCullagh (2000) Bernoulli 6 285–301]. We show that the proposed bootstrap is mildly conservative, meaning biased toward overestimating the variance, under sufficient conditions that allow very unbalanced and heteroscedastic inputs. Earlier results for a resampling bootstrap only apply to two factors and use multinomial weights that are poorly suited to online computation. The proposed reweighting approach can be implemented in parallel and online settings. The results for this method apply to any number of factors. The method is illustrated using a three-factor data set of comment lengths from Facebook.
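The reweighting scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `product_weight_bootstrap`, the input layout (one tuple of level identifiers per observation), and the choice of "double-or-nothing" weights drawn uniformly from {0, 2} are all assumptions made here. The abstract only states that weights are sampled independently per factor level and multiplied.

```python
import random
from statistics import variance


def product_weight_bootstrap(values, levels, n_boot=200, seed=0):
    """Bootstrap variance of a mean via per-factor product weights (sketch).

    values : list of floats, one per observation.
    levels : list of tuples; levels[i][j] is the level of factor j for
             observation i (hypothetical input layout).
    Returns (variance estimate, list of replicate means).
    """
    rng = random.Random(seed)
    n_factors = len(levels[0])
    means = []
    for _ in range(n_boot):
        # One independent weight table per factor, filled lazily so that
        # every observation sharing a level sees the same weight.
        tables = [{} for _ in range(n_factors)]
        total_w = 0.0
        total_wx = 0.0
        for x, obs_levels in zip(values, levels):
            w = 1.0
            for table, level in zip(tables, obs_levels):
                if level not in table:
                    # Assumed weight law: 0 or 2 with equal probability
                    # (mean 1, variance 1).
                    table[level] = rng.choice((0.0, 2.0))
                w *= table[level]
            total_w += w
            total_wx += w * x
        # Skip replicates in which every observation received weight zero.
        if total_w > 0:
            means.append(total_wx / total_w)
    var_hat = variance(means) if len(means) >= 2 else float("nan")
    return var_hat, means


# Toy three-factor example with made-up numbers.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
levels = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (2, 0, 0), (2, 1, 1)]
var_hat, means = product_weight_bootstrap(values, levels, n_boot=200, seed=1)
```

Because each observation's weight depends only on the weights of its own factor levels, a single pass over the data suffices and no multinomial counts over the whole data set are needed, which is the property the abstract ties to online and parallel use.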



Art B. Owen, Dean Eckles. "Bootstrapping data arrays of arbitrary order." Ann. Appl. Stat. 6 (3) 895-927, September 2012.


Published: September 2012
First available in Project Euclid: 31 August 2012

zbMATH: 06096515
MathSciNet: MR3012514
Digital Object Identifier: 10.1214/12-AOAS547

Keywords: Bayesian pigeonhole bootstrap, online bagging, online bootstrap, relational data, tensor data, unbalanced random effects

Rights: Copyright © 2012 Institute of Mathematical Statistics

Vol. 6 • No. 3 • September 2012