Reconciling design-based and model-based causal inferences for split-plot experiments

Anqi Zhao; Peng Ding

doi:10.1214/21-AOS2144

Abstract

The split-plot design arose from agricultural science with experimental units, also known as the subplots, nested within groups known as the whole plots. It assigns different interventions at the whole-plot and subplot levels, respectively, providing a convenient way to accommodate hard-to-change factors. By design, subplots within the same whole plot receive the same level of the whole-plot intervention, and thereby induce a group structure on the final treatment assignments. A common strategy is to run an ordinary least squares (OLS) regression of the outcome on the treatment indicators coupled with the robust standard errors clustered at the whole-plot level. It does not give consistent estimators for the treatment effects of interest when the whole-plot sizes vary. Another common strategy is to fit a linear mixed-effects model of the outcome with normal random effects and errors. It is a purely model-based approach and can be sensitive to violations of the parametric assumptions. In contrast, design-based inference assumes no outcome models and relies solely on the controllable randomization mechanism determined by the physical experiment. We first extend the existing design-based inference based on the Horvitz–Thompson estimator to the Hajek estimator, and establish the finite-population central limit theorem for both under split-plot randomization. We then reconcile the results with those under the model-based approach, and propose two regression strategies, namely (i) the weighted least squares (WLS) fit of the unit-level data based on the inverse probability weighting and (ii) the OLS fit of the aggregate data based on whole-plot total outcomes, to reproduce the Hajek and Horvitz–Thompson estimators, respectively. This, together with the asymptotic conservativeness of the corresponding cluster-robust covariances for estimating the true design-based covariances as we establish in the process, justifies the validity of the regression estimators for design-based inference. In light of the flexibility of regression formulation for covariate adjustment, we further extend the theory to the case with covariates, and demonstrate the efficiency gain by regression-based covariate adjustment via both asymptotic theory and simulation. Importantly, all our theories are either numeric or design-based, and hold regardless of how well the regression equations represent the true data generating process.
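
The numeric link between the inverse-probability-weighted WLS fit and the Hajek estimator can be checked with a small sketch. Everything below is an illustrative assumption rather than the authors' code: a 2-by-2 split-plot layout with made-up whole-plot sizes, complete randomization at both stages (so that a unit's assignment probability is the product of its whole-plot-stage and subplot-stage probabilities), and a synthetic outcome. The sketch only illustrates that WLS on the four treatment-cell indicators, with weights equal to the inverse assignment probabilities, returns the same numbers as the weighted cell means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design (assumption): six whole plots of unequal sizes.
sizes = [4, 6, 8, 4, 10, 6]
W = len(sizes)
W1 = 3  # number of whole plots assigned level 1 of the whole-plot factor A

# Whole-plot stage: completely randomize factor A across whole plots.
a_plot = np.zeros(W, dtype=int)
a_plot[rng.choice(W, size=W1, replace=False)] = 1

A, B, prob, plot_id = [], [], [], []
for w, M in enumerate(sizes):
    # Subplot stage: within each whole plot, M1 subplots get level 1 of B.
    M1 = M // 2
    b = np.zeros(M, dtype=int)
    b[rng.choice(M, size=M1, replace=False)] = 1
    p_a = W1 / W if a_plot[w] == 1 else 1 - W1 / W
    for s in range(M):
        q_b = M1 / M if b[s] == 1 else 1 - M1 / M
        A.append(a_plot[w])
        B.append(b[s])
        prob.append(p_a * q_b)  # assumed assignment probability of the unit
        plot_id.append(w)

A, B, prob, plot_id = map(np.asarray, (A, B, prob, plot_id))
N = len(A)

# Synthetic outcome (assumption): additive effects, plot effect, noise.
plot_effect = rng.normal(size=W)
y = 1.0 + 0.5 * A + 0.3 * B + plot_effect[plot_id] + rng.normal(size=N)

# Treatment cells 0..3 encode (A, B) = (0,0), (0,1), (1,0), (1,1).
cell = 2 * A + B
X = np.eye(4)[cell]   # one indicator column per treatment cell
wts = 1.0 / prob      # inverse probability weights

# WLS fit: solve (X' D X) beta = X' D y with D = diag(wts).
XtD = X.T * wts
beta_wls = np.linalg.solve(XtD @ X, XtD @ y)

# Hajek estimator: weighted mean of observed outcomes within each cell.
hajek = np.array([
    np.sum(wts[cell == z] * y[cell == z]) / np.sum(wts[cell == z])
    for z in range(4)
])

# Horvitz-Thompson estimator for comparison: same numerator, divided by N.
ht = np.array([np.sum(wts[cell == z] * y[cell == z]) / N for z in range(4)])

assert np.allclose(beta_wls, hajek)
```

Because the design matrix contains only cell indicators, the weighted normal equations decouple by cell, which is why the WLS coefficients coincide with the weighted cell means. The Horvitz–Thompson values generally differ from these when the realized weights within a cell do not sum to the total number of units.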

Funding Statement

Zhao was funded by the Start-Up grant R-155-000-216-133 from the National University of Singapore. Ding was partially funded by the U.S. National Science Foundation (Grant 1945136).

Acknowledgments

We thank the Associate Editor, two referees, Zhichao Jiang and Rahul Mukerjee for most insightful comments.

Citation

Anqi Zhao, Peng Ding. "Reconciling design-based and model-based causal inferences for split-plot experiments." Ann. Statist. 50 (2) 1170-1192, April 2022. https://doi.org/10.1214/21-AOS2144

Information

Received: 1 April 2021; Revised: 1 October 2021; Published: April 2022

First available in Project Euclid: 7 April 2022

MathSciNet: MR4404932

zbMATH: 1486.62230

Digital Object Identifier: 10.1214/21-AOS2144

Subjects:

Primary: 62G05, 62J05, 62K15

Keywords: cluster randomization, cluster-robust standard error, covariate adjustment, inverse probability weighting, potential outcome, randomization inference
