June 2024
Selecting invalid instruments to improve Mendelian randomization with two-sample summary data
Ashish Patel, Francis J. DiTraglia, Verena Zuber, Stephen Burgess
Ann. Appl. Stat. 18(2): 1729-1749 (June 2024). DOI: 10.1214/23-AOAS1856


Mendelian randomization (MR) is a widely used method to estimate the causal relationship between a risk factor and disease. A fundamental part of any MR analysis is to choose appropriate genetic variants as instrumental variables. Genome-wide association studies often reveal that hundreds of genetic variants may be robustly associated with a risk factor, but in some situations investigators may have greater confidence in the instrument validity of only a smaller subset of variants. Nevertheless, the use of additional instruments may be optimal from the perspective of mean squared error, even if they are slightly invalid; a small bias in estimation may be a price worth paying for a larger reduction in variance. For this purpose, we consider a method for “focused” instrument selection whereby genetic variants are selected to minimise the estimated asymptotic mean squared error of causal effect estimates. In a setting of many weak and locally invalid instruments, we propose a novel strategy to construct confidence intervals for postselection focused estimators that guards against the worst-case loss in asymptotic coverage. In empirical applications that (i) validate lipid drug targets and (ii) investigate vitamin D effects on a wide range of outcomes, our findings suggest that the optimal selection of instruments involves not only a small number of biologically justified instruments but also many potentially invalid instruments.
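
The bias-variance trade-off behind this argument can be written out using the standard decomposition of mean squared error. The notation below is illustrative and not taken from the paper: \theta_0 is the true causal effect, \hat\theta_V is an estimator that uses only the smaller set of trusted instruments (assumed unbiased here for simplicity), and \hat\theta_S is an estimator that uses a larger set S with bias b_S.

```latex
% Standard decomposition of mean squared error into squared bias and variance
\mathrm{MSE}(\hat\theta)
  = \mathbb{E}\big[(\hat\theta - \theta_0)^2\big]
  = \big(\mathbb{E}[\hat\theta] - \theta_0\big)^2 + \mathrm{Var}(\hat\theta)

% Under the illustrative assumptions above, the larger instrument set S
% is preferred in terms of MSE whenever
b_S^2 + \mathrm{Var}(\hat\theta_S) < \mathrm{Var}(\hat\theta_V)
```

In words, slightly invalid instruments are worth including whenever the squared bias they introduce is smaller than the variance they remove.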

Funding Statement

SB was supported by a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (204623/Z/16/Z).
VZ was supported by the United Kingdom Research and Innovation Medical Research Council (MR/W029790/1).
This research was funded by the United Kingdom Research and Innovation Medical Research Council (MC-UU-00002/7) and supported by the National Institute for Health Research Cambridge Biomedical Research Centre: BRC-1215-20014.


We thank participants at the 2022 International Society for Clinical Biostatistics conference, participants at the 2023 Siena Workshop on Econometric Theory and Applications, and Dipender Gill for helpful discussions. We thank an anonymous referee for detailed comments.


Ashish Patel, Francis J. DiTraglia, Verena Zuber, Stephen Burgess. "Selecting invalid instruments to improve Mendelian randomization with two-sample summary data." Ann. Appl. Stat. 18(2): 1729-1749, June 2024. https://doi.org/10.1214/23-AOAS1856


Received: 1 April 2023; Revised: 1 September 2023; Published: June 2024
First available in Project Euclid: 5 April 2024

Digital Object Identifier: 10.1214/23-AOAS1856

Keywords: focused information criterion, Mendelian randomization, postselection inference

Rights: This research was funded, in whole or in part, by Wellcome Trust, 204623/Z/16/Z. A CC BY 4.0 license is applied to this article arising from this submission, in accordance with the grant’s open access conditions.

