Response-Adaptive Randomization (RAR) is part of a wider class of data-dependent sampling algorithms, for which clinical trials are typically used as a motivating application. In that context, patient allocation to treatments is determined by randomization probabilities that change based on the accrued response data in order to achieve experimental goals. RAR has received abundant theoretical attention from the biostatistical literature since the 1930s and has been the subject of numerous debates. In the last decade, it has received renewed consideration from the applied and methodological communities, driven by well-known practical examples and its widespread use in machine learning. Papers on the subject present differing views on its usefulness that are not easy to reconcile. This work aims to bridge that gap by providing a broad, balanced and fresh review of methodological and practical issues to consider when debating the use of RAR in clinical trials.
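To make the allocation mechanism concrete, the following is a minimal sketch of one common RAR scheme, a Thompson-sampling-style allocation for a two-arm trial with binary responses; the response rates, prior and sample size are illustrative assumptions, not the specific procedures reviewed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true response rates for the two arms (simulation only).
p_true = [0.3, 0.5]

# Beta(1, 1) priors on each arm's response probability.
successes = np.ones(2)
failures = np.ones(2)

n_patients = 200
allocations = np.zeros(2, dtype=int)

for _ in range(n_patients):
    # Allocate to the arm with the larger posterior draw, so arm k is chosen
    # with probability equal to the current posterior probability that it has
    # the higher response rate; this probability updates as data accrue.
    draws = rng.beta(successes, failures)
    arm = int(np.argmax(draws))

    # Observe a binary response and update the accrued data for that arm.
    response = rng.random() < p_true[arm]
    successes[arm] += response
    failures[arm] += 1 - response
    allocations[arm] += 1

print("Patients per arm:", allocations)
print("Posterior mean response rates:", successes / (successes + failures))
```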
Group sequential Phase III trial designs enable early stopping for positive or negative study outcomes. Response-adaptive randomisation can be included in such designs with the sampling ratio in each group of subjects determined by the current treatment effect estimate. We demonstrate the potential of adaptive randomisation to reduce the number of patients receiving the inferior treatment, even when there is a delay in observing each patient’s response. We also observe that using a fixed but unequal sampling ratio may offer a simpler way to achieve the same objectives.
The paper by Robertson et al. intends to provide “a unified, broad and fresh review of methodological and practical issues to consider” as a contribution to the ongoing debate concerning RAR in clinical trials. Simulations carried out by different authors seem to disprove its usefulness both for statistical inference and as a safeguard for the care of patients in the trial. I argue that the arguments brought forward so far are inconclusive, since the inferential considerations are sometimes incomplete or incorrect, and some of the simulation studies are unconvincing. A Bayesian stance is very common, but often not fully understood.
Robertson et al. provide an elegant, thorough and accessible overview of the rich theoretical background of response-adaptive randomization. Many completed and ongoing clinical trials use these methods with demonstrated success. We provide a summary of multiple real-world examples of response-adaptive randomization and a discussion of themes that arise in planning and executing response-adaptive trials.
Data augmentation improves the convergence of iterative algorithms, such as the EM algorithm and the Gibbs sampler, by introducing carefully designed latent variables. In this article, we first propose a data augmentation scheme for the first-order autoregression plus noise model, where optimal values of the working parameters, introduced for recentering and rescaling the latent states, can be derived analytically by minimizing the fraction of missing information in the EM algorithm. The proposed data augmentation scheme is then utilized to design efficient Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference of some non-Gaussian and nonlinear state space models, via a mixture of normals approximation coupled with a block-specific reparametrization strategy. Applications to simulated and benchmark real data sets indicate that the proposed MCMC sampler can yield improvements in simulation efficiency compared with centering, noncentering and even the ancillarity-sufficiency interweaving strategy.
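As a rough illustration of the kind of recentering and rescaling being referred to (the functional form and the analytically optimal working-parameter values in the paper may differ), consider a generic first-order autoregression plus noise model and a working-parameter transformation of its latent states:

```python
import numpy as np

rng = np.random.default_rng(1)

# Generic first-order autoregression plus noise model (illustration only):
#   state:       x_t = mu + phi * (x_{t-1} - mu) + sigma_eta * eta_t
#   observation: y_t = x_t + sigma_eps * eps_t
T, mu, phi, sigma_eta, sigma_eps = 500, 2.0, 0.95, 0.3, 0.5

x = np.empty(T)
x[0] = mu + sigma_eta / np.sqrt(1 - phi**2) * rng.standard_normal()
for t in range(1, T):
    x[t] = mu + phi * (x[t - 1] - mu) + sigma_eta * rng.standard_normal()
y = x + sigma_eps * rng.standard_normal(T)

# Recentering/rescaling of the latent states with working parameters (a, b):
#   x_tilde_t = (x_t - a * mu) / sigma_eta**b
# a = b = 0 recovers the centered states and a = b = 1 a fully noncentered
# version; intermediate values interpolate between the two parametrizations.
def reparametrize(x, mu, sigma_eta, a, b):
    return (x - a * mu) / sigma_eta**b

x_centered = reparametrize(x, mu, sigma_eta, a=0.0, b=0.0)
x_noncentered = reparametrize(x, mu, sigma_eta, a=1.0, b=1.0)
print(x_centered[:3], x_noncentered[:3])
```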
Gaussian process (GP) regression is computationally expensive in spatial applications involving massive data. Various methods address this limitation, including a small number of Bayesian methods based on distributed computations (or the divide-and-conquer strategy). Focusing on the latter literature, we achieve three main goals. First, we develop an extensible Bayesian framework for distributed spatial GP regression that embeds many popular methods. The proposed framework has three steps: partition the entire data set into many subsets, apply a readily available Bayesian spatial process model in parallel to all the subsets, and combine the posterior distributions estimated on all the subsets into a pseudo posterior distribution that conditions on the entire data set. The combined pseudo posterior distribution replaces the full data posterior distribution in prediction and inference problems. Demonstrating our framework’s generality, we extend posterior computations for (nondistributed) spatial process models with stationary full-rank and nonstationary low-rank GP priors to the distributed setting. Second, we contrast the empirical performance of popular distributed approaches with some widely used, nondistributed alternatives and highlight their relative advantages and shortcomings. Third, we provide theoretical support for our numerical observations and show that the Bayes risks of the combined posterior distributions obtained from a subclass of the divide-and-conquer methods achieve the near-optimal convergence rate in estimating the true spatial surface with various types of covariance functions. Additionally, we provide upper bounds on the number of subsets needed to achieve these near-optimal rates.
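The three-step structure of the framework can be sketched as follows; the kernel, partitioning and combination rule below (a plain average of subset predictive means) are simplified stand-ins for the spatial process models and pseudo posterior combination discussed in the paper.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)

# Synthetic spatial-style data (illustration only).
X = rng.uniform(0, 10, size=(3000, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(3000)

# Step 1: partition the data into K subsets.
K = 10
subsets = np.array_split(rng.permutation(len(X)), K)

# Step 2: fit a GP regression model on each subset (could run in parallel).
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
models = [GaussianProcessRegressor(kernel=kernel).fit(X[idx], y[idx])
          for idx in subsets]

# Step 3: combine the subset results; here a simple average of predictive
# means stands in for the paper's combination of subset posteriors into a
# pseudo posterior distribution.
X_new = np.linspace(0, 10, 200).reshape(-1, 1)
pred = np.mean([m.predict(X_new) for m in models], axis=0)
print(pred[:5])
```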
I. J. (“Jack”) Good was a leading Bayesian statistician for more than half a century after World War II, playing an important role in the post-war Bayesian revival. But his graduate training had been in pure mathematics rather than statistics (one of his doctoral advisors at Cambridge had been the famous G. H. Hardy). What was responsible for this metamorphosis from pure mathematician to applied and theoretical statistician? As Good himself only revealed in 1976, during the war he had initially served as an assistant to Alan Turing at Bletchley Park, working on the cryptanalysis of the German Naval Enigma, and it was from Turing that he acquired his life-long Bayesian philosophy. Declassified and other documents now permit us to understand in some detail how this came about, and indeed how many of the ideas Good explored and papers he wrote in the initial decades after the war in fact gave, in sanitized form, results that had their origins in his wartime work. Drawing on these sources, this paper discusses the daily and very real use that Turing and Good made of Bayesian methods, and how this was gradually revealed by Good over the course of his life (including his return to classified work in the 1950s).
In cancer research, clustering techniques are widely used for exploratory analyses, playing a critical role in the identification of novel cancer subtypes and in patient management. As the amount of data collected by multiple research groups grows, it is increasingly feasible to investigate the replicability of clustering procedures, that is, their ability to consistently recover biologically meaningful clusters across several data sets. In this paper, we review methods for assessing the replicability of clustering analyses and discuss a novel framework for evaluating cross-study clustering replicability, useful when two or more studies are available. Our approach can be applied to any clustering algorithm and can employ different measures of similarity between partitions to quantify replicability, globally (i.e., for the whole sample) as well as locally (i.e., for individual clusters). Using experiments on synthetic and real gene expression data, we illustrate the usefulness of our procedure for evaluating whether the same clusters are identified consistently across a collection of data sets.
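As a simplified illustration of cross-study replicability (the clustering algorithm, transfer rule and similarity measure below are illustrative choices, not necessarily those of the proposed framework), one can cluster two studies separately, transfer one partition to the other study, and quantify agreement with a measure such as the adjusted Rand index:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)

# Two synthetic "studies" drawn from the same three-group structure
# (stand-ins for gene expression data sets).
def make_study(n):
    centers = np.array([[0, 0], [4, 0], [0, 4]])
    labels = rng.integers(0, 3, size=n)
    return centers[labels] + rng.standard_normal((n, 2))

X_a, X_b = make_study(300), make_study(300)

# Cluster each study separately with the same algorithm.
labels_a = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_a)
labels_b = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_b)

# Transfer study A's partition to study B (here via a nearest-neighbour
# classifier) and compare with study B's own partition. A high adjusted Rand
# index suggests the clustering replicates across studies; per-cluster
# agreement could be examined analogously for a local assessment.
transferred = KNeighborsClassifier(n_neighbors=5).fit(X_a, labels_a).predict(X_b)
print("Global replicability (ARI):", adjusted_rand_score(labels_b, transferred))
```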
Using the concept of principal stratification from the causal inference literature, we introduce a new notion of fairness, called principal fairness, for human and algorithmic decision-making. Principal fairness states that one should not discriminate among individuals who would be similarly affected by the decision. Unlike the existing statistical definitions of fairness, principal fairness explicitly accounts for the fact that individuals can be impacted by the decision. This causal fairness formulation also enables online or post hoc fairness evaluation and policy learning. We also explain how principal fairness relates to the existing causality-based fairness criteria. In contrast to the counterfactual fairness criteria, for example, principal fairness considers the effects of the decision in question rather than those of the protected attributes of interest. Finally, we discuss how to conduct empirical evaluation and policy learning under the proposed principal fairness criterion.
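In symbols (notation ours and possibly differing from the paper's), with D the decision, A the protected attribute, and R = (Y(0), Y(1)) the principal stratum defined by the joint potential outcomes under each decision, principal fairness can be read as requiring the decision to be independent of the protected attribute within each stratum:

```latex
% Principal fairness (sketch): the decision D is conditionally independent of
% the protected attribute A given the principal stratum R = (Y(0), Y(1)).
\[
  D \perp\!\!\!\perp A \mid R,
  \qquad\text{equivalently}\qquad
  P(D = d \mid R = r, A = a) = P(D = d \mid R = r)
  \quad\text{for all } d,\, r,\, a .
\]
```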
We discuss systematically two versions of confidence regions: those based on p-values and those based on e-values, a recent alternative to p-values. Both versions can be applied to multiple hypothesis testing, and in this paper we are interested in procedures that control the number of false discoveries under arbitrary dependence between the base p- or e-values. We introduce a procedure that is based on e-values and show, using simulated and real-world data sets, that it is efficient both computationally and statistically. Comparison with the corresponding standard procedure based on p-values is not straightforward, but there are indications that the new one performs significantly better in some situations.
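As a generic illustration of how e-values yield confidence regions (a toy example, not the procedure proposed in the paper): an e-variable has expectation at most 1 under its null hypothesis, so by Markov's inequality the set of parameter values whose e-value stays below 1/alpha forms a (1 - alpha) confidence region.

```python
import numpy as np

rng = np.random.default_rng(4)

# Data: n observations from N(mu, 1) with unknown mean mu (toy example).
mu_true, n, alpha = 1.0, 50, 0.05
x = mu_true + rng.standard_normal(n)

lam = 0.5
def e_value(theta):
    # Average of two likelihood-ratio e-values for the point null mu = theta
    # (N(theta + lam, 1) and N(theta - lam, 1) against N(theta, 1)); each has
    # expectation 1 under the null, and an average of e-values is an e-value.
    return 0.5 * (np.exp( lam * np.sum(x - theta) - n * lam**2 / 2)
                + np.exp(-lam * np.sum(x - theta) - n * lam**2 / 2))

# By Markov's inequality, P(E >= 1/alpha) <= alpha under the null, so the set
# of theta with e_value(theta) < 1/alpha is a (1 - alpha) confidence region.
grid = np.linspace(0, 2, 2001)
region = grid[np.array([e_value(t) for t in grid]) < 1 / alpha]
print("e-value confidence region: [%.3f, %.3f]" % (region.min(), region.max()))
```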
Gábor J. Székely was born in Budapest, Hungary on February 4, 1947. He graduated from Eötvös Loránd University (ELTE) with an M.S. degree in 1970, and a Ph.D. degree in 1971. He received his Candidate Degree from the Hungarian Academy of Sciences in 1976, and the Doctor of Science Degree (D.Sc.) from the Hungarian Academy of Sciences in 1986. Székely joined the Department of Probability Theory of ELTE in 1970. In 1989, he became the founding chair of the Department of Stochastics of the Budapest Institute of Technology (Technical University of Budapest). In 1995, he moved to the United States as a tenured full professor at Bowling Green State University (BGSU) in Bowling Green, Ohio. Before that, in 1990–1991, he was the first Lukacs Distinguished Professor at BGSU. Székely held several visiting positions, for example, at the University of Amsterdam in 1977 and at Yale University in 1989. Between 2006 and 2022, he served as a Program Director in the Statistics Program of the Division of Mathematical Sciences at the U.S. National Science Foundation. Székely has about 250 publications, including 6 books in several languages. In 1988, he received the Rollo Davidson Prize from Cambridge University, jointly with Imre Z. Ruzsa, for their work on algebraic probability theory. In 2010, Székely became an Elected Fellow of the Institute of Mathematical Statistics, mostly for his work on physics concepts in statistics such as energy statistics and distance correlation. He had the good fortune to know and work with world-class mathematicians and statisticians, including (in chronological order of their first meetings): P. Erdős, A. Rényi, Y. Linnik, B. de Finetti, A. N. Kolmogorov, H. Robbins, G. Pólya, L. Shepp, G. Wahba, C. R. Rao, B. Efron, P. Bickel and E. Seneta.