In recent years, a variety of extensions and refinements have been developed for data-augmentation-based model-fitting routines. These developments aim to broaden the applicability, improve the speed and/or simplify the implementation of data augmentation methods, such as the deterministic EM algorithm for mode finding, and the stochastic Gibbs sampler and other auxiliary-variable-based methods for posterior sampling. In this overview article we graphically illustrate and compare a number of these extensions, all of which aim to maintain the simplicity and computational stability of their predecessors. We particularly emphasize the usefulness of identifying similarities between the deterministic and stochastic counterparts as we seek more efficient computational strategies. We also demonstrate the applicability of data augmentation methods for handling complex models with highly hierarchical structure, using a high-energy, high-resolution spectral imaging model for data from satellite telescopes, such as the Chandra X-ray Observatory.
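To make the parallel between the deterministic and stochastic versions of data augmentation concrete, here is a minimal sketch using the classic genetic-linkage counts (a textbook data augmentation example, not the spectral imaging model of the article): the same augmented-data structure drives both an EM iteration and a two-step Gibbs sampler.

```python
import numpy as np

# Classic genetic-linkage counts with cell probabilities
# (1/2 + t/4, (1-t)/4, (1-t)/4, t/4); the first cell is "completed" by a latent
# count z ~ Binomial(y[0], (t/4) / (1/2 + t/4)).
y = np.array([125, 18, 20, 34])

def em(theta=0.5, iters=50):
    """Deterministic data augmentation: impute E[z], then maximize."""
    for _ in range(iters):
        z = y[0] * (theta / 4) / (0.5 + theta / 4)            # E-step
        theta = (z + y[3]) / (z + y[1] + y[2] + y[3])          # M-step (complete-data MLE)
    return theta

def gibbs(theta=0.5, iters=5000, seed=0):
    """Stochastic counterpart: draw z, then draw theta from its conditional."""
    rng = np.random.default_rng(seed)
    draws = np.empty(iters)
    for i in range(iters):
        z = rng.binomial(y[0], (theta / 4) / (0.5 + theta / 4))  # imputation step
        theta = rng.beta(z + y[3] + 1, y[1] + y[2] + 1)          # posterior step (flat prior)
        draws[i] = theta
    return draws

print("EM mode        :", round(em(), 4))            # approx. 0.6268
print("Gibbs post. mean:", round(gibbs().mean(), 4))  # close to the EM mode
```

The only difference between the two routines is whether the latent count is replaced by its conditional expectation or by a random draw, which is precisely the kind of structural similarity between deterministic and stochastic counterparts that the article exploits.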
The popularity of the EM algorithm owes much to the 1977 paper by Dempster, Laird and Rubin. That paper gave the algorithm its name, identified the general form and some key properties of the algorithm and established its broad applicability in scientific research. This review gives a nontechnical introduction to the algorithm for a general scientific audience, and presents a few examples characteristic of its application.
The EM algorithm is a convenient tool for maximum likelihood model fitting when the data are incomplete or when there are latent variables or hidden states. In this review article we explain that the EM algorithm is a natural computational scheme for learning image templates of object categories where the learning is not fully supervised. We represent an image template by an active basis model, which is a linear composition of a selected set of localized, elongated and oriented wavelet elements that are allowed to slightly perturb their locations and orientations to account for the deformations of object shapes. The model can be easily learned when the objects in the training images are of the same pose and appear at the same location and scale. This is often called supervised learning. In the situation where the objects may appear at different unknown locations, orientations and scales in the training images, we have to incorporate the unknown locations, orientations and scales as latent variables into the image generation process, and learn the template by EM-type algorithms. The E-step imputes the unknown locations, orientations and scales based on the currently learned template. This step can be considered self-supervision, which involves using the current template to recognize the objects in the training images. The M-step then relearns the template based on the imputed locations, orientations and scales, and this is essentially the same as supervised learning. So the EM learning process iterates between recognition and supervised learning. We illustrate this scheme by several experiments.
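As a toy illustration of this recognition/relearning loop, the following sketch uses a drastically simplified setting: a one-dimensional "template" that is just a mean patch rather than an active basis of wavelet elements, with the unknown shift as the only latent variable (the signal lengths, noise level and template shape below are invented).

```python
import numpy as np

rng = np.random.default_rng(1)

T = 12                                       # template length
true_template = np.sin(np.linspace(0, np.pi, T))
signals = []
for _ in range(40):                          # each signal hides one copy of the template
    x = rng.normal(0.0, 0.3, 60)
    s = rng.integers(0, 60 - T)
    x[s:s + T] += true_template
    signals.append(x)

template = np.ones(T) / T                    # crude initialization
for _ in range(20):
    # E-step ("recognition"): impute the latent shift of each object by finding
    # the window that best matches the current template.
    shifts = [int(np.argmax([x[s:s + T] @ template
                             for s in range(len(x) - T + 1)])) for x in signals]
    # M-step ("supervised learning"): re-estimate the template from the aligned
    # windows, exactly as if the shifts had been observed.
    template = np.mean([x[s:s + T] for x, s in zip(signals, shifts)], axis=0)

print(np.round(template, 2))                 # should resemble the hidden half-sine bump
```

A soft E-step would instead weight every candidate window by its posterior probability; the hard-assignment version here matches the "recognize, then relearn" description above.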
In the past decade computational biology has grown from a cottage industry with a handful of researchers to an attractive interdisciplinary field, catching the attention and imagination of many quantitatively minded scientists. Of interest to us is the key role played by the EM algorithm during this transformation. We survey the use of the EM algorithm in a few important computational biology problems surrounding the “central dogma” of molecular biology: from DNA to RNA and then to proteins. Topics of this article include sequence motif discovery, protein sequence alignment, population genetics, evolutionary models and mRNA expression microarray data analysis.
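To make the first of these topics concrete, here is a hedged sketch of EM for sequence motif discovery under a common simplification: each sequence contains exactly one occurrence of the motif against a uniform background (the motif, width, lengths and counts below are invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
ALPHABET = "ACGT"
W, L, N = 6, 30, 50                              # motif width, sequence length, #sequences

# Simulate: plant the motif "ACGTAC" once in each background sequence.
motif = [ALPHABET.index(c) for c in "ACGTAC"]
seqs = []
for _ in range(N):
    x = rng.integers(0, 4, L)
    s = rng.integers(0, L - W + 1)
    x[s:s + W] = motif
    seqs.append(x)

theta = rng.dirichlet(np.ones(4), size=W)        # motif position-weight matrix
for _ in range(50):
    counts = np.full((W, 4), 0.1)                # expected letter counts (with pseudocount)
    for x in seqs:
        # E-step: posterior over the latent start position of the motif,
        # assuming a uniform (0.25) background everywhere else.
        logp = np.array([sum(np.log(theta[j, x[s + j]]) - np.log(0.25) for j in range(W))
                         for s in range(L - W + 1)])
        post = np.exp(logp - logp.max())
        post /= post.sum()
        for s, p in enumerate(post):             # accumulate expected counts per column
            for j in range(W):
                counts[j, x[s + j]] += p
    # M-step: normalize expected counts into a new position-weight matrix.
    theta = counts / counts.sum(axis=1, keepdims=True)

print("".join(ALPHABET[a] for a in theta.argmax(axis=1)))   # ideally recovers ACGTAC
```

This impute-then-maximize pattern, with different complete-data models, is also what underlies the EM treatments of alignment, population genetics and microarray mixture models surveyed in the article.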
The EM algorithm is a special case of a more general algorithm called the MM algorithm. Specific MM algorithms often have nothing to do with missing data. The first M step of an MM algorithm creates a surrogate function that is optimized in the second M step. In minimization, MM stands for majorize–minimize; in maximization, it stands for minorize–maximize. This two-step process always drives the objective function in the right direction. Construction of MM algorithms relies on recognizing and manipulating inequalities rather than calculating conditional expectations. This survey walks the reader through the construction of several specific MM algorithms. The potential of the MM algorithm in solving high-dimensional optimization and estimation problems is its most attractive feature. Our applications to random graph models, discriminant analysis and image restoration showcase this ability.
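As one concrete instance of this recipe (a standard textbook MM example, not one of the article's three applications), least-absolute-deviation regression can be handled by majorizing each absolute residual with a quadratic that touches it at the current iterate, so every iteration reduces to a weighted least-squares solve.

```python
import numpy as np

rng = np.random.default_rng(3)

# Majorization: |r| <= r^2 / (2|r0|) + |r0| / 2, with equality at r = r0.
# Minimizing the resulting surrogate of sum_i |y_i - x_i'b| is weighted least
# squares with weights 1 / |r_i|, so each sweep drives the objective downward.
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.standard_t(df=2, size=n)      # heavy-tailed noise

beta = np.zeros(p)
for _ in range(100):
    r = y - X @ beta
    w = 1.0 / np.maximum(np.abs(r), 1e-6)             # majorization weights (floored)
    WX = X * w[:, None]
    beta = np.linalg.solve(X.T @ WX, WX.T @ y)        # minimize the surrogate

print("LAD estimate:", np.round(beta, 2),
      "objective:", round(float(np.abs(y - X @ beta).sum()), 2))
```

The construction uses only an elementary inequality, no conditional expectations, which is exactly the contrast with EM drawn above.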
It was known from Metropolis et al. [J. Chem. Phys. 21 (1953) 1087–1092] that one can sample from a distribution by performing Monte Carlo simulation from a Markov chain whose equilibrium distribution is equal to the target distribution. However, it took several decades before the statistical community embraced Markov chain Monte Carlo (MCMC) as a general computational tool in Bayesian inference. The usual reasons that are advanced to explain why statisticians were slow to catch on to the method include lack of computing power and unfamiliarity with the early dynamic Monte Carlo papers in the statistical physics literature. We argue that there was a deeper reason, namely, that the structure of problems in the statistical mechanics literature and those in the standard statistical literature are different. To make the methods usable in standard Bayesian problems, one had to exploit the power that comes from the introduction of judiciously chosen auxiliary variables and collective moves. This paper examines the development in the critical period 1980–1990, when the ideas of Markov chain simulation from the statistical physics literature and the latent variable formulation in maximum likelihood computation (i.e., the EM algorithm) came together to spark the widespread application of MCMC methods in Bayesian computation.
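As an illustration of such a judiciously chosen auxiliary variable (an example in the spirit of the ideas surveyed here, not one reconstructed from the paper), Bayesian probit regression becomes a two-step Gibbs sampler once latent Gaussian utilities are introduced; the sketch below assumes a flat prior on the coefficients and uses scipy for the truncated-normal draws.

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(4)

# Simulated probit data: y_i = 1{ x_i' beta + e_i > 0 }, e_i ~ N(0, 1).
n, p = 300, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 1.5])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(int)

XtX_inv = np.linalg.inv(X.T @ X)
chol = np.linalg.cholesky(XtX_inv)
beta = np.zeros(p)
draws = []
for it in range(2000):
    mu = X @ beta
    # Auxiliary-variable step: draw each latent utility z_i from a normal
    # truncated to the half-line consistent with the observed y_i.
    lo = np.where(y == 1, -mu, -np.inf)               # bounds in standardized units
    hi = np.where(y == 1, np.inf, -mu)
    z = truncnorm.rvs(lo, hi, loc=mu, scale=1.0, random_state=rng)
    # Collective move: the whole coefficient block has a Gaussian full conditional.
    beta = XtX_inv @ (X.T @ z) + chol @ rng.normal(size=p)
    if it >= 500:                                      # discard burn-in
        draws.append(beta)

print("posterior mean:", np.round(np.mean(draws, axis=0), 2))   # near beta_true
```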
Two major ideas in the analysis of missing data are (a) the EM algorithm [Dempster, Laird and Rubin, J. Roy. Statist. Soc. Ser. B 39 (1977) 1–38] for maximum likelihood (ML) estimation, and (b) the formulation of models for the joint distribution of the data Z and missing data indicators M, and the associated “missing at random” (MAR) condition under which a model for M is unnecessary [Rubin, Biometrika 63 (1976) 581–592]. Most previous work has treated Z and M as single blocks, yielding selection or pattern-mixture models depending on how their joint distribution is factorized. This paper explores “block-sequential” models that interleave subsets of the variables and their missing data indicators, and then make parameter restrictions based on assumptions in each block. These include models that are not MAR. We examine a subclass of block-sequential models we call block-conditional MAR (BCMAR) models, and an associated block-monotone reduced likelihood strategy that typically yields consistent estimates by selectively discarding some data. Alternatively, full ML estimation can often be achieved via the EM algorithm. We examine in some detail BCMAR models for the case of two multinomially distributed categorical variables, and a two-block structure where the first block is categorical and the second block arises from a (possibly multivariate) exponential family distribution.
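As a generic reminder of how EM handles partially classified categorical data (an invented two-variable MAR illustration, not the BCMAR construction or the block-monotone strategy itself), consider two binary variables where one block of units is missing the second variable.

```python
import numpy as np

# Fully classified 2 x 2 counts of (Y1, Y2), plus units observed on Y1 only.
complete = np.array([[30, 10],
                     [15, 45]])
y1_only = np.array([20, 40])          # counts by Y1 among units with Y2 missing

p = np.full((2, 2), 0.25)             # initial joint cell probabilities
for _ in range(100):
    # E-step: spread each partially classified count over Y2 using P(Y2 | Y1).
    cond = p / p.sum(axis=1, keepdims=True)
    expected = complete + y1_only[:, None] * cond
    # M-step: the complete-data MLE is the normalized table of expected counts.
    p = expected / expected.sum()

print(np.round(p, 3))                 # full-data ML estimate of the joint distribution
```

Under MAR this converges to the familiar factored estimate (P(Y1) from all units, P(Y2 | Y1) from the fully classified ones); the block-conditional models discussed above impose MAR-type restrictions block by block rather than globally.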
This EM review article focuses on parameter expansion, a simple technique introduced in the PX-EM algorithm to make EM converge faster while maintaining its simplicity and stability. The primary objective concerns the connection between parameter expansion and efficient inference. It reviews the statistical interpretation of the PX-EM algorithm, in terms of efficient inference via bias reduction, and further unfolds the PX-EM mystery by looking at PX-EM from different perspectives. In addition, it briefly discusses potential applications of parameter expansion to statistical inference and the broader impact of statistical thinking on understanding and developing other iterative optimization algorithms.
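For a self-contained feel for the speed-up, here is a sketch of what is probably the best-known parameter expansion example, the t location-scale model with known degrees of freedom (the sample size, initialization and tolerance below are invented): after reduction, the mean update is unchanged and the variance update is simply rescaled by the average imputed weight.

```python
import numpy as np

rng = np.random.default_rng(5)
nu = 1.0                                       # known degrees of freedom
x = 3.0 + 2.0 * rng.standard_t(df=nu, size=500)

def fit(px, tol=1e-10, max_iter=10000):
    mu = np.median(x)
    s2 = np.median(np.abs(x - mu)) ** 2        # robust starting values
    for it in range(1, max_iter + 1):
        w = (nu + 1.0) / (nu + (x - mu) ** 2 / s2)        # E-step: latent weights
        mu_new = np.sum(w * x) / np.sum(w)                # M-step: weighted mean
        num = np.sum(w * (x - mu_new) ** 2)
        s2_new = num / np.sum(w) if px else num / len(x)  # PX-EM vs. plain EM
        if abs(mu_new - mu) + abs(s2_new - s2) < tol:
            return mu_new, s2_new, it
        mu, s2 = mu_new, s2_new
    return mu, s2, max_iter

for label, px in [("EM   ", False), ("PX-EM", True)]:
    mu, s2, iters = fit(px)
    print(f"{label}: mu={mu:.4f}  s2={s2:.4f}  iterations={iters}")
```

Both versions share the same fixed points and the same monotone likelihood ascent; the expanded working parameter is estimated in the M-step and then reduced away, which is where the extra speed comes from.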
In 1866 Gregor Mendel published a seminal paper containing the foundations of modern genetics. In 1936 Ronald Fisher published a statistical analysis of Mendel’s data concluding that “the data of most, if not all, of the experiments have been falsified so as to agree closely with Mendel’s expectations.” The accusation gave rise to a controversy that has persisted to the present day. There are reasonable grounds to assume that a certain unconscious bias was systematically introduced in Mendel’s experimentation. Based on this assumption, a probability model that fits Mendel’s data and does not offend Fisher’s analysis is given. This reconciliation model may well be the end of the Mendel–Fisher controversy.
George G. Roussas was born in the city of Marmara in central Greece, on June 29, 1933. He received a B.A. with high honors in Mathematics from the University of Athens in 1956, and a Ph.D. in Statistics from the University of California, Berkeley, in 1964. In 1964–1966, he served as Assistant Professor of Mathematics at the California State University, San Jose, and he was a faculty member of the Department of Statistics at the University of Wisconsin, Madison, in 1966–1976, starting as an Assistant Professor in 1966 and becoming a Professor in 1972. He was a Professor of Applied Mathematics and Director of the Laboratory of Applied Mathematics at the University of Patras, Greece, in 1972–1984. He was elected Dean of the School of Physical and Mathematical Sciences at the University of Patras in 1978, and Chancellor of the university in 1981. He served for about three years as Vice President for Academic Affairs of the then new University of Crete, Greece, in 1981–1985. In 1984, he was a Visiting Professor in the Intercollege Division of Statistics at the University of California, Davis, and in 1985 he was appointed Professor, Associate Dean and Chair of the Graduate Group in Statistics at the same university; he served in the two administrative capacities in 1985–1999. He has been an elected member of the International Statistical Institute since 1974, a Fellow of the Royal Statistical Society since 1975, a Fellow of the Institute of Mathematical Statistics since 1983, and a Fellow of the American Statistical Association since 1986. He served as a member of the Council of the Hellenic Mathematical Society, and as President of the Balkan Union of Mathematicians. He has been a Distinguished Professor of Statistics at the University of California, Davis, since 2003, Chair of the Advisory Board of the “Demokritos Society of America” (a think tank) since 2007, and a Fellow of the American Association for the Advancement of Science since 2008; he is also a Corresponding Member of the Academy of Athens in the field of Mathematical Statistics, elected by the membership in the plenary session of April 17, 2008.