Statistical Science
- Statist. Sci.
- Volume 25, Number 4 (2010), 506-516.
From EM to Data Augmentation: The Emergence of MCMC Bayesian Computation in the 1980s
Martin A. Tanner and Wing H. Wong
Abstract
It was known from Metropolis et al. [J. Chem. Phys. 21 (1953) 1087–1092] that one can sample from a distribution by performing Monte Carlo simulation from a Markov chain whose equilibrium distribution is equal to the target distribution. However, it took several decades before the statistical community embraced Markov chain Monte Carlo (MCMC) as a general computational tool in Bayesian inference. The usual reasons advanced to explain why statisticians were slow to catch on to the method include lack of computing power and unfamiliarity with the early dynamic Monte Carlo papers in the statistical physics literature. We argue that there was a deeper reason, namely, that the structure of problems in statistical mechanics differs from that of problems in the standard statistical literature. To make the methods usable in standard Bayesian problems, one had to exploit the power that comes from the introduction of judiciously chosen auxiliary variables and collective moves. This paper examines the development in the critical period 1980–1990, when the ideas of Markov chain simulation from the statistical physics literature and the latent variable formulation in maximum likelihood computation (i.e., EM algorithm) came together to spark the widespread application of MCMC methods in Bayesian computation.
Article information
Source
Statist. Sci., Volume 25, Number 4 (2010), 506-516.
Dates
First available in Project Euclid: 14 March 2011
Permanent link to this document
https://projecteuclid.org/euclid.ss/1300108234
Digital Object Identifier
doi:10.1214/10-STS341
Mathematical Reviews number (MathSciNet)
MR2807767
Zentralblatt MATH identifier
1329.65021
Keywords
Data augmentation EM algorithm MCMC
Citation
Tanner, Martin A.; Wong, Wing H. From EM to Data Augmentation: The Emergence of MCMC Bayesian Computation in the 1980s. Statist. Sci. 25 (2010), no. 4, 506–516. doi:10.1214/10-STS341. https://projecteuclid.org/euclid.ss/1300108234
References
- Achcar, J. A., Bolfarine, H. and Pericchi, L. R. (1987). Transformation of survival data to an extreme value distribution. J. Roy. Statist. Soc. Ser. D Statistician 36 229–234.
- Albert, J. (1988). Bayesian estimation of Poisson means using a hierarchical log-linear model. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 519–531. Oxford Univ. Press, Oxford.
- Bernardo, J. M., DeGroot, M. H., Lindley, D. V. and Smith, A. F. M., eds. (1988). Bayesian Statistics 3. Oxford Univ. Press, Oxford.
- Besag, J. (1986). On the statistical analysis of dirty pictures. J. R. Stat. Soc. Ser. B Stat. Methodol. 48 259–302.
- Binder, K. (1978). Monte Carlo Methods in Statistical Physics. Springer, New York.
- Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 39 1–38.
- DuMouchel, W. (1988). A Bayesian model and a graphical elicitation procedure for multiple comparisons. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 127–146. Oxford Univ. Press, Oxford. MathSciNet: MR1008039.
- Efron, B. (1979). Bootstrap methods: Another look at the Jackknife. Ann. Statist. 7 1–26. MathSciNet: MR515681. Zentralblatt MATH: 0406.62024. doi:10.1214/aos/1176344552. Project Euclid: euclid.aos/1176344552.
- Gelfand, A. E., Hills, S. E., Racine-Poon, A. and Smith, A. F. M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. J. Amer. Statist. Assoc. 85 972–985.
- Gelfand, A. E. and Smith, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85 398–409. MathSciNet: MR1141740. Zentralblatt MATH: 0702.62020. doi:10.2307/2289776.
- Gelman, A. and King, G. (1990). Estimating the electoral consequences of legislative redistricting. J. Amer. Statist. Assoc. 85 274–282.
- Geman, S. (1988a). Experiments in Bayesian image analysis. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 159–171. Oxford Univ. Press, Oxford.
- Geman, S. (1988b). Stochastic relaxation methods for image restoration and expert systems. In Maximum Entropy and Bayesian Methods in Science and Engineering (Vol. 2) (G. J. Erickson and C. R. Smith, eds.). Kluwer, New York.
- Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6 721–741.
- Geweke, J. (1989). Bayesian inference in econometric models using Monte Carlo integration. Econometrica 57 1317–1339. MathSciNet: MR1035115. doi:10.2307/1913710.
- Geyer, C. J. (1991). Markov chain Monte Carlo maximum likelihood. In Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface (E. Keramidas, ed.) 156–163. Interface Foundation, Fairfax Station.
- Geyer, C. J. (1995). Conditioning in Markov chain Monte Carlo. J. Comput. Graph. Statist. 4 148–154. MathSciNet: MR1341319. doi:10.2307/1390763.
- Goel, P. K. (1988). Software for Bayesian analysis: Current status and additional need. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 173–188. Oxford Univ. Press, Oxford.
- Grieve, A. P. (1987). Applications of Bayesian software: Two examples. J. Roy. Statist. Soc. Ser. D Statistician 36 283–288.
- Grieve, A. P. (1988). A Bayesian approach to the analysis of LD50 experiments. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 617–630. Oxford Univ. Press, Oxford.
- Gubernatis, J. E., ed. (2003). The Monte Carlo Method in the Physical Sciences: Celebrating the 50th Anniversary of the Metropolis Algorithm. Amer. Inst. Phys., New York.
- Hammersley, J. M. and Handscomb, D. C. (1964). Monte Carlo Methods, 2nd ed. Chapman and Hall, London. MathSciNet: MR223065.
- Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 97–109.
- Hitchcock, D. B. (2003). A history of the Metropolis–Hastings algorithm. Amer. Statist. 57 254–257.
- Hukushima, K. and Nemoto, K. (1996). Exchange Monte Carlo method and application to spin glass simulations. J. Phys. Soc. Japan 65 1604–1608.
- Karlin, S. and Taylor, H. M. (1975). A First Course in Stochastic Processes, 2nd ed. Academic Press, New York. MathSciNet: MR356197.
- Kass, R. E. (1997). Review of “Markov chain Monte Carlo in practice.” J. Amer. Statist. Assoc. 92 1645–1646.
- Kass, R. E., Tierney, L. and Kadane, J. B. (1988). Asymptotics in Bayesian computation. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 261–278. Oxford Univ. Press, Oxford.
- Kim, C. E. and Schervish, M. J. (1988). Stochastic models of incarceration careers. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 279–305. Oxford Univ. Press, Oxford.
- Kloek, T. and van Dijk, H. K. (1978). Bayesian estimates of equation system parameters: An application of integration by Monte Carlo. Econometrica 46 1–19.
- Kloek, T. and van Dijk, H. K. (1980). Further experience in Bayesian analysis using Monte Carlo integration. J. Econometrics 14 307–328.
- Li, K. H. (1988). Imputation using Markov chains. J. Statist. Comput. Simul. 30 57–79. MathSciNet: MR1005883. Zentralblatt MATH: 0726.62017. doi:10.1080/00949658808811085.
- Liu, C., Rubin, D. B. and Wu, Y. N. (1998). Parameter expansion to accelerate EM: The PX-EM algorithm. Biometrika 85 755–770. MathSciNet: MR1666758. Zentralblatt MATH: 0921.62071. doi:10.1093/biomet/85.4.755.
- Liu, J. S., Wong, W. H. and Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81 27–40. MathSciNet: MR1279653. Zentralblatt MATH: 0811.62080. doi:10.1093/biomet/81.1.27.
- Liu, J. S. and Wu, Y. N. (1999). Parameter expansion scheme for data augmentation. J. Amer. Statist. Assoc. 94 1264–1274. MathSciNet: MR1731488. Zentralblatt MATH: 1069.62514. doi:10.2307/2669940.
- Lunn, D., Spiegelhalter, D. J., Thomas, A. and Best, N. (2009). The BUGS project: Evolution, critique and future directions. Stat. Med. 28 3049–3067.
- Marriott, J. (1987). Bayesian numerical and graphical methods for Box–Jenkins time series. J. Roy. Statist. Soc. Ser. D Statistician 36 265–268.
- Marriott, J. (1988). Reparametrization for Bayesian inference in ARMA time series. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 701–704. Oxford Univ. Press, Oxford.
- Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21 1087–1092.
- Morris, C. N. (1987). Comment on “The calculation of posterior distributions by data augmentation” by M. A. Tanner and W. H. Wong. J. Amer. Statist. Assoc. 82 542–543. MathSciNet: MR898357. Zentralblatt MATH: 0619.62029. doi:10.2307/2289457.
- Morris, C. N. (1988). Approximating posterior distributions and posterior moments. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 327–344. Oxford Univ. Press, Oxford.
- Naylor, J. C. (1987). Bayesian alternatives to t-tests. J. Roy. Statist. Soc. Ser. D Statistician 36 241–246.
- Naylor, J. C. and Smith, A. F. M. (1982). Applications of a method for the efficient computation of posterior distributions. J. Roy. Statist. Soc. Ser. C Appl. Statist. 31 214–225. MathSciNet: MR694917. doi:10.2307/2347995.
- O’Hagan, A. (1987). Monte Carlo is fundamentally unsound. J. Roy. Statist. Soc. Ser. D Statistician 36 247–249.
- Pearl, J. (1987). Evidential reasoning using stochastic simulation of causal models. Artif. Intell. 32 245–257. MathSciNet: MR885357. doi:10.1016/0004-3702(87)90012-9.
- Poirier, D. J. (1988). Bayesian diagnostic testing in the general linear normal regression model. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 725–732. Oxford Univ. Press, Oxford.
- Pole, A. (1988). Transfer response models: a numerical approach. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 733–745. Oxford Univ. Press, Oxford.
- Ripley, B. D. (1987). Stochastic Simulation. Wiley, New York. MathSciNet: MR875224.
- Robert, C. and Casella, G. (2010). A short history of Markov chain Monte Carlo—subjective recollections from incomplete data. In Handbook on Markov Chain Monte Carlo. Chapman and Hall/CRC Press, Boca Raton, FL.
- Rubin, D. B. (1988). Using the SIR algorithm to simulate posterior distributions. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 395–402. Oxford Univ. Press, Oxford.
- Rubinstein, R. Y. (1981). Simulation and the Monte Carlo Method, 1st ed. Wiley, New York. MathSciNet: MR624270.
- Schnatter, S. (1988). Bayesian forecasting of time series by Gaussian sum approximation. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 757–764. Oxford Univ. Press, Oxford.
- Shaw, J. E. H. (1987). Numerical Bayesian analysis of some flexible regression models. J. Roy. Statist. Soc. Ser. D Statistician 36 147–153.
- Shaw, J. E. H. (1988a). A quasirandom approach to integration in Bayesian statistics. Ann. Statist. 16 895–914. MathSciNet: MR947584. Zentralblatt MATH: 0645.62043. doi:10.1214/aos/1176350842. Project Euclid: euclid.aos/1176350842.
- Shaw, J. E. H. (1988b). Aspects of numerical integration and summarisation. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 411–428. Oxford Univ. Press, Oxford.
- Smith, A. F. M. (1988). What should be Bayesian about Bayesian software? In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 429–435. Oxford Univ. Press, Oxford. Zentralblatt MATH: 0702.00028.
- Smith, A. F. M. (1991). Bayesian computational methods. Philos. Trans. Roy. Soc. Lond. Ser. A 337 369–386. MathSciNet: MR1143728. doi:10.1098/rsta.1991.0130.
- Smith, A. F. M., Skene, A. M., Shaw, J. E. H. and Naylor, J. C. (1987). Progress with numerical and graphical methods for practical Bayesian statistics. J. Roy. Statist. Soc. Ser. D Statistician 36 75–82.
- Smith, A. F. M., Skene, A. M., Shaw, J. E. H., Naylor, J. C. and Dransfield, M. (1985). The implementation of the Bayesian paradigm. Commun. Stat. Theory Methods 14 1079–1102. MathSciNet: MR797634. doi:10.1080/03610928508828963.
- Spiegelhalter, D. J. (1987). Coherent evidence propagation in expert systems. J. Roy. Statist. Soc. Ser. D Statistician 36 201–210.
- Spiegelhalter, D. J. and Lauritzen, S. L. (1990). Sequential updating of conditional probabilities on directed graphical structures. Networks 20 579–605.
- Stewart, L. (1987). Hierarchical Bayesian analysis using Monte Carlo integration: Computing posterior distributions when there are many possible models. J. Roy. Statist. Soc. Ser. D Statistician 36 211–219.
- Sweeting, T. J. (1988). Approximate posterior distributions in censored regression models. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 791–799. Oxford Univ. Press, Oxford.
- Swendsen, R. H. and Wang, J. S. (1987). Nonuniversal critical dynamics in Monte Carlo simulations. Phys. Rev. Lett. 58 86–88.
- Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). J. Amer. Statist. Assoc. 82 528–550. MathSciNet: MR898357. Zentralblatt MATH: 0619.62029. doi:10.2307/2289457.
- Tierney, L. and Kadane, J. B. (1986). Accurate approximations for posterior moments and marginal densities. J. Amer. Statist. Assoc. 81 82–86. MathSciNet: MR830567. Zentralblatt MATH: 0587.62067. doi:10.2307/2287970.
- van der Merwe, A. J. and Groenewald, P. C. N. (1987). Bayes and empirical Bayes confidence intervals in applied research. J. Roy. Statist. Soc. Ser. D Statistician 36 171–179.
- van Dijk, H. K. (1988). Discussion of Goel. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 187–188. Oxford Univ. Press, Oxford.
- van Dijk, H. K., Hop, J. P. and Louter, A. S. (1987). An algorithm for the computation of posterior moments and densities using simple importance sampling. J. Roy. Statist. Soc. Ser. D Statistician 36 83–90.
- van Dyk, D. A. and Meng, X. L. (2001). The art of data augmentation. J. Comput. Graph. Statist. 10 1–50. MathSciNet: MR1936358. doi:10.1198/10618600152418584.
- Zellner, A. (1988). A Bayesian era. In Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 509–516. Oxford Univ. Press, Oxford. MathSciNet: MR1008063.
