Bayesian Analysis

Real-Time Bayesian Parameter Estimation for Item Response Models

Ruby Chiu-Hsing Weng and D. Stephen Coad

Full-text: Open access


Bayesian item response models have been used to model educational testing and Internet ratings data. Typically, the statistical analysis is carried out using Markov chain Monte Carlo methods. However, these may not be computationally feasible when data arrive continuously in real time and online parameter estimation is needed. We develop an efficient algorithm based on a deterministic moment-matching method to adjust the parameters in real time. The proposed online algorithm works well for two real datasets, achieving good accuracy with considerably less computational time.
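To give a feel for what a deterministic moment-matching update can look like, the following is a minimal sketch and not the authors' algorithm. It assumes a one-parameter normal-ogive (probit) model with a known item difficulty and a Gaussian belief about a single examinee's ability; the function names `update_ability`, `norm_pdf`, and `norm_cdf` are hypothetical. After each response, the non-Gaussian posterior is replaced by the Gaussian with the same mean and variance, which is available in closed form for a probit likelihood.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def update_ability(mu, var, difficulty, correct):
    """One online update of a Gaussian belief N(mu, var) about ability.

    Assumed model (illustrative only): P(correct) = Phi(ability - difficulty),
    with the item difficulty treated as known. The returned mean and variance
    are the exact first two moments of the posterior, which is then
    approximated by a Gaussian with those moments.
    """
    y = 1.0 if correct else -1.0
    s = math.sqrt(1.0 + var)
    z = y * (mu - difficulty) / s
    ratio = norm_pdf(z) / norm_cdf(z)
    new_mu = mu + y * var / s * ratio
    new_var = var - (var * var / (1.0 + var)) * ratio * (z + ratio)
    return new_mu, new_var


# Process a small, made-up stream of (difficulty, correct) responses one at a time.
mu, var = 0.0, 1.0
for difficulty, correct in [(0.0, True), (0.5, True), (1.0, False)]:
    mu, var = update_ability(mu, var, difficulty, correct)
```

Each response costs a constant amount of arithmetic and no past data need to be stored, which is the kind of saving over Markov chain Monte Carlo that an online method aims for. The sketch does not update item parameters and does not guard against numerical underflow for extreme values of `z`.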

Article information

Bayesian Anal., Volume 13, Number 1 (2018), 115-137.

First available in Project Euclid: 19 December 2016

Digital Object Identifier: doi:10.1214/16-BA1043

Keywords: Bayesian inference, deterministic method, moment matching, online algorithm, Woodroofe–Stein’s identity

Creative Commons Attribution 4.0 International License.


Weng, Ruby Chiu-Hsing; Coad, D. Stephen. Real-Time Bayesian Parameter Estimation for Item Response Models. Bayesian Anal. 13 (2018), no. 1, 115–137. doi:10.1214/16-BA1043.



