Statistical Science

Likelihood Inference for Models with Unobservables: Another View

Youngjo Lee and John A. Nelder



There have been controversies among statisticians on (i) what to model and (ii) how to make inferences from models with unobservables. One such controversy concerns the difference between estimation methods for the marginal means not necessarily having a probabilistic basis and statistical models having unobservables with a probabilistic basis. Another concerns likelihood-based inference for statistical models with unobservables. This needs an extended-likelihood framework, and we show how one such extension, hierarchical likelihood, allows this to be done. Modeling of unobservables leads to rich classes of new probabilistic models from which likelihood-type inferences can be made naturally with hierarchical likelihood.

Article information

Statist. Sci., Volume 24, Number 3 (2009), 255-269.

First available in Project Euclid: 31 March 2010


Keywords: Hierarchical generalized linear model; unobservables; random effects; likelihood; extended likelihood; hierarchical likelihood


Lee, Youngjo; Nelder, John A. Likelihood Inference for Models with Unobservables: Another View. Statist. Sci. 24 (2009), no. 3, 255–269. doi:10.1214/09-STS277.


