Consistent and asymptotically normal parameter estimates for hidden Markov mixtures of Markov models

Pierre Vandekerkhove


We introduce a new missing-data model, based on a mixture of K Markov processes, and consider the general problem of identifying its parameters. We point out in detail the main difficulties of statistical inference for such models: complete likelihood calculation, parametrization of the stationary distribution and identifiability. We propose a general tractable approach for estimating these models (admitting parametrization of the stationary distribution and identifiability) and check in detail that our assumptions are fully satisfied for a Markov mixture of two linear AR(1) models with Gaussian noise. Finally, a Monte Carlo method is proposed to calculate the split data likelihood of this model when no analytic expression for the invariant probability densities of the Markov processes is known.
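As a small illustration of what "parametrization of the stationary distribution" means in the Gaussian AR(1) case mentioned above: for X_t = a X_{t-1} + sigma eps_t with |a| < 1 and standard Gaussian noise, the invariant law is N(0, sigma^2 / (1 - a^2)). The sketch below checks this closed form against a simulated path. It is not the Monte Carlo method of the paper; the function names and the values a = 0.5, sigma = 1 are assumptions chosen purely for illustration.

```python
import random
import statistics


def simulate_ar1(a, sigma, n, burn_in=1000, seed=0):
    """Simulate X_t = a * X_{t-1} + sigma * eps_t with standard Gaussian eps_t.

    The first `burn_in` values are discarded so that the retained path is
    approximately drawn from the stationary regime.
    """
    rng = random.Random(seed)
    x = 0.0
    path = []
    for t in range(burn_in + n):
        x = a * x + sigma * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            path.append(x)
    return path


def stationary_variance(a, sigma):
    """Variance of the invariant Gaussian law of a stable AR(1) process."""
    if abs(a) >= 1:
        raise ValueError("the AR(1) process is stationary only when |a| < 1")
    return sigma ** 2 / (1 - a ** 2)


# Illustrative parameters (assumptions, not taken from the paper).
path = simulate_ar1(a=0.5, sigma=1.0, n=200_000)
empirical = statistics.pvariance(path)
exact = stationary_variance(a=0.5, sigma=1.0)
print(f"empirical variance: {empirical:.4f}, closed form: {exact:.4f}")
```

When no such closed form is available for the invariant density, a simulation of this kind is the only route to the stationary law, which is the situation the last sentence of the abstract addresses.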

Article information

Bernoulli, Volume 11, Number 1 (2005), 103-129.

First available in Project Euclid: 7 March 2005


Keywords: hidden Markov chain; incomplete data; Markov chain mixture; statistical inference


Vandekerkhove, Pierre. Consistent and asymptotically normal parameter estimates for hidden Markov mixtures of Markov models. Bernoulli 11 (2005), no. 1, 103-129. doi:10.3150/bj/1110228244.
