Brazilian Journal of Probability and Statistics

Second-order autoregressive Hidden Markov Model

Daiane Aparecida Zuanetti and Luis Aparecido Milan


Abstract

We propose an extension of the hidden Markov model (HMM) that supports second-order Markov dependence in the observable random process. We develop a Bayesian method to estimate the parameters of the model and the non-observable sequence of states. We compare candidate models, including the dependence order and the number of states, using model selection criteria such as the Bayes factor and the deviance information criterion (DIC). We apply the procedure to several simulated datasets and verify the good performance of the estimation procedure. Tests with a real dataset show an improved fit compared with the usual first-order HMM, demonstrating the usefulness of the proposed model.
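To make the model class concrete, the generative process of a second-order autoregressive HMM can be sketched as follows: a hidden first-order Markov chain selects a regime, and each observation depends on the current hidden state and on the two previous observations. This is only an illustrative Gaussian sketch; the number of states, transition matrix, AR coefficients, intercepts, and noise scales below are invented for the example and are not the parameter values or estimation method of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-state Gaussian AR(2)-HMM; all parameter values are made up.
K = 2                                  # number of hidden states
A = np.array([[0.9, 0.1],              # transition matrix of the
              [0.2, 0.8]])             # (first-order) hidden chain
phi1 = np.array([0.5, -0.3])           # AR coefficient on y[t-1], per state
phi2 = np.array([0.2, 0.4])            # AR coefficient on y[t-2], per state
mu = np.array([0.0, 2.0])              # state-specific intercepts
sigma = np.array([1.0, 0.5])           # state-specific noise s.d.

def simulate(T):
    """Draw a hidden-state path and observations of length T."""
    s = np.zeros(T, dtype=int)
    y = np.zeros(T)
    # Initial steps: fewer lags are available, so drop the missing terms.
    s[0] = rng.integers(K)
    y[0] = rng.normal(mu[s[0]], sigma[s[0]])
    s[1] = rng.choice(K, p=A[s[0]])
    y[1] = rng.normal(mu[s[1]] + phi1[s[1]] * y[0], sigma[s[1]])
    for t in range(2, T):
        s[t] = rng.choice(K, p=A[s[t - 1]])
        # Second-order dependence: y[t] uses both y[t-1] and y[t-2].
        y[t] = rng.normal(mu[s[t]] + phi1[s[t]] * y[t - 1]
                          + phi2[s[t]] * y[t - 2], sigma[s[t]])
    return s, y

states, obs = simulate(500)
```

Datasets simulated this way are the kind of input on which an estimation procedure for the model (here, the authors' Bayesian MCMC scheme) would be checked.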

Article information

Source
Braz. J. Probab. Stat., Volume 31, Number 3 (2017), 653-665.

Dates
Received: February 2015
Accepted: June 2016
First available in Project Euclid: 22 August 2017

Permanent link to this document
https://projecteuclid.org/euclid.bjps/1503388833

Digital Object Identifier
doi:10.1214/16-BJPS328

Mathematical Reviews number (MathSciNet)
MR3693985

Zentralblatt MATH identifier
1377.62174

Keywords
Hidden Markov model; second-order dependence; Markov chain Monte Carlo (MCMC); gene modeling; bacteriophage lambda genome

Citation

Zuanetti, Daiane Aparecida; Milan, Luis Aparecido. Second-order autoregressive Hidden Markov Model. Braz. J. Probab. Stat. 31 (2017), no. 3, 653--665. doi:10.1214/16-BJPS328. https://projecteuclid.org/euclid.bjps/1503388833

