Abstract
Hidden Markov models (HMMs) are an efficient tool to describe and model the underlying behaviour of many phenomena. HMMs assume that the observed data are generated independently from a parametric distribution, conditional on an unobserved process that satisfies the Markov property. The model selection or determining the number of hidden states for these models is an important issue which represents the main interest of this thesis. Applying likelihood-based criteria for HMMs is a challenging task as the likelihood function of these models is not available in a closed form. Using the data augmentation approach, we derive two forms of the likelihood function of a HMM in closed form, namely the observed and the conditional likelihoods. Subsequently, we develop several modified versions of the Akaike information criterion (AIC) and Bayesian information criterion (BIC) approximated under the Bayesian principle. We also develop several versions for the deviance information criterion (DIC). These proposed versions are based on the type of likelihood, i.e. conditional or observed likelihood, and also on whether the hidden states are dealt with as missing data or additional parameters in the model. This latter point is referred to as the concept of focus. Finally, we consider model selection from a predictive viewpoint. To this end, we develop the so-called widely applicable information criterion (WAIC). We assess the performance of these various proposed criteria via simulation studies and real-data applications. In this thesis, we apply Poisson HMMs to model the spatial dependence analysis in count data via an application to traffic safety crashes for three highways in the UK. The ultimate interest is in identifying highway segments which have distinctly higher crash rates. Selecting an optimal number of states is an important part of the interpretation. For this purpose, we employ model selection criteria to determine the optimal number of states. We also use several goodness-of-fit checks to assess the model fitted to the data. We implement an MCMC algorithm and check its convergence. We examine the sensitivity of the results to the prior specification, a potential problem given small sample sizes. The Poisson HMMs adopted can provide a different model for analysing spatial dependence on networks. It is possible to identify segments with a higher posterior probability of classification in a high risk state, a task that could prioritise management action.
Document Type
Thesis
Publication Date
2017
Recommended Citation
Kadhem, S. (2017) Model Fit Diagnostics for Hidden Markov Models. Thesis. University of Plymouth. Retrieved from https://pearl.plymouth.ac.uk/secam-theses/180