Speeding Up Bayesian HMM by the Four Russians Method

M. Mahmud and A. Schliep

In Algorithms in Bioinformatics, Springer Berlin / Heidelberg, 6833, 188–200, 2011.

Bayesian computations with Hidden Markov Models (HMMs) are often avoided in practice. Instead, due to reduced running time, point estimates, maximum likelihood (ML) or maximum a posterior (MAP), are obtained and observation sequences are segmented based on the Viterbi path, even though the lack of accuracy and dependency on starting points of the local optimization are well known. We propose a method to speed-up Bayesian computations which addresses this problem for regular and time-dependent HMMs with discrete observations. In particular, we show that by exploiting sequence repetitions, using the four Russians method, and the conditional dependency structure, it is possible to achieve a (log T) speed-up, where T is the length of the observation sequence. Our experimental results on identi cation of segments of homogeneous nucleic acid composition, known as the DNA segmentation problem, show that the speed-up is also observed in practice. Availability: An implementation of our method will be available as part of the open source GHMM library from http://ghmm.org.

A reprint is available as PDF.

DOI: 10.1007/978-3-642-23038-7_17.

The publication includes results from the following projects or software tools: BayesianHMM.

Further publications by Alexander Schliep, Md P. Mahmud.