Dynamically Compressed Bayesian Hidden Markov Models using Haar Wavelets

J. Wiedenhoeft

Ph.D. Thesis, Rutgers, The State University of New Jersey, Oct 2018.

Hidden Markov models (HMMs) have enjoyed a rich history of successes over the past decades. They have been applied to great effect in almost any conceivable segmentation task, from speech recognition and part-of-speech tagging, through financial time series analysis, to seismology and beyond. In bioinformatics, they are widely used for tasks such as gene finding, isochore classification and, most recently, detection of copy-number variation (CNV) in genomic data. Advances in biotechnology, such as high-resolution DNA microarrays and next-generation genome sequencing, have created data sets of millions to billions of values, presenting new challenges to the application of this classic model. CNV detection from large genomic data sets is gaining momentum in research and diagnostic applications. Because it often has to run under limited computational resources and time constraints, fast, accurate and low-memory approaches to HMM inference are essential.

A reprint is available as PDF.

The publication includes results from the following projects or software tools: BayesianHMM, HaMMLET.
