Computer Science and Engineering
 Gothenburg University | Chalmers

BayesianHMM: Fast MCMC Sampling for Hidden Markov Models to Determine Copy Number Variations

Hidden Markov Models (HMMs) are often used to analyze Comparative Genomic Hybridization (CGH) data, identifying chromosomal aberrations or copy number variations by segmenting observation sequences. For efficiency, the parameters of an HMM are often estimated by maximum likelihood, and a segmentation is then obtained with the Viterbi algorithm. This introduces considerable uncertainty in the segmentation, which can be avoided with Bayesian approaches that integrate out the parameters using Markov Chain Monte Carlo (MCMC) sampling. While the advantages of the Bayesian approaches have been clearly demonstrated, likelihood-based approaches are still preferred in practice for their lower running times; datasets from high-density arrays and next-generation sequencing amplify this running-time problem.

We propose an approximate sampling technique that speeds up MCMC sampling. It is inspired by the compression of discrete sequences in HMM computations and by kd-trees, which it uses to leverage spatial relations between data points in typical data sets.
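The compression idea can be pictured with a minimal sketch. This is not the authors' implementation: a greedy one-dimensional grouping stands in for the kd-tree, and the names `Block`, `compress`, `max_width`, and `block_log_likelihood` are hypothetical. It assumes Gaussian emissions and a sampler that treats all observations in a block as sharing one hidden state, so the block's likelihood follows from a few sufficient statistics instead of a pass over every data point.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Block:
    """Sufficient statistics for a run of consecutive, similar observations."""
    start: int       # index of the first observation in the block
    count: int       # number of observations in the block
    total: float     # sum of the observations
    total_sq: float  # sum of the squared observations


def compress(observations: Sequence[float], max_width: float) -> List[Block]:
    """Greedily group consecutive observations whose value range stays
    within max_width (a simplified stand-in for a kd-tree partition)."""
    blocks: List[Block] = []
    lo = hi = 0.0
    for i, y in enumerate(observations):
        if blocks and max(hi, y) - min(lo, y) <= max_width:
            # The observation fits: extend the current block.
            b = blocks[-1]
            b.count += 1
            b.total += y
            b.total_sq += y * y
            lo, hi = min(lo, y), max(hi, y)
        else:
            # The observation is too far away: start a new block.
            blocks.append(Block(i, 1, y, y * y))
            lo = hi = y
    return blocks


def block_log_likelihood(block: Block, mu: float, sigma: float) -> float:
    """Gaussian log-likelihood of a whole block under a single state with
    mean mu and standard deviation sigma, using only the block statistics."""
    n = block.count
    sq_dev = block.total_sq - 2.0 * mu * block.total + n * mu * mu
    return -0.5 * n * math.log(2.0 * math.pi * sigma * sigma) - sq_dev / (2.0 * sigma * sigma)
```

With such blocks, one sampling sweep costs time proportional to the number of blocks rather than the number of observations. The approximation lies in forcing a single state per block, which is why `max_width` trades speed against accuracy.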

Publications

Wiedenhoeft, John and Brugel, Eric and Schliep, Alexander. Fast Bayesian Inference of Copy Number Variants using Hidden Markov Models with Wavelet Compression (2016) [details]

Mahmud, Md. Reduced representations for efficient analysis of genomic data; from microarray to high throughput sequencing (2014) [details]

Mahmud, Md and Schliep, Alexander. Speeding Up Bayesian HMM by the Four Russians Method (2011) [details]

Mahmud, Md and Schliep, Alexander. Fast MCMC Sampling for Hidden Markov Models to Determine Copy Number Variations (2011) [details]

Contact: Md P. Mahmud (pavelm@cs.rutgers.edu).