Indel-tolerant read mapping of high-throughput sequencing reads
Invited Talk presented on May 25, 2012 by Alexander Schliep at CBRC, AIST, Tokyo, Japan.
Abstract: Mapping high-throughput sequencing reads becomes increasingly costly as the edit distance between read and genome increases. If only a few substitutions are present, mapping is very fast. Things change in the presence of indels, as running times increase exponentially in the maximal permissible edit distance. We introduce TreQ, a read mapper based on the idea of geometric embedding. We reformulate the approximate string matching problem as finding nearest neighbors in a vector space, which we solve using a cache-oblivious kd-tree.
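
The idea of recasting approximate string matching as nearest-neighbor search can be illustrated with a small sketch. The abstract does not say which embedding TreQ uses, so the q-gram count vector below is an assumption chosen for illustration: an indel shifts the text but changes only a few q-gram counts, so the read stays close to its source window in the vector space. The kd-tree is a plain recursive one, not cache-oblivious, and all function names (qgram_profile, build_index, map_read) are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: reads and genome contain only these symbols


def qgram_profile(s, q=2):
    """Embed a string as the count vector of its q-grams (illustrative choice)."""
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=q))}
    vec = [0] * len(index)
    for i in range(len(s) - q + 1):
        vec[index[s[i:i + q]]] += 1
    return tuple(vec)


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Build a kd-tree from (vector, payload) pairs, splitting at the median."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest(node, target, best=None):
    """Return (squared distance, payload) of the point closest to target."""
    if node is None:
        return best
    (vec, payload), axis, left, right = node
    d = sq_dist(vec, target)
    if best is None or d < best[0]:
        best = (d, payload)
    diff = target[axis] - vec[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(far, target, best)
    return best


def build_index(genome, window, q=2):
    """Embed every genome window of the given length and index it by position."""
    pts = [(qgram_profile(genome[i:i + window], q), i)
           for i in range(len(genome) - window + 1)]
    return build_kdtree(pts)


def map_read(tree, read, q=2):
    """Map a read to the genome window whose embedding is nearest."""
    return nearest(tree, qgram_profile(read, q))
```

Calling map_read(build_index(genome, 8), read) returns a pair of squared distance and candidate start position. In a real mapper the candidate would still need to be verified by an alignment, since distinct windows can share the same q-gram profile.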