## SLIQ: Simple Linear Inequalities for Efficient Contig Scaffolding

**R.S. Roy, K. Chen, A. Sengupta and A. Schliep**

*Journal of Computational Biology* 2012, **19**, 1162–75.

Scaffolding is an important subproblem in *de novo* genome assembly
in which mate pair data are used to construct a linear sequence of
contigs separated by gaps. Here we present SLIQ, a set of simple
linear inequalities derived from the geometry of contigs on the line that can
be used to predict the relative positions and orientations of contigs from individual
mate pair reads and thus produce a contig digraph.
The SLIQ inequalities can also filter out
unreliable mate pairs and can be used as a preprocessing step for any
scaffolding algorithm. We tested the SLIQ inequalities on five real data
sets ranging in complexity from simple bacterial genomes to complex mammalian
genomes and compared the results to the majority voting procedure used by many
other scaffolding algorithms. SLIQ predicted the relative positions and
orientations of the contigs with high accuracy in all cases and
gave more accurate position predictions than majority voting for
complex genomes, in particular the human genome.
Finally, we present a simple scaffolding
algorithm that produces linear scaffolds given a contig digraph. We show
that our algorithm is very efficient compared to other scaffolding algorithms
while maintaining high accuracy in predicting both contig positions and
orientations for real data sets.
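
To make the two ideas in the abstract concrete, here is a minimal sketch of the majority voting baseline followed by a linearization of the resulting contig digraph. It is not the SLIQ method: the abstract does not state the inequalities, so none are reproduced here, and the linearization uses a plain topological sort rather than the paper's scaffolding algorithm. The function names, the `min_support` threshold, and the input format (one `(a, b)` tuple per mate pair suggesting that contig `a` precedes contig `b`) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from graphlib import TopologicalSorter  # Python 3.9+


def majority_vote_edges(votes, min_support=2):
    """Build digraph edges from per-mate-pair order votes.

    votes: iterable of (a, b) tuples, each meaning one mate pair
    suggests contig a lies before contig b (hypothetical format).
    An edge a -> b is kept if it has at least min_support votes
    and strictly more votes than the opposite direction b -> a.
    """
    counts = Counter(votes)
    edges = set()
    for (a, b), n in counts.items():
        reverse = counts.get((b, a), 0)
        if n >= min_support and n > reverse:
            edges.add((a, b))
    return edges


def linear_order(edges):
    """Return one linear contig order consistent with the digraph.

    Uses a standard topological sort; raises graphlib.CycleError
    if the digraph contains a cycle.
    """
    preds = defaultdict(set)
    for a, b in edges:
        preds[a]            # make sure source nodes are registered
        preds[b].add(a)     # a must come before b
    return list(TopologicalSorter(preds).static_order())


# Usage with made-up votes: three contigs, one conflicting mate pair
# and one weakly supported link that the threshold discards.
votes = (
    [("c1", "c2")] * 3
    + [("c2", "c1")]
    + [("c2", "c3")] * 2
    + [("c3", "c1")]
)
edges = majority_vote_edges(votes)
print(linear_order(edges))  # ['c1', 'c2', 'c3']
```

In this toy input the single `("c2", "c1")` vote is outvoted and the single `("c3", "c1")` vote falls below the support threshold, so the digraph is acyclic and has one valid linear order. Real mate pair data would also need orientations and gap estimates, which this sketch leaves out.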