HomologyClassification: Detecting remote homologs as a classification problem
Deciding whether two proteins are homologous is one of the fundamental problems in bioinformatics. Classically, the similarity of the two sequences is measured with an alignment score, and homology is then decided from the statistics of that score. How well this classification problem can be solved depends strongly on the assumptions that must hold for those statistics to be valid. We address the problem with an approach based on Support Vector Machines.
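
To make the classical baseline concrete, the following is a minimal sketch of a score-plus-statistics decision rule viewed as a binary classifier. It assumes a Smith-Waterman local alignment with a simple match/mismatch scheme and a linear gap penalty (not a substitution matrix such as BLOSUM62), and an E-value of the Karlin-Altschul form E = K·m·n·exp(-λS). The function names, the parameters `lam` and `k`, and the threshold are illustrative placeholders, not fitted values and not taken from this work.

```python
import math


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of sequences a and b.

    Uses a toy match/mismatch scheme and a linear gap penalty
    (assumed here for illustration only).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def e_value(score, m, n, lam=0.3, k=0.1):
    """Expected number of chance alignments scoring at least `score`.

    lam and k are placeholder values; in practice they depend on the
    scoring system and are estimated, not chosen by hand.
    """
    return k * m * n * math.exp(-lam * score)


def is_homolog(a, b, threshold=1e-3):
    """Classify a pair as homologous when its E-value is below the threshold."""
    score = smith_waterman_score(a, b)
    return e_value(score, len(a), len(b)) < threshold
```

Calling `is_homolog(seq_a, seq_b)` on two sequence strings returns a boolean label. The quality of that label rests entirely on whether the assumed score distribution holds for the sequences at hand, which is the limitation that motivates a learned classifier.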