Context-Specific Independence Mixture Modelling for Protein Families

Contributed Talk presented on Sept. 18, 2007 by Benjamin Georgi at the European Conference on Machine Learning (ECML), Warsaw, Poland.

Abstract: Protein families can be divided into subgroups with functional differences. Analysing these subgroups and determining which residues convey substrate specificity is a central question in the study of these families. We present a clustering procedure based on the context-specific independence mixture framework with a Dirichlet mixture prior. It simultaneously infers subgroups and predicts specificity-determining residues from multiple sequence alignments of protein families. Applied to several well-studied families, the method showed good clustering performance, and the predicted positions had ample biological support. The software developed for this analysis, PyMix (the Python mixture package), is available from http://www.algorithmics.molgen.mpg.de/pymix.html
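
The general idea of clustering an alignment and then ranking columns by how strongly they separate the clusters can be illustrated with a small NumPy sketch. This is not the PyMix API and not the method of the talk: it fits a conventional mixture of per-column categorical distributions by EM, with no context-specific sharing of parameters between components, and it makes several simplifying assumptions. A symmetric pseudocount stands in for the Dirichlet mixture prior, a deterministic farthest-point initialisation is used, and a weighted Jensen-Shannon divergence serves as an illustrative column score. All function names and the toy alignment are hypothetical.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus gap


def encode(seqs):
    """Map aligned sequences of equal length to an integer matrix (n, L)."""
    index = {c: i for i, c in enumerate(ALPHABET)}
    return np.array([[index[c] for c in s] for s in seqs])


def init_responsibilities(X, k):
    """Farthest-point seeding on Hamming distance (an assumption, not from the talk)."""
    seeds = [0]
    dist = (X != X[0]).sum(axis=1)  # distance to the nearest seed so far
    for _ in range(1, k):
        seeds.append(int(dist.argmax()))
        dist = np.minimum(dist, (X != X[seeds[-1]]).sum(axis=1))
    d = np.stack([(X != X[s]).sum(axis=1) for s in seeds], axis=1)  # (n, k)
    return np.eye(k)[d.argmin(axis=1)]  # hard assignment to the nearest seed


def fit_mixture(X, k, n_iter=20, pseudocount=0.01):
    """EM for a mixture of independent per-column categorical distributions."""
    n, _ = X.shape
    onehot = np.eye(len(ALPHABET))[X]  # (n, L, A)
    resp = init_responsibilities(X, k)  # (n, k)
    for _ in range(n_iter):
        # M-step: smoothed mixture weights and per-column residue frequencies
        weights = (resp.sum(axis=0) + 1.0) / (n + k)
        counts = np.einsum("nk,nla->kla", resp, onehot) + pseudocount
        theta = counts / counts.sum(axis=2, keepdims=True)
        # E-step: posterior over components for each sequence
        log_post = np.einsum("nla,kla->nk", onehot, np.log(theta)) + np.log(weights)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
    return weights, theta, resp


def column_scores(weights, theta):
    """Weighted Jensen-Shannon divergence between components, per column."""
    def entropy(p):
        return -(p * np.log(p)).sum(axis=-1)
    mixed = np.einsum("k,kla->la", weights, theta)
    return entropy(mixed) - np.einsum("k,kl->l", weights, entropy(theta))


# Hypothetical toy alignment: two subgroups that differ at columns 1, 3 and 5;
# column 4 varies within both subgroups and should not be flagged.
toy_alignment = [
    "MKVLAG", "MKVLAG", "MKVLSG", "MKVLAG",
    "MRVIAS", "MRVISS", "MRVIAS", "MRVIAS",
]
X = encode(toy_alignment)
weights, theta, resp = fit_mixture(X, k=2)
labels = resp.argmax(axis=1)
scores = column_scores(weights, theta)
top_columns = sorted(np.argsort(scores)[-3:].tolist())
print(labels, top_columns)
```

In this sketch, a column scores highly only when the fitted components disagree about its residue distribution, so positions that vary within every subgroup are not flagged. The context-specific independence framework described in the abstract goes further by letting components share parameters at individual positions, which this plain mixture does not attempt.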
