MVQueries: Classifying short gene expression time-courses

What is MVQueries

We represent short gene expression time-courses that monitor the response to toxins as piecewise constant functions, which are modeled as left–right Hidden Markov Models. Our software implements a Bayesian approach to parameter estimation and inference, which helps to cope with the short but highly multivariate time-courses. Compared to previously published work, we improve prediction accuracy by 7% and 4%, respectively, when classifying toxicology and stress response data. We also reduce running times by at least a factor of 140. The software is implemented as a Python package and is licensed under the terms of the GNU General Public License (GPL), version 3 or later.
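To illustrate the modeling idea, the following is a minimal sketch of a left–right HMM used to score a single, univariate time-course. It is not the MVQueries implementation: it uses fixed Gaussian emissions and plain maximum-likelihood scoring rather than the Bayesian, multivariate approach described above, and all names (left_right_transitions, forward_loglik, the example means and noise level) are hypothetical choices made for this example.

```python
import math

def left_right_transitions(n_states, stay_prob=0.5):
    """Build a transition matrix in which state i can only stay or move to i + 1."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        trans[i][i] = stay_prob
        trans[i][i + 1] = 1.0 - stay_prob
    trans[-1][-1] = 1.0  # the last state is absorbing
    return trans

def gaussian_logpdf(x, mean, sigma):
    """Log density of a normal distribution with the given mean and std deviation."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2.0 * sigma ** 2)

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of finite values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_loglik(obs, means, sigma, trans):
    """Log-likelihood of obs under a left-right HMM that starts in state 0.

    Each state emits around one constant level (means[j]), so a state path
    corresponds to a piecewise constant function plus Gaussian noise.
    """
    n = len(means)
    alpha = [gaussian_logpdf(obs[0], means[0], sigma)] + [-math.inf] * (n - 1)
    for x in obs[1:]:
        new_alpha = []
        for j in range(n):
            # Only predecessors that are reachable and allowed to move to j contribute.
            terms = [alpha[i] + math.log(trans[i][j])
                     for i in range(n)
                     if trans[i][j] > 0.0 and alpha[i] != -math.inf]
            new_alpha.append(logsumexp(terms) + gaussian_logpdf(x, means[j], sigma)
                             if terms else -math.inf)
        alpha = new_alpha
    return logsumexp([a for a in alpha if a != -math.inf])

# Two hypothetical class models: expression rising vs. falling over three levels.
trans = left_right_transitions(3)
up_means = [0.0, 1.0, 2.0]
down_means = [0.0, -1.0, -2.0]

# A short made-up time-course with five measurements.
course = [0.1, 0.0, 1.1, 0.9, 2.0]
score_up = forward_loglik(course, up_means, 0.5, trans)
score_down = forward_loglik(course, down_means, 0.5, trans)
label = "up" if score_up > score_down else "down"
print(label)
```

The time-course is assigned to whichever class model gives it the higher likelihood. The left–right constraint is what makes each state path a piecewise constant function: once the chain leaves a level, it cannot return to it.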

The Python source code and the data files can be downloaded here. The main programs are nn_rank.py and train_rank_highdim.py. Both are called with a single argument which is the path to a config file. An example config file is supplied.

To call the One-Nearest-Neighbor classifier, type:

python nn_rank.py example.cfg

For the classifier using linear HMMs, type:

python train_rank_highdim.py example.cfg

For further information contact Alexander Schliep (alexander@schlieplab.org).

Team

Members: Alexander Schliep, Ivan G Costa, Christoph Hafemeister. Collaborators: Alexander Schönhuth (Centrum Wiskunde & Informatica).