GrandPrix

GrandPrix is a Python package for non-linear probabilistic dimension reduction, built on TensorFlow and GPflow. It uses a sparse variational approximation to project data into lower-dimensional spaces. The model is described in the paper:

"GrandPrix: Scaling up the Bayesian GPLVM for single-cell data.", Sumon Ahmed, Magnus Rattray and Alexis Boukouvalas, Bioinformatics, Volume 35, Issue 1, 01 January 2019, Pages 47–54.

To replicate the results in the paper, please use the betaVersion branch. The master branch works with the latest version of GPflow.

N.B. The package contains several large data files that are needed to run the example notebooks. Please make sure Git Large File Storage (Git LFS) is installed on your system so that these files are downloaded.
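If Git LFS is missing, the data files are typically checked out as small text pointers instead of the real data. The following is a minimal sketch, not part of GrandPrix, for spotting such files. It relies on the fact that a Git LFS pointer file begins with a fixed version line; the helper names and the `GrandPrix` directory path are assumptions made for illustration.

```python
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` looks like an un-downloaded Git LFS pointer file."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """List files under `root` that are still LFS pointers rather than real data."""
    return sorted(
        p for p in Path(root).rglob("*") if p.is_file() and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    # "GrandPrix" is assumed to be the directory created by `git clone`.
    for pointer in find_lfs_pointers("GrandPrix"):
        print("Not downloaded yet:", pointer)
```

If any files are listed, running `git lfs install` followed by `git lfs pull` inside the repository should fetch the actual data.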

Installation

If you have any problems with installation, see the script at the bottom of the page for a detailed setup guide starting from a new Python environment.

pip install tensorflow
git clone https://github.com/GPflow/GPflow.git
cd GPflow
pip install .
cd

See the GPflow page for more detailed instructions.

git clone https://github.com/ManchesterBioinference/GrandPrix
cd GrandPrix
python setup.py install
cd
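After both installs, a quick sanity check can confirm that the packages are visible to the active Python environment. This is a minimal sketch using only the standard library; the import names `tensorflow`, `gpflow` and `GrandPrix` are assumed to match the packages installed above, and `missing_modules` is a hypothetical helper, not part of any of them.

```python
import importlib.util

# Import names assumed for the three packages installed above; adjust if yours differ.
REQUIRED_MODULES = ("tensorflow", "gpflow", "GrandPrix")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks the modules up without importing them, so it is fast but will not catch version incompatibilities between TensorFlow and GPflow.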

List of notebooks

To run the notebooks:

cd GrandPrix/notebooks
jupyter notebook

| File name | Description |
| --- | --- |
| <a href="./notebooks/Windram.ipynb" target="_blank">Windram</a> | Application of GrandPrix to microarray data; models with and without an informative prior. |
| McDavid | Application of GrandPrix to cell cycle data. |
| Shalek | Application of GrandPrix to single-cell RNA-seq from mouse dendritic cells. |
| Droplet_DPT | Application of GrandPrix to droplet-based single-cell RNA-seq data. |
| Droplet_68K | Application of GrandPrix to ~68k PBMCs; models optimising and fixing inducing variables. |
| Guo | Application of the extended 2-D GrandPrix model to embryonic stem cells. |
| Analysing_posterior_variance | Comparison of posterior distributions from GrandPrix with other models. |

Running in a cluster

When running GrandPrix on a cluster, it may be useful to limit the number of cores used. To do this, insert the following code at the beginning of your script, with NUMCORES set to the number of cores you want to use.

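from gpflow import settings

NUMCORES = 4  # example value; set this to the number of cores allocated to your job
settings.session.intra_op_parallelism_threads = NUMCORES  # threads used within a single op
settings.session.inter_op_parallelism_threads = NUMCORES  # independent ops run in parallel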
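One way to choose NUMCORES is to read it from the job scheduler instead of fixing it by hand. The sketch below is an illustration under stated assumptions: `resolve_num_cores` and `apply_thread_limits` are hypothetical helpers, the environment variables `SLURM_CPUS_PER_TASK` (Slurm) and `NSLOTS` (Grid Engine) are only examples of where a scheduler may publish the allocation, and the GPflow part assumes a GPflow release that still ships the `settings` module used in the snippet above.

```python
import os


def resolve_num_cores(default=1):
    """Pick a core count from common scheduler variables, else `default`."""
    # These variable names are examples; your scheduler may use different ones.
    for variable in ("SLURM_CPUS_PER_TASK", "NSLOTS"):
        value = os.environ.get(variable, "")
        if value.isdigit() and int(value) > 0:
            return int(value)
    return default


def apply_thread_limits(num_cores):
    """Apply the limits via gpflow.settings; return False if it is unavailable."""
    try:
        # Assumes a GPflow release that provides the `settings` module.
        from gpflow import settings
    except ImportError:
        return False
    settings.session.intra_op_parallelism_threads = num_cores
    settings.session.inter_op_parallelism_threads = num_cores
    return True


if __name__ == "__main__":
    NUMCORES = resolve_num_cores()
    if not apply_thread_limits(NUMCORES):
        print("gpflow.settings not available; thread limits were not applied.")
```

Falling back to a single core when no scheduler variable is set is a conservative choice for shared nodes; pass a different `default` if your jobs have a whole node to themselves.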

Installing with a new environment

conda create -n newEnv python=3.5
source activate newEnv
mkdir newInstall
cd newInstall