BranchedGP

BranchedGP is a package for building branching Gaussian process models in Python, using TensorFlow and GPflow. You can install it via pip install BranchedGP.

The package contains two main models, BGP and MBGP, each described in its own section.

BGP

Example

An example of what the model can provide is shown below.

  1. The top subpanel shows the posterior cell assignment: each cell is assigned a probability of belonging to a branch.
  2. The bottom subpanel shows the posterior branching time: the probability of branching at a particular pseudotime.

<img src="images/VAMP5_BGPAssignmentProbability.png" width="400" height="400" align="middle"/>

Quick start

For a quick introduction see the notebooks/Hematopoiesis.ipynb notebook. Therein we demonstrate how to fit the model and compute the log Bayes factor for two genes.

The Bayes factor in particular is calculated by calling CalculateBranchingEvidence after fitting the model using FitModel.
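
To make these quantities concrete, here is a minimal pure-Python sketch of how a posterior over candidate branching times and a log Bayes factor can be derived from per-candidate log-likelihoods. It assumes a uniform prior over the candidates and treats the last entry as the non-branching model. The helper name `branching_evidence` and the input values are hypothetical, and this is not necessarily how CalculateBranchingEvidence is implemented.

```python
import math


def branching_evidence(loglik):
    """Toy evidence calculation (hypothetical helper, not the BranchedGP API).

    loglik: list of log-likelihoods; the first N entries are for candidate
    branching times, the last entry is for the non-branching model.
    Assumes a uniform prior over the N candidates.
    """
    *branching, no_branching = loglik
    n = len(branching)
    top = max(branching)
    # Log-sum-exp, shifted by the maximum for numerical stability.
    log_sum = top + math.log(sum(math.exp(v - top) for v in branching))
    # Posterior over candidate branching times (softmax of log-likelihoods).
    posterior = [math.exp(v - log_sum) for v in branching]
    # Log of the average branching likelihood minus the non-branching one.
    log_bayes_factor = log_sum - math.log(n) - no_branching
    return posterior, log_bayes_factor


# Made-up log-likelihoods: three candidate branching times, then no branching.
posterior, log_bf = branching_evidence([-105.0, -100.0, -103.0, -110.0])
print(posterior, log_bf)
```

A positive log Bayes factor favours the branching model; in this toy input the second candidate carries most of the posterior mass.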

This notebook should take a total of 6 minutes to run.

| File name | Description |
| --- | --- |
| Hematopoiesis | Application of BGP to hematopoiesis data. |
| SyntheticData | Application of BGP to synthetic data. |
| SamplingFromTheModel | Sampling from the BGP model. |

Comparison to monocle-BEAM

In the paper we compare the BGP model to the BEAM method proposed in Monocle 2. The R script for running Monocle and BEAM on the hematopoiesis data is included in monocle/runMonocle.R.

List of python library files

| File name | Description |
| --- | --- |
| FitBranchingModel.py | Main file for the user to call the BGP fit; see the function FitModel. |
| pZ_construction_singleBP.py | Constructs the prior on assignments; used by the variational code. |
| assigngp_dense.py | Variational inference code to infer function labels. |
| assigngp_denseSparse.py | Sparse inducing point variational inference code to infer function labels. |
| branch_kernParamGPflow.py | Branching kernels. Includes the independent kernel as used in the overlapping mixture of GPs and a hardcoded branch kernel for testing. |
| BranchingTree.py | Code to generate the branching tree. |
| VBHelperFunctions.py | Plotting code. |

MBGP

MBGP is an extension of the BGP model. BGP assigns observations to latent functions independently for each output dimension (gene), which leads to inconsistent assignments across outputs and reduces the accuracy of branching time inference. MBGP instead infers branch assignments jointly across all output dimensions. This keeps branch assignments consistent and uses more data for branching time inference.
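
The difference between independent and joint assignment can be illustrated with a toy sketch. It is not the variational inference used by BGP or MBGP: it simply takes made-up per-gene log-likelihoods for a single cell, assumes genes are conditionally independent given the branch, and compares a per-gene argmax with a single argmax over the summed log-likelihoods. The function names are hypothetical.

```python
def independent_assignment(loglik):
    """Pick the most likely branch separately for each gene.

    loglik[g][b] is the log-likelihood of one cell under branch b for gene g.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in loglik]


def joint_assignment(loglik):
    """Pick a single branch by summing log-likelihoods across genes."""
    n_branches = len(loglik[0])
    totals = [sum(row[b] for row in loglik) for b in range(n_branches)]
    return max(range(n_branches), key=totals.__getitem__)


# Made-up log-likelihoods for one cell: 3 genes (rows) x 2 branches (columns).
cell_loglik = [
    [-1.0, -3.0],  # gene 0 clearly prefers branch 0
    [-2.0, -1.9],  # gene 1 weakly prefers branch 1 (e.g. due to noise)
    [-1.5, -2.5],  # gene 2 prefers branch 0
]

per_gene = independent_assignment(cell_loglik)
shared = joint_assignment(cell_loglik)
print(per_gene)  # [0, 1, 0]: the same cell lands on different branches
print(shared)    # 0: one consistent branch for all genes
```

The per-gene result places the same cell on different branches depending on the gene, whereas pooling evidence across genes yields one consistent label.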

Example

See below for an example model fit to synthetic noisy data representing 4 genes.

<img src="notebooks/MBGP/synthetic-data-4-gene-fit.png" width="600" height="600" align="middle"/>

Quick start

For a quick introduction see the notebooks/MBGP/synthetic_noise_free.ipynb and notebooks/MBGP/experiments-figure-1-simple-fits.ipynb notebooks. Therein we demonstrate how to fit the model and visualise its fit.

A full list of key notebooks follows, ordered roughly by how useful we expect them to be (most useful first).

| File name | Description |
| --- | --- |
| synthetic_noise_free | Application of MBGP to synthetic noise-free data. |
| experiments-figure-1-simple-fits | Application of MBGP to synthetic noisy data. |
| rediscover_early_branching | Exploration of fitting MBGP and BGP to synthetic noisy data. Performs sanity checks, compares priors and computes inconsistent assignments by BGP. Takes a while to run. |
| rediscover_early_branching2 | Exploration of fitting MBGP and BGP to synthetic noisy data. Compares various priors and computes inconsistent assignments by BGP. Takes a while to run. |
| experiments-figure-2-correct-cell-histogram | Evaluation of MBGP vs BGP label assignment to synthetic noisy data (no branching point learning). Strong prior. Takes a long time to run. |
| experiments-figure-3-bgp-label-inconsistency | Evaluation of MBGP vs BGP fits to synthetic noisy data (branching points are learned). Strong prior. Takes a long time to run. |
| new_experiments-figure-2-correct-cell-histogram | An alternative re-derivation of the experiments-figure-2-correct-cell-histogram.ipynb notebook. |
| new_experiments-figure-3-bgp-label-inconsistency | An alternative re-derivation of the experiments-figure-3-bgp-label-inconsistency.ipynb notebook. |
| synthetic_Y_without_crossing | Explores the generation of synthetic noisy data that avoids latent branches crossing after the initial branching point. |

Development setup

Create a virtual environment, activate it and run make install.

Common tasks

Contributing

We welcome any and all contributions to the BranchedGP repo. Feel free to create issues or PRs into the repo and someone will take a look and review.

Notebooks

We use Jupytext to help version Jupyter notebooks. Each notebook corresponds to a Python script, which is easy to review. See also the Jupytext documentation on paired notebooks.
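
For readers new to Jupytext, the sketch below shows roughly what a paired script looks like, assuming the percent format (the format actually configured in this repository may differ). Cells are delimited by `# %%` markers and Markdown cells are stored as comments, so the file stays valid Python and produces readable diffs. The contents are made up for illustration.

```python
# %% [markdown]
# # Toy analysis
# Markdown cells are stored as comment lines.

# %%
# A code cell: everything up to the next "# %%" marker.
values = [1.0, 2.0, 3.0]

# %%
# A second code cell that uses the state defined above.
mean = sum(values) / len(values)
print(mean)
```

Because the script is plain Python, it can be run or reviewed without opening Jupyter.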

Note that Jupytext should be automatically installed in your virtual environment if you follow the instructions above.

Updating an existing notebook

We want our notebooks to always work. Therefore, before committing any changes to a notebook, we ask contributors to re-run the notebook from scratch.

The Jupytext extension should automatically sync the notebook to the paired script. If you're unsure, you can always check via make check_notebooks_synced and manually run make sync_notebooks if needed.

Adding a new notebook

Follow your usual procedure, but run make pair_notebooks afterwards. This will produce the paired script (or the notebook, if you're starting from a script). Commit both the notebook and its paired script.

Syncing notebooks

If Jupyter shows you a warning about the notebook being out of sync with its paired script, run make sync_notebooks.

Formatting code

We automatically check that all contributions are formatted according to the recommendations by black and isort. If your changes fail these checks, all you need to do is run make format and commit the changes.

Static checks

We automatically check that our code conforms to the coding standards enforced by flake8 and mypy. You can check whether your changes pass these checks via make static_checks.