Home

Awesome

<img src="https://github.com/asncd/MIMOSCA/blob/master/common_files/mimosca_logo.png" title="MIMOSCA" alt="MIMOSCA" height=99 width=372>

Multiple Input Multiple Output Single Cell Analysis

Named after <a href="https://en.wikibooks.org/wiki/Control_Systems/MIMO_Systems">MIMO control systems</a>, and the associated methods in Systems Identification to use a combination of perturbations and measurements to help decode the system's dynamics, this package was designed to assist those attempting to understand biological dynamics by designing, performing, and analyzing perturbation scRNA-seq experiments.

Python Dependencies: pandas, numpy, matplotlib, seaborn, scipy, sklearn, statsmodels, <a href="https://github.com/facebook/fbpca">Facebook PCA</a> (though sklearn SVD or PCA should also work), networkx , and <a href="http://www.mapequation.org/code.html">Infomap</a> (this can be removed), and goatools

Related Resources

Contents

Please let me know (here or in the Google Forum) if there are any areas that you'd like to see improved or new items to be added!

FAQ

Design of Experiments & Power Calculations

<img src="https://github.com/asncd/MIMOSCA/blob/master/common_files/comp_knob.png" title="Experimental Design" alt="Experimental Design" height=360 width=432>

For a rough comparison of our pilot scRNA-seq to population RNA-seq of the same perturbation, see this <a href="https://github.com/asncd/MIMOSCA/blob/master/Power_Analysis_DOE/ost_ko_comparison.ipynb">iPython notebook</a>.

In designing Perturb-seq like experiments, there are a few key factors to keep in mind:

Signatures vs. individual transcript-level phenotypes

Are you interested in broad transcriptional signatures or individual gene level differential expression? If the former, a rough approximation may be around 10 cells/perturbation. If the later, 100 or more cells may be required based on the effect size.

A similar approximation for reads/cell would be a couple thousand for signatures and tens of thousands for gene-level.

Library Size and Representation

As in any pooled screen, the representation of each perturbation in the library will vary. With genome wide CRISPR libraries the difference between the 10th and 90th percentile of a library is roughly 6-fold (Wang, 2013). Depending on how much a user wants to ensure every member of the library is represented, the cells/perturbation factor should be multiplied by an additional factor to reflect this variance.

Using High MOI to infer genetic interactions

Our approach to use high MOI instead of either a single vector with multiple sgRNAs or vectors with different selection methods benefits from ease of implementation and the ability to represent a large diversity of combinations (only limited by the number of cells).

However, challenges include a Poisson-like variance in the number of sgRNA/cell, sgRNA detection sensitivity, and the formation of PCR chimeras during the enrichment PCR procedure that can create misassignments.

All three of these factors should be assessed in pilot experiments to troubleshoot. An example of such a pilot is shown below (modified from the Drop-seq style species mixing experiments):

<img src="https://github.com/asncd/MIMOSCA/blob/master/common_files/species_mix.png" title="Species Mix" alt="SMIX" height=431 width=365>

Guide Barcode, Cell Barcode Pairing

<img src="https://github.com/asncd/MIMOSCA/blob/master/common_files/perturbseq_readdistribution.png" title="pseq_plasmid" alt="pseq_plasmid">

The distribution of reads going to the Perturb-seq vector (antiparallel) from 10X RNA-seq is shown above. Note that while the expression of the construct is comprable to that of a housekeeping gene, only a fraction of the reads overlap with the 18bp barcode (colored section in the coverage track). As such, it is advisable in most cases when you have a short barcode to perform enrichment PCR to obtain sensitive GBC/CBC pairing.

Computational Workflow

<img src="https://github.com/asncd/MIMOSCA/blob/master/common_files/comp_flow2.png" title="Overview" alt="Overview" height=662 width=569>

Inputs

Intermediate Computation

Outputs