DECENT

Differential Expression with Capture Efficiency adjustmeNT for single-cell RNA-seq data

Citation

Ye, C., Speed, T. P., Salim, A. DECENT: Differential Expression with Capture Efficiency AdjustmeNT for Single-Cell RNA-seq Data. Bioinformatics, 35(24), 5155-5162. 2019. https://doi.org/10.1093/bioinformatics/btz453

News

Jun 21, 2019

Jun 5, 2019

Feb 16, 2019

Aug 15, 2018

Feb 6, 2018

Jan 30, 2018

Jan 15, 2018

Installation

You can install DECENT from GitHub with:

# install.packages("devtools")  # run this first if devtools is not installed
devtools::install_github("cz-ye/DECENT")

Quick start

Here we use a simulated dataset for demonstration:

library(DECENT)
data("sim")


# DECENT with spike-ins
de.table <- decent(data.obs = sim$data.obs, # UMI count matrix after quality control:
                                            # each cell should have > 3% non-zero counts and each gene > 5 non-zero counts
                   X = ~as.factor(sim$cell.type), # cell type/group indicator
                   use.spikes = TRUE,
                   spikes = sim$sp.obs, # observed spike-in UMI counts
                   spike.conc = sim$sp.true, # nominal spike-in molecule counts
                   s.imputed = TRUE, # get the single-imputation expression matrix
                   E.imputed = TRUE, # get the mean-imputation expression matrix
                   dir = './' # directory in which to save the fitted models and imputed data matrices
                   )

# DECENT without spike-ins
de.table <- decent(data.obs = sim$data.obs,
                   X = ~as.factor(sim$cell.type), 
                   use.spikes = FALSE,
                   CE.range = c(0.02, 0.1) # specify the range of the ranked random capture efficiency
                   )

# DECENT with batch dummy variable
batch <- rep(1, length(sim$cell.type))
set.seed(0)
batch[sample.int(length(sim$cell.type), length(sim$cell.type) %/% 2)] <- 2 # randomly split into 2 batches, for demonstration only
de.table <- decent(data.obs = sim$data.obs, 
                   X = ~as.factor(sim$cell.type), 
                   W = ~as.factor(batch),
                   use.spikes = TRUE,
                   spikes = sim$sp.obs, spike.conc = sim$sp.true)
                   
# Ground truth can be found in the DE.gene vector.
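
The quality-control thresholds mentioned in the comments of the first example can be applied before calling decent. The following is a minimal base-R sketch, not part of DECENT: the helper name filter_counts and its arguments min.cell.frac and min.gene.cells are hypothetical, and it assumes that genes are in rows and cells in columns and that cells are filtered before genes.

```r
# Hypothetical helper (not part of DECENT).
# Assumes `counts` is a numeric matrix with genes in rows and cells in columns.
filter_counts <- function(counts, min.cell.frac = 0.03, min.gene.cells = 5) {
  # Keep cells in which more than `min.cell.frac` of genes have a non-zero count
  keep.cells <- colMeans(counts > 0) > min.cell.frac
  counts <- counts[, keep.cells, drop = FALSE]
  # Keep genes with a non-zero count in more than `min.gene.cells` remaining cells
  keep.genes <- rowSums(counts > 0) > min.gene.cells
  counts[keep.genes, , drop = FALSE]
}

# Toy data for illustration only
set.seed(1)
toy <- matrix(rpois(200 * 20, lambda = 0.3), nrow = 200, ncol = 20)
filtered <- filter_counts(toy)
dim(filtered)
```

If spike-ins are used, the spike-in matrix would presumably need to be subset to the same cells; that step is omitted here.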

The fitted DE model, the fitted no-DE model and the LRT output are saved in the directory given by the dir argument (the working directory in the examples above) as decent.DE.rds, decent.noDE.rds and decent.lrt.rds. The function returns a data frame containing the DE results.
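
The saved files can be read back with base R's readRDS. The sketch below is illustrative only: load_decent_outputs is a hypothetical helper, not a DECENT function, and it makes no assumption about the internal structure of the saved objects.

```r
# Hypothetical helper (not part of DECENT): read whichever of the three
# output files exist in `dir` and return them as a named list.
load_decent_outputs <- function(dir = "./") {
  files <- c(DE = "decent.DE.rds", noDE = "decent.noDE.rds", lrt = "decent.lrt.rds")
  paths <- file.path(dir, files)
  names(paths) <- names(files)
  found <- paths[file.exists(paths)]
  lapply(found, readRDS)
}

fits <- load_decent_outputs("./")
names(fits)  # which outputs were found in the directory
# str(fits$lrt, max.level = 1)  # inspect an object without assuming its layout
```

Returning only the files that exist keeps the helper usable after a partial or interrupted run.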

Note that the LRT step also involves parameter optimization and is currently the computational bottleneck.

By default, the function runs in parallel using all available cores. Set the n.cores argument to limit the number of cores, or set parallel = FALSE to run on a single core.
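
On a shared machine it may be preferable to leave a core free. A minimal sketch follows, assuming the parallel package that ships with R; the wrapper name run_decent is hypothetical, and only the n.cores and parallel arguments described above are taken from DECENT.

```r
# Leave one core free; fall back to a single core if detection fails.
available <- parallel::detectCores()
n.cores <- if (is.na(available)) 1L else max(1L, available - 1L)

# Hypothetical wrapper for illustration; defining it does not run the model.
run_decent <- function(sim, n.cores) {
  DECENT::decent(data.obs = sim$data.obs,
                 X = ~as.factor(sim$cell.type),
                 use.spikes = FALSE,
                 CE.range = c(0.02, 0.1),
                 n.cores = n.cores)  # or parallel = FALSE for a single core
}

# de.table <- run_decent(sim, n.cores)
```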

By default, cell size factors are estimated using MLE. In some cases, TMM (setting normalize = 'TMM') gives more accurate estimates.