Awesome
This is a Github repository to accompany the paper:
Risso D, Purvis L, Fletcher R, Das D, Ngai J, Dudoit S, Purdom E (2018) "clusterExperiment and RSEC: A Bioconductor package and framework for clustering of single-cell and other large gene expression datasets" PLoS Comput Biol. 2018 Sep 4;14(9):e1006378 http://dx.plos.org/10.1371/journal.pcbi.1006378
How to Reproduce the analysis
From scratch
The provided make file will allow you to recreate the analysis in the paper from scratch. This code includes downloading the data from GEO.
You should first have R installed and the necessary packages installed: (BiocInstaller
,clusterExperiment
,benchmarkme
,and scone
)
).
Then type the following commands:
make OEAnalysis.Rout
Note that this is quite computationally intensive, and runs on parallel cores (by default the value defined by the environment variable SLURM_CPUS_PER_TASK
or 6 if such a value is missing). So it should only be run from scratch in a setting appropriate to this.
Using the provided intermediate results (quicker)
To have quicker access to the data and results, we also provide (via git lfs
) .rda
files that are the result of the output of downloading the data, and the RSEC
command (see https://git-lfs.github.com/ for instructions on git lfs
and how to download these files). They are saved under the directory dataOutput_submitted
.
To be able to run the above make command, and not have to rerun the computationally intensive commands, you must make the following changes:
-
Move (or copy) the directory
dataOutput_submitted
intodataOutput
-
Edit the file
OEAnalysis.R
so that the linerunClus<-FALSE
now reads as
runClus<-TRUE
After these changes you should be able to run the above make command,
make OEAnalysis.Rout
With these settings, the scripts should be able to run on a laptop relatively quickly.