CellProfiler on Terra
WDL workflows and scripts for running a CellProfiler pipeline on Google Cloud hardware. Includes workflows for all steps of a full Cell Painting pipeline.
The workflows work well in Terra and will also run on any Cromwell server that can run WDLs, provided it uses a Google Cloud backend, which is the only backend currently supported. (We are open to supporting more backends in the future, specifically other cloud storage locations, including AWS and Azure.)
You can see these workflows in action and try them yourself in the Terra workspace cellpainting!
Three pipelines:
- All the workflows necessary to run an end-to-end Cell Painting pipeline, starting with raw images and ending with extracted features, both in database format and aggregated as CSV files.
  - Appropriate for datasets of arbitrary size.
  - Scatters the time-consuming analysis steps over many VMs in parallel. By default, a dataset is split into individual wells, and each well is run on a separate VM.
- Run the cytominer-database ingest step to create a SQLite database containing all the extracted features, and run the aggregation step from pycytominer to create CSV files.
- CellProfiler (distributed or single VM)
  - A single WDL workflow that runs a CellProfiler .cppipe pipeline on a dataset.
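To illustrate the per-well scatter described above, here is a minimal Python sketch of the idea, not the code the workflows actually use (they are written in WDL). It assumes the dataset is described by a CSV with one row per image set and a well column; the column names (`Metadata_Plate`, `Metadata_Well`, `Metadata_Site`, `FileName_OrigDNA`), the file names, and the helper `split_by_well` are all hypothetical and chosen only for illustration.

```python
import csv
import io
from collections import defaultdict


def split_by_well(load_data_csv, well_column="Metadata_Well"):
    """Group the rows of a load-data CSV by well.

    Returns a dict mapping well ID -> list of row dicts, in file order.
    The column name is an assumption, not taken from the workflows.
    """
    shards = defaultdict(list)
    for row in csv.DictReader(load_data_csv):
        shards[row[well_column]].append(row)
    return dict(shards)


# Tiny made-up dataset: two wells, two imaging sites each.
example_csv = io.StringIO(
    "Metadata_Plate,Metadata_Well,Metadata_Site,FileName_OrigDNA\n"
    "plate1,A01,1,A01_s1_dna.tiff\n"
    "plate1,A01,2,A01_s2_dna.tiff\n"
    "plate1,A02,1,A02_s1_dna.tiff\n"
    "plate1,A02,2,A02_s2_dna.tiff\n"
)

shards = split_by_well(example_csv)
for well, rows in shards.items():
    # In a distributed run, each shard would be handed to its own VM.
    print(well, len(rows))
```

Each entry in `shards` corresponds to one unit of work. Splitting by well keeps all images of a well together, so each unit can be analyzed independently of the others.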
How to run these workflows yourself
These workflows are all publicly available and hosted on Dockstore. From there, you can import them into Terra, or into any other platform where you run WDL workflows, and run them.
If you just want to give it a try, you can clone the Terra workspace cellpainting, which comes preconfigured to run on three plates of sample data.