Sourmash

Compute MinHash signatures for DNA sequences.
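As a rough illustration of what a MinHash signature is, the sketch below keeps the smallest hash values of a sequence's k-mers (a "bottom-k" sketch) and estimates the Jaccard similarity of two sequences from their signatures. This is a simplified toy and not sourmash's actual implementation: the function names, the k-mer size, the signature size, and the use of SHA-1 are all assumptions made for the example, and reverse complements are ignored.

```python
import hashlib


def kmers(seq, k=5):
    """Return the set of all k-mers (substrings of length k) in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (SHA-1 is an arbitrary choice here)."""
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(seq, k=5, num=20):
    """Keep the `num` smallest k-mer hashes as the signature."""
    return sorted(hash_kmer(km) for km in kmers(seq, k))[:num]


def estimate_jaccard(sig_a, sig_b, num=20):
    """Estimate Jaccard similarity from two bottom-k signatures."""
    # Look only at the `num` smallest hashes of the combined signatures
    merged = sorted(set(sig_a) | set(sig_b))[:num]
    if not merged:
        return 0.0
    shared = set(sig_a) & set(sig_b)
    return sum(1 for h in merged if h in shared) / len(merged)
```

Two signatures can be compared without revisiting the original reads, which is what makes comparing many FASTQ files cheap.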

Quick Start

To run the pipeline on your computer, first pull the Docker image:

docker pull hadrieng/sourmash

Then run the workflow:

nextflow run sourmash.nf --reads data/\*.fastq

The workflow produces a directory containing a clustering and dendrogram of all the FASTQ files in your data directory, along with a similarity matrix and a heatmap.
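To make the similarity matrix concrete, here is a minimal sketch that builds one for a few toy sequences. It uses exact Jaccard similarity on k-mer sets rather than MinHash estimates, and the sample names, sequences, and k-mer size are hypothetical. The layout of the pipeline's real output may differ.

```python
def kmer_set(seq, k=4):
    """Return the set of k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(samples, k=4):
    """Return (names, matrix) with pairwise similarities between samples."""
    names = sorted(samples)
    sets = {name: kmer_set(samples[name], k) for name in names}
    matrix = [[jaccard(sets[r], sets[c]) for c in names] for r in names]
    return names, matrix


# Hypothetical toy data standing in for the contents of FASTQ files
samples = {
    "sample_a": "ACGTACGTGGCA",
    "sample_b": "ACGTACGTGGCT",
    "sample_c": "TTTTGGGGCCCC",
}

names, matrix = similarity_matrix(samples)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{value:.2f}" for value in row))
```

A matrix like this is symmetric with ones on the diagonal. Clustering and a heatmap can both be derived from it.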

Pipeline parameters

--reads: path or glob pattern matching the input FASTQ files (for example data/\*.fastq)

--adapt: FASTA file of adapter sequences (for example custom_adapters.fasta)

Profiles

The SGBC cluster uses a module system, so pulling the Docker image is not required there.

By default, the pipeline runs locally using Docker. To run it on the SGBC cluster, pass the option -profile planet.

Example:

nextflow run sourmash.nf -profile planet --reads /proj/my_proj/data/\*.fastq --adapt custom_adapters.fasta

Citations