distiller-nf
A modular Hi-C mapping pipeline for reproducible data analysis.
The distiller pipeline aims to provide the following functionality:
- Align the sequences of Hi-C molecules to the reference genome
- Parse the .sam alignments and produce files with Hi-C pairs
- Filter PCR duplicates
- Aggregate pairs into binned matrices of Hi-C interactions
Installation
Requirements:
- java 8
- nextflow version 22.10.x or earlier
- singularity or docker (docker should be able to run without root privileges)
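The Nextflow version requirement can be checked before launching. The sketch below is an illustration, not part of distiller: the helper name `nextflow_version_ok` is hypothetical, and it assumes a version string of the form MAJOR.MINOR.PATCH.

```shell
# Hypothetical helper (not part of distiller): succeed if a Nextflow
# version string such as "22.10.6" is 22.10.x or earlier.
nextflow_version_ok() {
    nf_version="$1"
    nf_major="${nf_version%%.*}"   # text before the first dot
    nf_rest="${nf_version#*.}"     # text after the first dot
    nf_minor="${nf_rest%%.*}"      # text before the second dot

    # Reject missing or non-numeric components with a distinct status.
    case "$nf_major" in ''|*[!0-9]*) return 2 ;; esac
    case "$nf_minor" in ''|*[!0-9]*) return 2 ;; esac

    # Any major release before 22 is old enough.
    if [ "$nf_major" -lt 22 ]; then
        return 0
    fi
    # Within major release 22, accept minor releases up to 10.
    [ "$nf_major" -eq 22 ] && [ "$nf_minor" -le 10 ]
}

# Example with literal version strings.
nextflow_version_ok "22.10.6" && echo "22.10.6 is supported"
nextflow_version_ok "23.04.1" || echo "23.04.1 is too new for the DSL1 pipeline"
```

The installed version can be read from the output of `nextflow -version` and passed to the helper.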
To set up a new project, execute the following command in the project folder:
$ nextflow clone open2c/distiller-nf ./
This will download the distiller pipeline and the configuration files.
Then:
- configure the location of the input files and other project details in project.yml
- configure additional parameters in nextflow.config
- use the provided hardware configurations through the local and cluster profiles, or provide your own through the custom profile
Launch distiller depending on your usage scenario:
- default hardware settings ./configs/local.config with your project.yml:
$ nextflow run distiller.nf -params-file project.yml
- cluster hardware profile ./configs/cluster.config with your project.yml:
$ nextflow run distiller.nf -params-file project.yml -profile cluster
- custom hardware profile with your own configuration file and your project.yml:
$ nextflow run distiller.nf -params-file project.yml -profile custom --custom_config /full/path/to/your.config
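The three launch scenarios above differ only in the profile flags, so a small wrapper can assemble the right command. This is a minimal sketch rather than part of distiller: the function name `distiller_launch_cmd` is hypothetical, and it prints the command instead of running it so that it can be inspected first.

```shell
# Hypothetical helper (not part of distiller): print the launch command
# for a given hardware profile, following the scenarios listed above.
distiller_launch_cmd() {
    dl_profile="${1:-}"         # local, cluster, or custom
    dl_params_file="${2:-}"     # e.g. project.yml
    dl_custom_config="${3:-}"   # only used with the custom profile

    dl_cmd="nextflow run distiller.nf -params-file $dl_params_file"
    case "$dl_profile" in
        local)
            # Default hardware settings: no -profile flag is passed.
            ;;
        cluster)
            dl_cmd="$dl_cmd -profile cluster"
            ;;
        custom)
            if [ -z "$dl_custom_config" ]; then
                echo "custom profile requires a path to a config file" >&2
                return 1
            fi
            dl_cmd="$dl_cmd -profile custom --custom_config $dl_custom_config"
            ;;
        *)
            echo "unknown profile: $dl_profile" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$dl_cmd"
}

# Print the command for each scenario.
distiller_launch_cmd local project.yml
distiller_launch_cmd cluster project.yml
distiller_launch_cmd custom project.yml /full/path/to/your.config
```

Printing the command keeps the wrapper free of side effects; the output can be reviewed and then pasted into the shell.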
Test example
In a new project folder, execute:
$ nextflow clone open2c/distiller-nf ./
$ bash ./test/setup_test.sh
$ nextflow run distiller.nf -params-file ./test/test_project.yml
Nextflow and DSL version note
Distiller was originally designed for the DSL1 syntax of Nextflow. Nextflow dropped DSL1 support starting with the 23.x releases.
We recommend pinning the Nextflow version:
$ conda install -c bioconda "nextflow==22.10"
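As an alternative to conda, the Nextflow launcher can be pinned through its `NXF_VER` environment variable, which makes the launcher run the requested release. The sketch below assumes 22.10.6 as the pinned release; this is an example choice, not a version prescribed by distiller.

```shell
# Pin the Nextflow release for the current shell session.
# 22.10.6 is an assumed example; any 22.10.x release predates the DSL1 removal.
export NXF_VER=22.10.6
echo "Nextflow is pinned to version $NXF_VER for this session"

# Subsequent launches in this session then use the pinned release, e.g.:
#   nextflow run distiller.nf -params-file project.yml
```

The variable only affects the shell in which it is exported, so it has to be set again in each new session or job script.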
Distiller has recently been re-implemented in DSL2: https://github.com/open2c/distiller-nf/tree/distiller_dsl2.
The DSL2 version is still in beta and under testing.