gatk4-HaplotypeCaller-nf

Runs the GATK4 HaplotypeCaller step in gVCF mode. This is the first step towards whole-cohort joint genotyping, following the GATK Best Practices (step "Call Variants Per-Sample").

<img src="https://us.v-cdn.net/5019796/uploads/editor/mz/tzm69d8e2spl.png" width="800" />

Description

Small pipeline that calls variants from recalibrated BAM files on a per-sample basis and stores the resulting gVCF files. The pipeline uses a scatter-gather strategy. A subsequent pipeline performs the full-cohort calling using all the gVCF files.
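
The scatter-gather idea can be illustrated with a minimal dry-run sketch. It is not the pipeline's actual implementation, which is not shown here. It assumes that the work is scattered by genomic interval and that the per-interval gVCF shards are gathered with `MergeVcfs`. The function names `scatter_cmds` and `gather_cmd`, the interval names and the output file names are hypothetical. The script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run sketch of per-sample scatter-gather.
# It only prints commands; nothing is executed by GATK.
set -euo pipefail

GATK="${GATK:-gatk}"                              # assumed path to the GATK4 launcher
REF="${REF:-/ref/Homo_sapiens_assembly38.fasta}"  # reference taken from the usage example

# Scatter: print one HaplotypeCaller command (gVCF mode) per interval.
scatter_cmds() {
  local bam="$1" sample="$2"; shift 2
  local i=0 interval
  for interval in "$@"; do
    echo "$GATK HaplotypeCaller -R $REF -I $bam -L $interval -ERC GVCF -O ${sample}.${i}.g.vcf.gz"
    i=$((i + 1))
  done
}

# Gather: print one MergeVcfs command that joins the n shards in order.
gather_cmd() {
  local sample="$1" n="$2" inputs="" i
  for ((i = 0; i < n; i++)); do
    inputs+=" -I ${sample}.${i}.g.vcf.gz"
  done
  echo "$GATK MergeVcfs${inputs} -O ${sample}.g.vcf.gz"
}

# Example: two illustrative intervals for one sample.
scatter_cmds /data/test_1.bam test_1 chr1 chr2
gather_cmd test_1 2
```

In a real run the scattered commands would execute in parallel, for example as separate cluster jobs, and the gather step would start only once every shard has finished.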

Dependencies

  1. This pipeline is based on Nextflow. As we have several Nextflow pipelines, we have centralized the common information in the IARC-nf repository. Please read it carefully, as it contains essential information for the installation, basic usage and configuration of Nextflow and our pipelines.
  2. GATK4 executables
  3. Picard Tools

Input

A nextflow.config file is also included; modify it to suit environments other than our pre-configured clusters (see Nextflow configuration).

Usage for Cobalt cluster

nextflow run iarcbioinfo/gatk4-HaplotypeCaller-nf -profile cobalt --input "/data/test_*.bam" --output_dir myGVCFs --ref_fasta /ref/Homo_sapiens_assembly38.fasta --gatk_exec /bin/gatk-4.0.4.0/gatk --interval_list target.list