Home

Awesome

Build Status

Synopsis

Get the readcounts at a locus by piping samtools mpileup output. The mpileup can contain one or several samples. This program has been tested on samtools v1.3.1

Install samtools

Compile mpileup2readcounts :

g++ -std=c++11 -O3 mpileup2readcounts.cc -o mpileup2readcounts

Usage

samtools mpileup -f ref.fa -l regions.bed BAM/*.bam | sed 's/		/	* 	*/g' | ./mpileup2readcounts 0 -5 false 3 0

Samtools arguments :

Four options for mpileup2readcounts :

Example output

chrlocrefdepthATCGatcgInsertionDeletiondepthATCGatcgInsertionDeletion
177572814C28032300020NANA800800000NANA
177572817C32202602020NANA800800000NANA
177579643C48009000390NA4:ccccagccctccaggt|2:CCCCAGCCCTCCAGGT900600030NANA

Line content

Common information for all samples:

For each sample :

Using GNU parallel tool

We first remove overlap in the bed using bedtools. Note that the header is removed here:

grep -v '^track' regions.bed | sort -k1,1 -k2,2n | bedtools merge -i stdin | awk '{print $1"\t"$2"\t"$3}' | sed 's/[[:space:]]/:/' | sed 's/[[:space:]]/-/' | parallel --keep-order "samtools mpileup -f ref.fa --region {} test.bam | sed 's/		/	*	*/g' | mpileup2readcounts 0 -5 false 0 | tail -n +2"