aCNViewer

comprehensive genome-wide visualization of absolute copy number and copy neutral variations

Contact: Victor Renault / Alexandre How-Kit (aCNViewer@cephb.fr)

aCNViewer (Absolute CNV Viewer) is a tool for visualizing absolute CNVs and cn-LOH across a group of cancer samples. aCNViewer offers three graphical representations: dendrograms, bidimensional heatmaps that show chromosomal regions sharing similar abnormality patterns, and quantitative stacked histograms that facilitate the identification of recurrent absolute CNVs and cn-LOH. aCNViewer includes a complete pipeline for processing raw data from SNP arrays (in tumor-only or paired tumor / normal mode) and whole exome/genome sequencing experiments (in paired tumor / normal mode only), using the ASCAT and Sequenza algorithms respectively to generate the absolute CNV and cn-LOH data used for the graphical outputs.

Installation

Docker installation

The easiest way to install aCNViewer is to pull its Docker image (the Docker application supports multi-threading but not computer clusters, which are better suited for processing NGS bams): docker pull fjdceph/acnviewer

The aCNViewer Docker image requires about 20GB of disk space. If you run into an error while pulling the image, you probably need to change the location of Docker images from /var/lib/docker/ to a location with more space and try again.
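Before pulling the image, it can help to check how much free space is available. The snippet below is a minimal sketch, not part of aCNViewer: the function name `has_enough_space` is hypothetical, and the 20GB threshold is simply the approximate figure quoted above.

```python
import shutil

# Approximate size of the aCNViewer Docker image, as stated above.
REQUIRED_GB = 20


def has_enough_space(path, required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb
```

For example, `has_enough_space("/var/lib/docker")` checks the default Docker storage location, provided that directory exists on your machine.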

Installation from source

aCNViewer can also be installed from its source by:

  1. downloading aCNViewer's data (includes test data sets and most of the third-party software listed in the dependencies section)
  2. installing the dependencies listed below.
  3. downloading the github source code from this page: git clone https://github.com/FJD-CEPH/aCNViewer

Installation validation

Once aCNViewer is installed, you can run unit tests in order to check that everything is fine.

Dependencies:

Most of the dependencies (except R and python), along with test data sets, are packaged in the archive aCNViewer_DATA.tar.gz in aCNViewer_DATA/bin. You can find more details below:

Overview:

Overview of aCNViewer:


Tutorial

The results of all the examples below can be found in aCNViewer_DATA/allTests in their respective target folder. <a id="unitTests"></a>All examples of this tutorial are implemented as unit tests and can be run at once using: <a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -P testAll -t TARGET_DIR [--fastTest 0 --smallMem 0 --runGISTIC 0].

If --fastTest is set to 1, only tests which run in a reasonable amount of time will be run (all tests except Illumina SNP array, paired bams with Sequenza, GISTIC and Affymetrix SNP arrays from CEL files). If --runGISTIC is set to 1, GISTIC will be tested. If --smallMem is set to 1, GISTIC will run in small memory mode and will require only about 10GB of RAM instead of 50GB, at the expense of a longer running time.

Glossary:

Let's call:

Requirements:

Download the test data set aCNViewer_DATA.tar.gz (~5GB compressed and ~20GB uncompressed). In terms of computing resources: if you plan to:

Processing SNP array data

Affymetrix

TestAffyAscat

<a id="allPlots"></a>Generate all available plots from ASCAT segment files using base resolution for the quantitative histograms and using a window size of 2Mbp for the other plots:<br>

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrays250k_sty/GSE9845_lrr_baf.segments.txt -t TEST_AFFY --refBuild hg18 -w 2000000 -b aCNViewer_DATA/bin --sampleFile aCNViewer_DATA/snpArrays250k_sty/GSE9845_clinical_info2.txt
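When many variants of this command are needed (different output folders, window sizes or extra options), it can be convenient to assemble the argument list programmatically. The following is a minimal sketch: the helper names `build_plot_args` and `to_command_line` are hypothetical and not part of aCNViewer, and `DOCKER_OR_PYTHON` remains a placeholder for whichever launcher you use. Only options shown in this tutorial are used.

```python
import shlex


def build_plot_args(segment_file, target_dir, ref_build, bin_dir,
                    window_size=None, sample_file=None, extra=None):
    """Build the option list that follows DOCKER_OR_PYTHON on the command line."""
    args = ["-f", segment_file, "-t", target_dir,
            "--refBuild", ref_build, "-b", bin_dir]
    if window_size is not None:
        args += ["-w", str(window_size)]
    if sample_file is not None:
        args += ["--sampleFile", sample_file]
    # Any additional option/value pairs, e.g. {"-G": "BCLC stage"}.
    for option, value in (extra or {}).items():
        args += [option, str(value)]
    return args


def to_command_line(launcher, args):
    """Join launcher and options into a shell-safe string (values with spaces get quoted)."""
    return " ".join(shlex.quote(part) for part in list(launcher) + args)
```

Calling `to_command_line(["DOCKER_OR_PYTHON"], build_plot_args(...))` yields a string in which values such as "BCLC stage" are quoted for the shell.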

quantitative stacked histogram example:

Histogram of heterozygous / homozygous CNVs:

Here are other typical plots you may be interested in:

<a id="customColors"></a><u>Customize colors:</u>

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrays250k_sty/GSE9845_lrr_baf.segments.txt -t TEST_AFFY_RCOLOR --refBuild hg18 -w 2000000 -b aCNViewer_DATA/bin --sampleFile aCNViewer_DATA/snpArrays250k_sty/GSE9845_clinical_info2.txt --rColorFile aCNViewer_DATA/rColor.txt

<a id="gisticExample"></a><u>Quantitative histogram with GISTIC results:</u>

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrays250k_sty/GSE9845_lrr_baf.segments.txt -t TEST_AFFY_GISTIC --refBuild hg18 -w 2000000 -b aCNViewer_DATA/bin --runGISTIC 1

<b>If you have trouble running this example</b> (in particular if your machine freezes or you get the message "Killed" in the "_gistic.txt.err" file), the machine you are using may lack resources. In that case, please add the option --smallMem 1 to the command above so that GISTIC runs in compressed memory mode. You can view the GISTIC results with significant broad events and significant focal events.
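The "Killed" check described above can be automated. This is a minimal sketch with a hypothetical helper name (`gistic_was_killed`). It assumes the "_gistic.txt.err" file is written somewhere under the target directory; the exact location is not documented here, so the search is recursive.

```python
from pathlib import Path


def gistic_was_killed(target_dir):
    """Return True if any *_gistic.txt.err file under target_dir contains 'Killed'."""
    for err_file in Path(target_dir).rglob("*_gistic.txt.err"):
        # errors="replace" avoids failures on unexpected bytes in the log.
        if "Killed" in err_file.read_text(errors="replace"):
            return True
    return False
```

If `gistic_was_killed("TEST_AFFY_GISTIC")` returns True, rerunning the command with --smallMem 1 is the suggested remedy.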

<a id="heatmapRel"></a><u>Heatmap of relative copy number values only for the clinical feature BCLC stage, with the chromosome legend position set at 0,.55 (i.e. at the far left of the graph and at 55% on the y axis) and the group legend position set at .9,1.05 (roughly the top right corner):</u>

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrays250k_sty/GSE9845_lrr_baf.segments.txt -t TEST_AFFY_HEATMAP1 --refBuild hg18 -w 2000000 -b aCNViewer_DATA/bin --sampleFile aCNViewer_DATA/snpArrays250k_sty/GSE9845_clinical_info2.txt --plotAll 0 --heatmap 1 --dendrogram 0 -G "BCLC stage" --chrLegendPos 0,.55 --groupLegendPos .9,1.05 --useRelativeCopyNbForClustering 1

Heatmap of relative copy number values using the clinical feature BCLC stage:

<a id="heatmapGenPos"></a><u>Heatmap with regions ordered by genomic positions (only clustering on samples):</u>

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrays250k_sty/GSE9845_lrr_baf.segments.txt -t TEST_AFFY_HEATMAP_GENPOS --refBuild hg18 -w 2000000 -b aCNViewer_DATA/bin --sampleFile aCNViewer_DATA/snpArrays250k_sty/GSE9845_clinical_info2.txt --plotAll 0 --heatmap 1 --dendrogram 0 -G "BCLC stage" --chrLegendPos 0,.55 --groupLegendPos .9,1.05 --useRelativeCopyNbForClustering 1 --keepGenomicPosForHistogram 1

Heatmap of relative copy number values with regions ordered by genomic positions using the clinical feature BCLC stage:

<a id="heatmapCNV"></a><u>Heatmap with copy number values:</u>

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrays250k_sty/GSE9845_lrr_baf.segments.txt -t TEST_AFFY_HEATMAP2 --refBuild hg18 -w 2000000 -b aCNViewer_DATA/bin --sampleFile aCNViewer_DATA/snpArrays250k_sty/GSE9845_clinical_info2.txt --plotAll 0 --heatmap 1 --dendrogram 0 -G "BCLC stage" --chrLegendPos 0,.55 --groupLegendPos .9,1.05

Heatmap of copy number values using the clinical feature BCLC stage:

<a id="dendroExample"></a><u>Dendrogram with copy number values:</u>

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrays250k_sty/GSE9845_lrr_baf.segments.txt -t TEST_AFFY_DENDRO --refBuild hg18 -w 2000000 -b aCNViewer_DATA/bin --sampleFile aCNViewer_DATA/snpArrays250k_sty/GSE9845_clinical_info2.txt --plotAll 0 --heatmap 0 --dendrogram 1 -G "BCLC stage" -u 1

Dendrogram of copy number values using the clinical feature BCLC stage:

<a id="outputFormatExamples"></a><u>Customize output formats:</u>

==Here is the full command:==

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f ASCAT_SEGMENT_FILE --refBuild REF_BUILD -b <a href="#binDir">BIN_DIR</a> [--histogram HISTOGRAM --lohToPlot LOH_TO_PLOT --useFullResolutionForHist USE_FULL_RESOLUTION_FOR_HIST] [-c CHR_SIZE_FILE -t OUTPUT_DIR -C CENTROMERE_FILE -w WINDOW_SIZE --sampleFile SAMPLE_FILE -G PHENOTYPIC_COLUMN_NAME --rColorFile RCOLOR_FILE --plotAll PLOT_ALL --outputFormat OUTPUT_FORMAT --ploidyFile PLOIDY_FILE --sampleToProcessList SAMPLE_TO_PROCESS_LIST --sampleToExcludeList SAMPLE_TO_EXCLUDE_LIST --sampleAliasFile SAMPLE_ALIAS_FILE] [--heatmap HEATMAP --labRow LAB_ROW --labCol LAB_COL --cexCol CEX_COL --cexRow CEX_ROW --height HEIGHT --width WIDTH --margins MARGINS --hclust HCLUST --groupLegendPos GROUP_LEGEND_POS --chrLegendPos CHR_LEGEND_POS --useRelativeCopyNbForClustering USE_RELATIVE_COPY_NB_FOR_CLUSTERING --keepGenomicPosForHistogram KEEP_GENOMIC_POS] [--dendrogram DENDROGRAM --useShape USE_SHAPE] [--runGISTIC RUN_GISTIC --geneGistic GENE_GISTIC --smallMem SMALL_MEM --broad BROAD --brLen BR_LEN --conf CONF --armPeel ARM_PEEL --saveGene SAVE_GENE --gcm GCM]<br> where:

<a id="generalPlotOptions"></a>The following options are general plotting options:

<a id="histogramOptions"></a>The following options are histogram specific:

<a id="gisticOptions"></a>The following options are GISTIC options (more details can be found here):

<a id="heatmapDendroOptions"></a>The following options are mainly specific to heatmaps while a few are related to dendrograms:

TestAffyCel

Generate a quantitative stacked histogram from CEL files (subset of data of hepatocellular carcinomas with hepatitis C virus etiology used in Chiang et al. Cancer Res, 2008) with a window size of 2Mbp:

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrays250k_sty/ -t TEST_AFFY_CEL --refBuild hg18 -w 2000000 -b aCNViewer_DATA/bin --platform Affy250k_sty -l aCNViewer_DATA/snpArrays250k_sty/LibFiles/ <a href="#useCustomPloidies">[--useCustomPloidies USE_CUSTOM_PLOIDIES]</a>

If ASCAT is not installed (i.e. you are not using the docker application) and you want to install it into a custom R library folder, please add the following option to the previous command line: --rLibDir RLIB.

==Here is the full command:==

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f CEL_DIR --refBuild REF_BUILD -t OUTPUT_DIR -b <a href="#binDir">BIN_DIR</a> --platform AFFY_PLATFORM -l AFFY_LIB_DIR [--gw6Dir GW6_DIR] [--gcFile ASCAT_GC_FILE] [GENERAL_PLOT_OPTIONS] [HISTOGRAM_OPTIONS] [GISTIC_OPTIONS] [HEATMAP_DENDRO_OPTIONS]<br> where:

Illumina

TestIllu660k

Generate a quantitative stacked histogram from raw Illumina data from non-Hodgkin lymphoma patients used in Yang F et al. PLoS One 2014 with a window size of 2Mbp:

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/snpArrayIllu660k/GSE47357_Matrix_signal_660w.txt.gz -t TEST_ILLU --refBuild hg19 -w 2000000 -b aCNViewer_DATA/bin --probeFile aCNViewer_DATA/snpArrayIllu660k/Human660W-Quad_v1_H_SNPlist.txt --platform Illumina660k --beadchip "human660w-quad"

==Here is the full command:==

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f ILLU_FILES --refBuild REF_BUILD -b <a href="#binDir">BIN_DIR</a> [--sampleList SAMPLE_TO_PROCESS_FILE] --probeFile PROBE_POS_FILE --platform ILLUMINA_PLATFORM [--beadchip BEADCHIP] [-g ASCAT_GC_FILE] [-N NORMALIZE] [GENERAL_PLOT_OPTIONS] [HISTOGRAM_OPTIONS] [GISTIC_OPTIONS] [HEATMAP_DENDRO_OPTIONS]<br> where:

NGS

Sequenza is used to process paired (tumor / normal) NGS bams and produce CNV segments. These segments are then used by aCNViewer to produce the different available outputs. This step is best executed on a computer cluster but will also work on a single machine, although much more slowly. Supported clusters are <a id="supportedClusters">SGE, SLURM, MOAB and LSF</a>; tests have been run successfully on SGE and SLURM clusters.

testSequenzaRaw

Generate a quantitative histogram from paired (tumor / normal) bams:

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/wes/bams/ -t TEST_WES_RAW --refBuild hg19 -w 2000000 -b aCNViewer_DATA/bin --fileType Sequenza --samplePairFile aCNViewer_DATA/wes/bams/sampleFile.txt <a href="#useCustomPloidies">[--useCustomPloidies USE_CUSTOM_PLOIDIES]</a>

==Here is the full command:==

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f BAM_DIR -t OUTPUT_DIR --refBuild REF_BUILD -b <a href="#binDir">BIN_DIR</a> --fileType Sequenza --samplePairFile SAMPLE_PAIR_FILE [-r REF_FILE] [--byChr 1] [-n NB_THREADS] [--createMpileUp CREATE_MPILEUP] [--pattern BAM_FILE_PATTERN] [-M MEMORY] [GENERAL_PLOT_OPTIONS] [HISTOGRAM_OPTIONS] [GISTIC_OPTIONS] [HEATMAP_DENDRO_OPTIONS]<br> where:

TestSequenzaCNVs

Generate a quantitative stacked histogram from Sequenza results with a window size of 2Mbp:<br>

aCNViewer_DATA.tar.gz is required to run this example.

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/wes/ -t TEST_WES_SEQUENZA --refBuild hg19 -w 2000000 -b aCNViewer_DATA/bin --fileType Sequenza

==Here is the full command:==

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f SEQUENZA_RES_DIR --fileType Sequenza -t TARGET_DIR --refBuild REF_BUILD -b <a href="#binDir">BIN_DIR</a> [GENERAL_PLOT_OPTIONS] [HISTOGRAM_OPTIONS] [GISTIC_OPTIONS] [HEATMAP_DENDRO_OPTIONS]<br> where:

Processing CNV files

At the moment, ASCAT segment files, PennCNV results and Sequenza results can be used as input to aCNViewer. It is however possible to feed aCNViewer with CNV results from any other software, as explained in the section below.

Both examples below require downloading aCNViewer_DATA.tar.gz.

Ascat file generated from Affymetrix SNP arrays

PennCNV

Generate a quantitative stacked histogram from PennCNV results (79 samples from Hapmap3):<br>

<a href="#dockerOrPython">DOCKER_OR_PYTHON</a> -f aCNViewer_DATA/pennCNV/hapmap3.rawcnv -t TEST_PENN_CNV --refBuild hg18 -b aCNViewer_DATA/bin --lohToPlot none

Sequenza segments

OtherCNVformats

CNV results from any software can be processed by aCNViewer if formatted in the ASCAT segment file format i.e. a tab-delimited file with the following columns:

The result file should be sorted by the following columns, in this order: sample, chr, startpos, endpos. Chromosome names in the chr column should not contain the prefix chr, so chr1 should appear as 1. All CNVs for one individual should be non-overlapping. If there is only a global CNV value v (and thus no allele-specific CNV value), nMajor and nMinor can take any value as long as nMajor + nMinor = v. When plotting the quantitative histogram, add the option --lohToPlot none to disable LOH plotting.
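The formatting rules above can be sketched as a small conversion script. This is an illustration under stated assumptions, not aCNViewer code: the helper names are hypothetical, the input is assumed to be tuples of (sample, chromosome, start, end, total copy number), numeric chromosomes are assumed to sort in numeric order before the others, and coordinates are assumed to be inclusive (so a segment starting at the previous segment's end counts as overlapping). Since only a global copy number is available, nMajor and nMinor are split arbitrarily so that their sum equals the total.

```python
import csv

ASCAT_COLUMNS = ["sample", "chr", "startpos", "endpos", "nMajor", "nMinor"]


def chrom_key(chrom):
    """Sort key: numeric chromosomes in numeric order, then the rest (assumed ordering)."""
    return (0, int(chrom), "") if chrom.isdigit() else (1, 0, chrom)


def to_ascat_rows(cnvs):
    """Convert (sample, chrom, start, end, total_cn) tuples into sorted ASCAT-style rows."""
    rows = []
    for sample, chrom, start, end, total in cnvs:
        chrom = str(chrom)
        if chrom.lower().startswith("chr"):
            chrom = chrom[3:]  # chr1 -> 1
        n_minor = total // 2  # arbitrary split: only the sum matters here
        n_major = total - n_minor
        rows.append((sample, chrom, int(start), int(end), n_major, n_minor))
    # Sort by sample, chr, startpos, endpos as required.
    rows.sort(key=lambda r: (r[0], chrom_key(r[1]), r[2], r[3]))
    # CNVs of one sample must not overlap.
    for prev, cur in zip(rows, rows[1:]):
        if prev[:2] == cur[:2] and cur[2] <= prev[3]:
            raise ValueError(f"overlapping CNVs for sample {cur[0]} on chr {cur[1]}")
    return rows


def write_ascat(rows, handle):
    """Write the rows as a tab-delimited file with the ASCAT segment header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(ASCAT_COLUMNS)
    writer.writerows(rows)
```

A file produced this way carries no allele-specific information, so it should be plotted with --lohToPlot none.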

OutputFiles

ASCAT

When processing raw SNP array data with aCNViewer, ASCAT is used to calculate CNV profiles. These results are saved into a folder named ASCAT in the user selected target directory with the following files:

| File | Description |
| --- | --- |
| .ASCATprofile.png | genome-wide representation of ASCAT CNVs |
| .ASPCF.png | results of segmentation using Allele-Specific Piecewise Constant Fitting |
| .rawprofile.png | genome-wide representation of raw ASCAT CNVs |
| .sunrise.png | sunrise plot showing the optimal solution of tumor ploidy and percentage of aberrant tumor |
| .tumour.png | representation of LogR and BAF values |
| tumorSep*.png | plot of BAF values |
| .ascatInfo.txt | ASCAT values of aberrantcellfraction, goodnessOfFit, psi and ploidy for all samples |
| .segments.txt | list of all CNVs with the copy number for each allele |

GISTIC outputs

For the full list of GISTIC output files, please refer to the section Output Files of the following website. Here are the main output files of interest:

| File | Description |
| --- | --- |
| broad_significance_results.txt | list of broad events with related q-values and frequencies |
| all_lesions.conf_*.txt | list of all focal events along with their level of significance |
| amp_* | list of all focal amplification events |
| del_* | list of all focal deletion events |

Sequenza

The Sequenza results of each sample pair are stored in a folder named TUMOR_NORMAL_sequenza in the sequenza folder. This folder contains the following files:

| File | Description |
| --- | --- |
| *_segments.txt | predicted CNVs |
| *_CP_contours.pdf, *_confints_CP.txt & *_model_fit.pdf | inferred cellularity and ploidy |
| *_alternative_fit.pdf & *_alternative_solutions.txt | alternative inferred cellularities and ploidies |
| *_chromosome_view.pdf | chromosome view with mutations, BAF, depth ratio and segments |
| *_genome_view.pdf | genome view of all the CNVs |
| *_mutations.txt | list of detected mutations |
| *_CN_bars.pdf | frequency of all the copy number values |

For more information about Sequenza output files, please refer to its user guide.
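To locate the predicted CNV files of every sample pair, the folder layout described above can be walked with a few lines of Python. This is a minimal sketch with a hypothetical helper name (`find_sequenza_segments`); it only assumes the naming given above, i.e. one folder ending in _sequenza per tumor / normal pair, containing files ending in _segments.txt.

```python
from pathlib import Path


def find_sequenza_segments(sequenza_dir):
    """Map each per-pair '*_sequenza' folder name to its sorted list of *_segments.txt files."""
    result = {}
    for pair_dir in sorted(Path(sequenza_dir).glob("*_sequenza")):
        if pair_dir.is_dir():
            result[pair_dir.name] = sorted(pair_dir.glob("*_segments.txt"))
    return result
```

The returned dictionary makes it easy to spot pairs for which no segment file was produced (their list is empty).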

HistogramOutputs

When generating histograms, 3 text files with the suffix _samples.txt will also be created:

Each file is in the same format with the following columns:

The following files are created as well:

Dendrograms and heatmaps

2 folders (relCopyNb and rawCopyNb) will be created and will respectively contain graphs generated from relative copy number values and raw copy number values.

Limitations

aCNViewer has a few limitations including the fact that it does not currently account for intra-tumor heterogeneity. Indeed, having a simultaneous view on the copy number landscape along with the clonality status of these events could help better understand the mechanisms of a disease. Another current limitation of aCNViewer is the absence of a function to compare two groups of samples. One simple way to do that, though, would be to generate the quantitative histograms for both groups separately and compare these plots (as we did in Fig 2 of the article below).

Citation

aCNViewer: comprehensive genome-wide visualization of absolute copy number and copy neutral variations. Victor Renault, Jörg Tost, Fabien Pichon, Shu-Fang Wang-Renault, Eric Letouzé, Sandrine Imbeaud, Jessica Zucman-Rossi, Jean-François Deleuze & Alexandre How-Kit. PLoS One. 2017 Dec 19;12(12):e0189334. doi: 10.1371/journal.pone.0189334. eCollection 2017.