Allele-Specific Copy Number Analysis of Tumors

Description

This repository provides the ASCAT R package (v3.2.0) that can be used to infer tumour purity, ploidy and allele-specific copy number profiles.

ASCAT is described in detail in: Allele-specific copy number analysis of tumors. Van Loo P et al. PNAS (2010).

This repository also contains the code underlying an additional publication: Allele-specific multi-sample copy number segmentation. Ross EM, Haase K, Van Loo P & Markowetz F. Bioinformatics (2020).

Installation (v3.2.0)

Bioconductor package dependencies: GenomicRanges & IRanges (BiocManager::install(c('GenomicRanges','IRanges')) with a recent R/BiocManager version).

Processing high-throughput sequencing data: alleleCounter (C version)

Installing ASCAT using R: devtools::install_github('VanLoo-lab/ascat/ASCAT')
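
The installation steps above can be sketched as a single R function. This is a minimal sketch, assuming BiocManager and devtools are the installers you want to use; `missing_packages` and `install_ascat` are hypothetical helpers introduced here for illustration and are not part of ASCAT. Nothing is installed until `install_ascat()` is called.

```r
# Hypothetical helper (not part of ASCAT): return the packages in `pkgs`
# whose namespace cannot be loaded from the current R library.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Hypothetical wrapper around the install commands listed in this README.
install_ascat <- function() {
  # Installer packages from CRAN, only if they are not already present
  for (pkg in missing_packages(c("BiocManager", "devtools"))) {
    install.packages(pkg)
  }
  # Bioconductor dependencies named above
  bioc_missing <- missing_packages(c("GenomicRanges", "IRanges"))
  if (length(bioc_missing) > 0) {
    BiocManager::install(bioc_missing)
  }
  # ASCAT itself, from the ASCAT subdirectory of the repository
  devtools::install_github("VanLoo-lab/ascat/ASCAT")
}
```

Calling `install_ascat()` once in a fresh R session should be enough; alleleCounter is a separate C program and is not installed by this sketch.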

Changes since v2.5.3

Major changes:

Minor changes:

New features in v3:

Testing

We provide some scripts and input data in the ExampleData folder.

Reference files

All reference files are hosted on Zenodo.

Supported arrays without matched germline

Custom10k, IlluminaASA, IlluminaGSAv3, Illumina109k, IlluminaCytoSNP, IlluminaCytoSNP850k, Illumina610k, Illumina660k, Illumina700k, Illumina1M, Illumina2.5M, IlluminaOmni5, IlluminaOmniExpressExome, IlluminaGDACyto-8, Affy10k, Affy100k, Affy250k_sty, Affy250k_nsp, AffyOncoScan, AffyCytoScanHD, AffySNP6, HumanCNV370quad, HumanCore12, HumanCoreExome24 and HumanOmniExpress12.

Because arrays have a defined set of SNP probes, with a fairly constant rate of heterozygous probes across individuals, the metrics used by ascat.predictGermlineGenotypes can be inferred from a few cases with no or very few copy-number changes. Sequencing data, however, vary widely with design, coverage and artefacts, so we are not able to provide pre-defined metrics for unmatched sequencing data.

We now provide a preset for WGS data, called WGS_hg38_50X, under specific conditions: hg38 assembly and >50x coverage. This preset might work in other conditions, but it has not been extensively benchmarked outside that scope, so we do not provide any guarantee there.
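
As an illustration of how such a preset might be used for a tumour without matched germline, here is a minimal sketch. It assumes ASCAT v3 is installed, that tumour LogR and BAF files have already been generated, and that the preset name is passed through the `platform` argument of ascat.predictGermlineGenotypes. The wrapper `run_unmatched_wgs` and its argument names are hypothetical; check the package documentation for the exact arguments before relying on it.

```r
# Hypothetical wrapper (not part of ASCAT): segment an unmatched WGS tumour
# using germline genotypes predicted with the WGS_hg38_50X preset.
run_unmatched_wgs <- function(tumour_logr_file, tumour_baf_file, gender = "XX") {
  library(ASCAT)

  # Load tumour-only data; no germline LogR/BAF files are supplied
  ascat.bc <- ascat.loadData(
    Tumor_LogR_file = tumour_logr_file,
    Tumor_BAF_file = tumour_baf_file,
    gender = gender,
    genomeVersion = "hg38"
  )

  # Predict germline genotypes with the preset described above
  # (intended for hg38 data at >50x coverage)
  gg <- ascat.predictGermlineGenotypes(ascat.bc, platform = "WGS_hg38_50X")

  # Segment the data, passing the predicted genotypes
  ascat.aspcf(ascat.bc, ascat.gg = gg)
}
```

For array data, the same sketch should apply with `platform` set to one of the array names listed above.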

Misc

For more information about ASCAT and other projects of our group, please visit our website.

Changes to let ASCAT run on long-read data:

ascat.prepareHTS(
  tumourseqfile = tumour_BAM,
  normalseqfile = normal_BAM,
  tumourname = name_tumour,
  normalname = name_normal,
  allelecounter_exe = allelecounter,
  skip_allele_counting_normal = FALSE,
  skip_allele_counting_tumour = FALSE,
  alleles.prefix = G1000_alleles_hg38_chr,
  loci.prefix = G1000_loci_hg38_chr,
  gender = gender,
  genomeVersion = "hg38",
  nthreads = 12,
  tumourLogR_file = "Tumor_LogR.txt",
  tumourBAF_file = "Tumor_BAF.txt",
  loci_binsize = 500,
  min_base_qual = 10,
  # "-f 0" is passed to alleleCounter as its required-flag setting
  additional_allelecounter_flags = "-f 0")
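
The call above relies on several variables being defined beforehand. A minimal setup sketch follows; every path and sample name in it is a placeholder chosen for illustration, not a value shipped with ASCAT, and should be replaced with your own.

```r
# All values below are placeholders; replace them with your own paths and names.
tumour_BAM <- "/path/to/tumour.bam"
normal_BAM <- "/path/to/normal.bam"
name_tumour <- "tumour_sample"
name_normal <- "normal_sample"
allelecounter <- "/path/to/alleleCounter"

# Prefixes of the per-chromosome allele and loci reference files
G1000_alleles_hg38_chr <- "/path/to/G1000_alleles_hg38/G1000_alleles_hg38_chr"
G1000_loci_hg38_chr <- "/path/to/G1000_loci_hg38/G1000_loci_hg38_chr"

gender <- "XX"  # "XX" or "XY"

# Report any input file that does not exist before launching the long run
inputs <- c(tumour_BAM, normal_BAM, allelecounter)
missing_inputs <- inputs[!file.exists(inputs)]
if (length(missing_inputs) > 0) {
  message("Missing input files: ", paste(missing_inputs, collapse = ", "))
}
```

Checking the inputs first is a small design choice: allele counting on whole-genome BAM files can take a long time, so it helps to catch a wrong path before the run starts.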