Hi-C EMT

Hi-C Extraction and Manipulation Toolkit

Tools for extracting data from .hic files to build new, compact files that use the latest file version. Files can be subsampled, and regions from different files can be stitched together.

Excise

Usage

excise [-r resolution] [-c chromosomes] [--seed random_seed] [--subsample num_contacts] 
       [--cleanup] [--only-intra] <file> <out_folder>

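The required arguments are <file> (the input .hic file) and <out_folder> (the output folder).

The optional arguments are -r, -c, --seed, --subsample, --cleanup, and --only-intra, as listed in the usage line above.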

Example

To subsample GM12878_30.hic into a map with a depth of ~5 million Hi-C contacts that goes down to 25 kb resolution, use:

java -Xmx5g -jar hic_emt.jar excise -r 25000 --subsample 5000000 /Desktop/files/GM12878_30.hic gm_file_5M

To subsample only the first 3 chromosomes at this approximate depth (i.e., ~5 million contacts genome-wide, but ~600,000 contacts once filtered to the first 3 chromosomes):

java -Xmx5g -jar hic_emt.jar excise -r 25000 -c 1,2,3 --subsample 5000000 /Desktop/files/GM12878_30.hic gm_file_5M
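The usage line also lists a --seed option, which suggests that a subsample can be repeated with different random seeds. The following is a minimal dry-run sketch, assuming --seed accepts an integer; the seed values (1, 2, 3) and the output folder names (gm_file_5M_seed1, ...) are hypothetical and not part of the toolkit. It only prints the commands rather than running them.

```shell
#!/bin/sh
# Dry-run sketch: print one excise command per random seed.
# Assumption: --seed takes an integer. The seeds and the output
# folder names below are hypothetical, chosen for illustration.
set -eu

input="/Desktop/files/GM12878_30.hic"   # input path from the example above

build_excise_cmd() {
  # $1 = random seed; only flags from the excise usage line are used.
  printf 'java -Xmx5g -jar hic_emt.jar excise -r 25000 --seed %s --subsample 5000000 %s gm_file_5M_seed%s\n' \
    "$1" "$input" "$1"
}

for seed in 1 2 3; do
  build_excise_cmd "$seed"   # prints the command only
done
```

To actually run the commands, the output of the script can be piped to sh once the printed lines look right.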

Stitch

Usage

stitch [-r resolution] [-k NONE/VC/VC_SQRT/KR/SCALE] [--reset-origin] [--cleanup]
       <file1,file2,...> <name1,name2,...> <chr1:x1:y1,chr2:x2:y2,...> <out_folder>

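The required arguments are <file1,file2,...>, <name1,name2,...>, <chr1:x1:y1,chr2:x2:y2,...>, and <out_folder>. The first three are comma-separated lists.

The optional arguments are -r, -k, --reset-origin, and --cleanup, as listed in the usage line above.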

Example

To grab a subset of KR-normalized reads from chromosomes 1, 2, and 3 of three .hic files and put them in one file:

java -Xmx5g -jar hic_emt.jar stitch -r 25000 -k KR GM12878.hic,K562.hic,Hap1.hic GM,K562,Hap1 \
                 1:100000000:110005000,2:115000000:125000000,3:80010000:90005000 results
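Because stitch takes three parallel comma-separated lists, it is easy for a file, its name, and its region to drift out of alignment when the lists are edited by hand. The following dry-run sketch builds the three lists one entry at a time and prints the resulting command. The helper add_region is a hypothetical name introduced here for illustration; the files, names, and regions are those of the example above.

```shell
#!/bin/sh
# Dry-run sketch: assemble the three comma-separated stitch arguments
# from (file, name, region) triplets so that they stay aligned.
# add_region is a hypothetical helper, not part of the toolkit.
set -eu

files=""
names=""
regions=""

add_region() {
  # $1 = .hic file, $2 = name, $3 = region written as chr:x:y
  # ${var:+$var,} adds a separating comma only when var is non-empty.
  files="${files:+$files,}$1"
  names="${names:+$names,}$2"
  regions="${regions:+$regions,}$3"
}

add_region GM12878.hic GM   1:100000000:110005000
add_region K562.hic    K562 2:115000000:125000000
add_region Hap1.hic    Hap1 3:80010000:90005000

# Print the command instead of running it.
echo "java -Xmx5g -jar hic_emt.jar stitch -r 25000 -k KR $files $names $regions results"
```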

Info

Usage

info <hic_file>

The required argument is <hic_file>, the .hic file to inspect.

Example

To validate a specific .hic file and print its general information (genome, chromosomes, resolutions, normalizations, etc.):

java -Xmx5g -jar hic_emt.jar info /Desktop/files/GM12878_30.hic