Home

Awesome

MEGAHIT

BioConda Install Downloads Build Status codecov

MEGAHIT is an ultra-fast and memory-efficient NGS assembler. It is optimized for metagenomes, but also works well on generic single genome assembly (small or mammalian size) and single-cell assembly.

Installation

Conda

conda install -c bioconda megahit

Guix

guix install megahit

Pre-built binaries for x86_64 Linux

wget https://github.com/voutcn/megahit/releases/download/v1.2.9/MEGAHIT-1.2.9-Linux-x86_64-static.tar.gz
tar zvxf MEGAHIT-1.2.9-Linux-x86_64-static.tar.gz
cd MEGAHIT-1.2.9-Linux-x86_64-static/bin/
./megahit --test  # run on a toy dataset
./megahit -1 MY_PE_READ_1.fq.gz -2 MY_PE_READ_2.fq.gz -o MY_OUTPUT_DIR

Pre-built docker image

# in the directory with the input reads
docker run -v $(pwd):/workspace -w /workspace --user $(id -u):$(id -g) vout/megahit \
  megahit -1 MY_PE_READ_1.fq.gz -2 MY_PE_READ_2.fq.gz -o MY_OUTPUT_DIR

Building from source

Prerequisites

git clone https://github.com/voutcn/megahit.git
cd megahit
git submodule update --init
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release  # add -DCMAKE_INSTALL_PREFIX=MY_PREFIX if needed
make -j4
make simple_test  # will test MEGAHIT with a toy dataset
# make install if needed

Usage

Basic usage

megahit -1 pe_1.fq -2 pe_2.fq -o out  # 1 paired-end library
megahit --12 interleaved.fq -o out # one paired & interleaved paired-end library
megahit -1 a1.fq,b1.fq,c1.fq -2 a2.fq,b2.fq,c2.fq -r se1.fq,se2.fq -o out # 3 paired-end libraries + 2 SE libraries
megahit_core contig2fastg 119 out/intermediate_contigs/k119.contig.fa > k119.fastg # get FASTG from the intermediate contigs of k=119

The contigs can be found final.contigs.fa in the output directory.

Advanced usage

To see the full manual, run megahit without parameters or with -h.

Also, our wiki may be helpful.

Examples

There is an example of a real assembly on this wiki page and its corresponding interactive Jupyter notebook. Contributed by Noah Lebovic (@lebovic).

Publications

License

This project is licensed under the GPLv3 License - see the LICENSE file for details