Split K-mer Analysis (version 2) <img src='ska_logo.png' align="right" height="100" />


Description

This is a reimplementation of the SKA package in the Rust language, by Johanna von Wachsmann, Simon Harris and John Lees. We are also grateful to have received user contributions from:

Split k-mer analysis (version 2) uses exact matching of split k-mer sequences to align closely related sequences, typically small haploid genomes such as bacteria and viruses.
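To make the idea of a split k-mer concrete, here is a minimal sketch that enumerates them for a short sequence. It is an illustration only, not SKA's implementation: the `SplitKmer` struct and `split_kmers` function are hypothetical names, only the forward strand is considered, and no bit-packing or canonical (reverse-complement) handling is attempted.

```rust
/// One split k-mer: two flanking arms and the middle base between them.
/// (Hypothetical type for illustration; not part of the ska crate.)
#[derive(Debug, PartialEq, Eq)]
struct SplitKmer<'a> {
    left: &'a str,
    middle: char,
    right: &'a str,
}

/// Enumerate split k-mers of total length `k` (odd), forward strand only.
fn split_kmers(seq: &str, k: usize) -> Vec<SplitKmer<'_>> {
    assert!(k >= 3 && k % 2 == 1, "k must be odd and at least 3");
    assert!(seq.is_ascii(), "sequence must be ASCII");
    let arm = (k - 1) / 2; // length of each flanking arm
    let mut out = Vec::new();
    if seq.len() < k {
        return out; // sequence too short to hold a single k-mer
    }
    for start in 0..=(seq.len() - k) {
        let window = &seq[start..start + k];
        out.push(SplitKmer {
            left: &window[..arm],
            middle: window.as_bytes()[arm] as char,
            right: &window[arm + 1..],
        });
    }
    out
}

fn main() {
    // Print each split k-mer as left-middle-right.
    for sk in split_kmers("ACGTACG", 5) {
        println!("{}-{}-{}", sk.left, sk.middle, sk.right);
    }
}
```

In this toy model, two sequences are compared by looking up identical flanks and reading off the middle base, which is where exact matching comes in.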

SKA can only align SNPs that are further apart than the k-mer length, and it does not use a gap-penalty approach or give alignment scores. Its advantages are speed and flexibility, particularly the ability to run in a reference-free manner (i.e. including accessory genome variation) on both assemblies and reads.
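The limitation on closely spaced SNPs can be seen in a small sketch. The code below is a simplified model under stated assumptions, not SKA's actual algorithm: `split_kmer_map` and `count_snps` are hypothetical helpers, only the forward strand is used, and if a flank pair repeats within a sequence the last occurrence simply wins. A SNP is counted when both sequences share the same flanks but differ at the middle base.

```rust
use std::collections::HashMap;

/// Map each (left arm, right arm) pair to its middle base.
/// Forward strand only; repeated flank pairs are overwritten (a simplification).
fn split_kmer_map(seq: &str, k: usize) -> HashMap<(&str, &str), u8> {
    assert!(k >= 3 && k % 2 == 1, "k must be odd and at least 3");
    assert!(seq.is_ascii(), "sequence must be ASCII");
    let arm = (k - 1) / 2;
    let mut map = HashMap::new();
    if seq.len() >= k {
        for start in 0..=(seq.len() - k) {
            let w = &seq[start..start + k];
            map.insert((&w[..arm], &w[arm + 1..]), w.as_bytes()[arm]);
        }
    }
    map
}

/// Count flank pairs present in both sequences whose middle bases differ.
fn count_snps(a: &str, b: &str, k: usize) -> usize {
    let map_a = split_kmer_map(a, k);
    let map_b = split_kmer_map(b, k);
    let mut snps = 0;
    for (flanks, mid_a) in &map_a {
        if let Some(mid_b) = map_b.get(flanks) {
            if mid_a != mid_b {
                snps += 1;
            }
        }
    }
    snps
}

fn main() {
    let reference = "AAACCCGTTTGGG";
    let one_snp = "AAACCCTTTTGGG"; // single substitution, G to T
    let two_adjacent = "AAACCCTATTGGG"; // two substitutions side by side

    println!("isolated SNP:  {}", count_snps(reference, one_snp, 7));
    println!("adjacent SNPs: {}", count_snps(reference, two_adjacent, 7));
}
```

In this model the isolated substitution is found because its flanks still match exactly, while the two adjacent substitutions disrupt each other's flanks, so neither is called.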

Citation

Romain Derelle, Johanna von Wachsmann, Tommi Mäklin, Joel Hellewell, Timothy Russell, Ajit Lalvani, Leonid Chindelevitch, Nicholas J. Croucher, Simon R. Harris, John A. Lees (2024). Seamless, rapid, and accurate analyses of outbreak genomic data using split k-mer analysis. Genome Research, 34(10), 1661–1673.

https://genome.cshlp.org/content/34/10/1661.abstract

Documentation

Documentation can be found at https://docs.rs/ska. We also have some tutorials available:

Installation

Choose from:

  1. Download a binary from the releases.
  2. Use cargo install ska or cargo add ska.
  3. Use conda install -c bioconda ska2 (note the 2 in the package name).
  4. Build from source (see below).

For 2) or 4) you must have the Rust toolchain installed.

OS X users

If you have an M1/M2 (arm64) Mac, we do not currently build binaries automatically, so we recommend option 2) or 4) for best performance.

If you get a message saying the binary isn't signed by Apple and can't be run, use the following command to bypass this:

xattr -d "com.apple.quarantine" ./ska

Build from source

  1. Clone the repository with git clone.
  2. Run cargo install --path . or, to optimise for your machine, RUSTFLAGS="-C target-cpu=native" cargo install --path .

Differences from SKA1

Optimisations include:

And other improvements:

Together, these make ska.rust run faster than the original, with smaller files and a smaller memory footprint.

Planned features

Feature ideas (not definitely planned)

Things you can no longer do