The PhD repo
Qualifying Exam
The exam took place on 2019-04-17.
Dissertation
Uses aggiedown for the document and GitHub Actions for CI. Tagged versions are available on the Releases page.
Experiments
smol gather
Comparison of containment approaches using MinHash:
- CMash (containment minhash)
- mash screen
- smol (scaled minhash)
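For readers unfamiliar with the idea behind the last approach, here is a minimal sketch of containment estimation with a scaled MinHash. It is an illustration only, not the code used in these experiments: the function names, the choice of BLAKE2b as the hash, and the default `k` and `scaled` values are all assumptions made for this example.

```python
import hashlib

MAX_HASH = 2**64  # size of the 64-bit hash space assumed in this sketch


def kmers(seq, k):
    """Yield all overlapping k-mers of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (illustrative hash, an assumption)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def scaled_minhash(seq, k=21, scaled=1000):
    """Keep every hash below MAX_HASH // scaled.

    Roughly 1 in `scaled` distinct k-mers is retained, so the sketch
    grows with the data instead of having a fixed size.
    """
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def containment(query, subject):
    """Fraction of the query's hashes that also appear in the subject."""
    if not query:
        return 0.0
    return len(query & subject) / len(query)
```

Because both sketches use the same threshold, the sketch of a subsequence is always a subset of the sketch of the full sequence, which is what makes the containment estimate possible without storing every k-mer.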
To regenerate the results (after running the steps in Setup):
conda activate thesis
cd experiments/smol_gather && snakemake --use-conda
Scaled MinHash sizes
Analysis of Scaled MinHash sizes (number of hashes) across domains in GenBank.
Inverted index and shared hashes
Analyzing unique and shared hashes in an inverted index.
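As a rough illustration of what this analysis looks at, the sketch below builds an inverted index from hashes to the datasets containing them, then separates hashes found in a single dataset from those found in several. The function names and the toy input are hypothetical and are not taken from the experiment code.

```python
from collections import defaultdict


def build_inverted_index(sketches):
    """Map each hash to the set of dataset names whose sketch contains it."""
    index = defaultdict(set)
    for name, hashes in sketches.items():
        for h in hashes:
            index[h].add(name)
    return index


def split_unique_shared(index):
    """Partition hashes into those seen in one dataset and those seen in several."""
    unique = {h for h, names in index.items() if len(names) == 1}
    shared = {h for h, names in index.items() if len(names) > 1}
    return unique, shared


# Toy sketches (made-up hash values) keyed by dataset name.
sketches = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 3, 5}}
index = build_inverted_index(sketches)
unique, shared = split_unique_shared(index)
```

In this toy input, hash 3 appears in all three datasets and hash 1 in two, so both are shared; the remaining hashes are unique to one dataset each.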
Setup
All processing and analysis scripts were run in the conda environment specified in environment.yml.
To build and activate this environment, run:
conda env create --force --file environment.yml
conda activate thesis