Mapping of anti-flu serum against the Perth/2009 H3 HA

Mutational antigenic profiling of Perth/2009 H3 HA codon mutant libraries against ferret and human sera.

This repository contains the computer code and raw data for the study "Mapping person-to-person variation in viral mutations that escape polyclonal serum targeting influenza hemagglutinin" (eLife, 2019).

Study led by Juhye Lee and Jesse Bloom.

Quick summary

Running the analysis

Automated steps

The main analysis is performed primarily by a series of Jupyter notebooks and Python scripts:

  1. analyze_map.ipynb: analyzes mutational antigenic profiling

  2. analyze_neut.ipynb: analyzes neutralization assays

  3. analyze_natseqs.ipynb: analyzes changes in amino-acid frequencies among natural sequences

  4. parameterize_map_on_struct.py: parameterizes the template Jupyter notebook map_on_struct_template.ipynb to show structures for each type of sera.

To run the four steps above, execute the bash script run.bash with:

./run.bash

On the Hutch cluster, you can also submit this script using slurm with:

sbatch -p largenode -c 16 --mem=100000 run.bash
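If only one of the notebooks needs to be re-run, it can be executed without the full pipeline. The sketch below is not taken from run.bash, whose contents are not shown here; it assumes that the notebooks sit in the repository root and that `jupyter nbconvert` is installed. The helper names `nbconvert_command` and `run_notebooks` are hypothetical.

```python
import subprocess

# Notebook names come from the list of automated steps above.
# Their location in the repository root is an assumption.
NOTEBOOKS = [
    "analyze_map.ipynb",
    "analyze_neut.ipynb",
    "analyze_natseqs.ipynb",
]


def nbconvert_command(notebook):
    """Build a `jupyter nbconvert` command that executes a notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        "--ExecutePreprocessor.timeout=-1",  # disable the per-cell timeout
        str(notebook),
    ]


def run_notebooks(notebooks=NOTEBOOKS, dry_run=True):
    """Run notebooks in order; with dry_run=True only return the commands."""
    commands = [nbconvert_command(nb) for nb in notebooks]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first notebook that fails
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in run_notebooks(dry_run=True):
        print(" ".join(cmd))
```

The dry-run default prints the commands so they can be inspected before anything is executed; pass `dry_run=False` to actually run them.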

Manual steps

The following steps must be performed manually to finalize the paper figures:

  1. The automated steps above create Jupyter notebooks that map the immune selection onto the structure using dms_struct (which is a wrapper around nglview). These notebooks are in results/notebooks with names matching map_on_struct_*.ipynb. They can be opened interactively with mybinder, or opened directly as interactive apps in appmode via the links in the Quick summary section at the top of this README. To generate static protein structure images for the final figures, you also need to run each notebook locally and interactively, cell by cell, giving each structure time to render.

  2. Run the Jupyter notebook make_final_figs.ipynb to generate the final figures for the paper, which are placed in ./results/figures/final.
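Before starting the manual steps, it can help to check which structure notebooks the automated steps actually produced. The following is a minimal sketch, assuming it is run from the repository root; the function name `find_struct_notebooks` is hypothetical and not part of the repository.

```python
from pathlib import Path


def find_struct_notebooks(results_dir="results"):
    """Return the sorted map_on_struct_*.ipynb notebooks under results/notebooks."""
    notebook_dir = Path(results_dir) / "notebooks"
    # glob yields nothing if the directory is missing, so the result is []
    return sorted(notebook_dir.glob("map_on_struct_*.ipynb"))


if __name__ == "__main__":
    notebooks = find_struct_notebooks()
    if not notebooks:
        print("No map_on_struct_*.ipynb notebooks found; run the automated steps first.")
    for nb in notebooks:
        # Each of these must be opened and run interactively, cell by cell.
        print(nb)
```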

Configuring the analysis

The configuration for the analysis is in a separate file, config.yaml. This file defines key variables for the analysis, and should be self-explanatory. The config.yaml file points to several files in the ./data/ subdirectory that specify essential data for the analysis:

Results

Results are placed in the ./results/ subdirectory. Many of the results files are not tracked in this GitHub repo since they are very large. However, the following results are tracked:

Other subdirectories

Other subdirectories in the repo are: