Open Global Infrastructure Risk/Resilience Analysis

Introduction

This open-source snakemake workflow can be used to analyse environmental risks to infrastructure networks using global open data. It is a work in progress.

Goals:

Non-goals:

Installation

Install open-gira by cloning the repository:

git clone https://github.com/nismod/open-gira.git

The repository comes with an environment.yml file describing the conda and PyPI packages required to run open-gira. The open-gira developers recommend using either micromamba or mamba to install and manage these conda packages.

Having installed one of the suggested package managers, create the open-gira conda environment:

micromamba create -f environment.yml -y

And to activate the environment:

micromamba activate open-gira

Utilities

Some rules use the wget utility to download files.

On Linux or macOS, you may already have the wget utility available. If not, you can usually install it with your system package manager (e.g. apt, MacPorts, brew), or else using micromamba:

micromamba install wget

On Windows, you may already have wget if you have a MinGW or Cygwin installation. If not, you can access binaries at eternallybored.org. Download the standalone exe and place it somewhere on your PATH, for example in C:\Users\username\bin.

exactextract is used for zonal statistics in the tropical cyclones / electricity grid analysis. It is not available via the conda package management ecosystem and so must be installed separately. Please see the exactextract installation instructions.
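Before requesting results, it can be useful to confirm that these external utilities are discoverable. The following is a minimal sketch using only Python's standard library. It assumes the executables are named wget and exactextract on your PATH; the helper name missing_tools is hypothetical and not part of open-gira.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found as executables on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust if your installation differs.
    missing = missing_tools(["wget", "exactextract"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external utilities found.")
```

Run the check with the open-gira environment activated, so that tools installed via micromamba are also visible.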

You are now ready to request result files, triggering analysis jobs in the process.

Note that all subsequent commands given in the documentation assume that the open-gira environment is already activated.

Tests

Workflow steps are tested using small sample datasets.

To run the tests:

python -m pytest tests

Usage

open-gira comprises a set of snakemake rules which call scripts and library code to request data, process it and produce results.

The key idea of snakemake is similar to make in that the workflow is determined from the end (the files users want) to the beginning (the files users have, if any) by applying general rules with pattern matching on file and folder names.
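To make the "work backwards from the target" idea concrete, here is a toy planner in plain Python. It is only a sketch of the general principle, not how snakemake is implemented: real snakemake also considers file timestamps, rule ambiguity and much more. The Rule class, the plan function and all file names are made up for illustration.

```python
import re
from typing import Callable, Dict, List, NamedTuple, Set


class Rule(NamedTuple):
    name: str
    output: str  # regular expression; named groups play the role of wildcards
    inputs: Callable[[Dict[str, str]], List[str]]


def plan(target: str, rules: List[Rule], existing: Set[str]) -> List[str]:
    """Work backwards from `target`, returning jobs in the order they must run."""
    if target in existing:
        return []  # nothing to do: the file is already there
    for rule in rules:
        match = re.fullmatch(rule.output, target)
        if match is None:
            continue
        jobs: List[str] = []
        # Plan each input first, so prerequisites come before this job.
        for dependency in rule.inputs(match.groupdict()):
            for job in plan(dependency, rules, existing):
                if job not in jobs:
                    jobs.append(job)
        jobs.append(f"{rule.name} -> {target}")
        return jobs
    raise ValueError(f"no rule produces {target!r}")


# Toy rules with made-up file names (not open-gira's real rules).
RULES = [
    Rule("sort", r"out/(?P<name>[^/]+)\.sorted", lambda w: [f"raw/{w['name']}.txt"]),
    Rule("fetch", r"raw/(?P<name>[^/]+)\.txt", lambda w: []),
]

if __name__ == "__main__":
    for job in plan("out/cities.sorted", RULES, existing=set()):
        print(job)
```

Asking for out/cities.sorted with no files present plans the fetch job first and the sort job second; if raw/cities.txt already exists, only the sort job is planned.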

An example invocation looks like:

snakemake --cores 2 -- results/wales-latest_filter-road-primary/edges.gpq

Here, we ask snakemake to use up to 2 CPUs to produce a target file, in this case, the edges of the Welsh road network. snakemake pattern matches wales-latest as the OSM dataset name and road-primary as the network type we want to filter for, picking up the filter expressions as defined in config/osm_filters/road-primary.txt.
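The pattern matching described above can be imitated with a regular expression. This sketch assumes an output pattern inferred only from the example target; the actual rule definitions in open-gira may differ, and the names TARGET, parse_target and filter_file are hypothetical.

```python
import re
from typing import Dict, Optional

# Hypothetical output pattern, inferred only from the example target above.
TARGET = re.compile(
    r"results/(?P<dataset>[^/]+)_filter-(?P<network>[^/]+)/edges\.gpq"
)


def parse_target(path: str) -> Optional[Dict[str, str]]:
    """Extract the dataset and network type from a target path, if it matches."""
    match = TARGET.fullmatch(path)
    return match.groupdict() if match else None


def filter_file(network: str) -> str:
    """Build the path of the filter expressions file for a network type."""
    return f"config/osm_filters/{network}.txt"


if __name__ == "__main__":
    wildcards = parse_target("results/wales-latest_filter-road-primary/edges.gpq")
    print(wildcards)
    if wildcards is not None:
        print(filter_file(wildcards["network"]))
```

For the example target, the dataset group captures wales-latest and the network group captures road-primary, which in turn points at config/osm_filters/road-primary.txt.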

To check what work we're going to request before commencing, use the -n flag:

snakemake -n --cores 2 -- results/wales-latest_filter-road-primary/edges.gpq

This performs a dry run: it lists the rules that would need to run to produce the target file, without executing them. It may be helpful to visualise which rules are expected to run, too.

The workflow configuration details are in config/config.yml. You can edit this to set the target OSM infrastructure datasets, number of spatial slices, and hazard datasets.

See the documentation and config/README.md for more details on usage in general and on configuration.

Documentation

Documentation is written in markdown and built with mdbook; the source files are in the ./docs directory.

Follow the mdbook installation instructions to get the mdbook command-line tool.

To build the docs locally:

cd docs
mdbook build
open book/index.html

Alternatively, run mdbook serve to start a local server that rebuilds the docs as you make changes.

Related projects

Two libraries have been developed in tandem with open-gira and provide some key functionality.

snail

The open-source Python library snail is used for vector-raster intersection, e.g. identifying which road segments might be affected by a set of flood map hazard rasters.
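To illustrate what vector-raster intersection means, the sketch below splits a straight segment wherever it crosses a grid line and reports the cell and length of each piece. It does not use snail and makes no claim about snail's API or algorithm; it assumes a square grid with its origin at (0, 0), and the function name split_by_grid is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]  # (column, row) index of a grid cell


def split_by_grid(
    start: Point, end: Point, cell_size: float = 1.0
) -> List[Tuple[Cell, float]]:
    """Split a segment at grid lines; return (cell, length) for each piece."""
    (x0, y0), (x1, y1) = start, end
    # Fractions along the segment (0 = start, 1 = end) where it crosses a grid line.
    cuts = {0.0, 1.0}
    for a0, a1 in ((x0, x1), (y0, y1)):
        if a0 == a1:
            continue  # segment is parallel to this set of grid lines
        low, high = min(a0, a1), max(a0, a1)
        k = math.floor(low / cell_size) + 1  # first grid line strictly after `low`
        while k * cell_size < high:
            cuts.add((k * cell_size - a0) / (a1 - a0))
            k += 1
    ordered = sorted(cuts)
    total_length = math.hypot(x1 - x0, y1 - y0)
    pieces = []
    for t_a, t_b in zip(ordered, ordered[1:]):
        # The midpoint of a piece lies strictly inside a single cell.
        t_mid = (t_a + t_b) / 2
        column = math.floor((x0 + t_mid * (x1 - x0)) / cell_size)
        row = math.floor((y0 + t_mid * (y1 - y0)) / cell_size)
        pieces.append(((column, row), (t_b - t_a) * total_length))
    return pieces


if __name__ == "__main__":
    for cell, length in split_by_grid((0.5, 0.5), (2.5, 0.5)):
        print(cell, length)
```

Each piece can then be paired with the hazard value stored for its cell, for example a flood depth, to estimate how much of a road segment is exposed.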

snkit

The snkit library is used for network cleaning and assembly.

Acknowledgments

This research received funding from the FCDO Climate Compatible Growth Programme. The views expressed here do not necessarily reflect the UK government's official policies.

This research has also been supported by funding from the World Bank Group, and the UK Natural Environment Research Council (NERC) through the UK Centre for Greening Finance and Investment (CGFI).