MMFlood: A Multimodal Dataset for Flood Delineation from Satellite Imagery.

Code and data access for the MMFlood dataset.

Last update: 05-2022

Dataset Access and Specifications

You can download the MMFlood dataset:

Structure

The dataset is organized in directories, with a JSON file providing metadata and other information such as the split configuration we selected. Its internal structure is as follows:

activations/
├─ EMSR107-1/
├─ .../
├─ EMSR548-0/
│  ├─ DEM/
│  │  ├─ EMSR548-0-0.tif
│  │  ├─ EMSR548-0-1.tif
│  │  ├─ ...
│  ├─ hydro/
│  │  ├─ EMSR548-0-0.tif
│  │  ├─ EMSR548-0-1.tif
│  │  ├─ ...
│  ├─ mask/
│  │  ├─ EMSR548-0-0.tif
│  │  ├─ EMSR548-0-1.tif
│  │  ├─ ...
│  ├─ s1_raw/
│  │  ├─ EMSR548-0-0.tif
│  │  ├─ EMSR548-0-1.tif
│  │  ├─ ...
activations.json
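Since every modality folder in the tree above reuses the same file names, the images belonging to one sample can be matched by name. The snippet below is a minimal sketch using only the standard library. The helper `collect_samples` is hypothetical (it is not part of the repository), and it assumes that each activation folder contains the four subfolders shown above.

```python
from pathlib import Path

# Modality folders as they appear in the directory tree above.
MODALITIES = ("s1_raw", "DEM", "hydro", "mask")


def collect_samples(activation_dir):
    """Return {file name: {modality: path}} for one activation folder.

    Hypothetical helper: only file names present in every modality
    folder are kept, so incomplete samples are silently skipped.
    """
    activation_dir = Path(activation_dir)
    # Map each modality to {file name: full path}.
    per_modality = {
        m: {p.name: p for p in (activation_dir / m).glob("*.tif")}
        for m in MODALITIES
    }
    # Keep only the names shared by all modalities.
    common = set.intersection(*(set(files) for files in per_modality.values()))
    return {
        name: {m: per_modality[m][name] for m in MODALITIES}
        for name in sorted(common)
    }
```

For example, `collect_samples("activations/EMSR548-0")` would return one entry per `.tif` name, each pointing to its Sentinel-1, DEM, hydrography and mask files.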

Finally, activations.json contains information about each EMS activation, as extracted from the Copernicus Rapid Mapping site, for example:

{
    "EMSR107": {
        ...
    },
    "EMSR548": {
        "title": "Flood in Eastern Sicily, Italy",
        "type": "Flood",
        "country": "Italy",
        "start": "2021-10-27T11:31:00",
        "end": "2021-10-28T12:35:19",
        "lat": 37.435056244442684,
        "lon": 14.954437192250033,
        "subset": "test",
        "delineations": [
            "EMSR548_AOI01_DEL_PRODUCT_r1_VECTORS_v1_vector.zip"
        ]
    }
}
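The `subset` field makes it possible to recover the split configuration from the metadata file. The following is a minimal sketch, not the repository's own loader. The function names are hypothetical, entries without a `subset` key are grouped under `"unknown"` as an assumption, and the mapping from folder names such as `EMSR548-0` to JSON keys such as `EMSR548` is inferred from the examples in this document only.

```python
import json
from collections import defaultdict


def load_activations(path):
    """Read activations.json into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def group_by_subset(activations):
    """Group activation codes by the value of their "subset" field."""
    groups = defaultdict(list)
    for code, info in activations.items():
        # Assumption: a missing "subset" key is reported as "unknown".
        groups[info.get("subset", "unknown")].append(code)
    return dict(groups)


def activation_key(folder_name):
    """Map a folder name like "EMSR548-0" to its JSON key "EMSR548".

    Assumption: the part after the last dash is an area index.
    """
    return folder_name.rsplit("-", 1)[0]
```

A typical call would be `group_by_subset(load_activations("activations.json"))`, which returns a dictionary from subset name to the list of activation codes assigned to it.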

Data specifications

Image    | Description                                           | Format          | Bands
---------|-------------------------------------------------------|-----------------|-------------
S1 raw   | Georeferenced Sentinel-1 imagery, IW GRD              | GeoTIFF Float32 | 0: VV, 1: VH
DEM      | MapZen Digital Elevation Model                        | GeoTIFF Float32 | 0: elevation
Hydrogr. | Binary map of permanent water basins, OSM             | GeoTIFF Uint8   | 0: hydro
Mask     | Manually validated ground truth label, Copernicus EMS | GeoTIFF Uint8   | 0: gt

Image metadata

Every image also contains the following contextual information, as GDAL metadata tags:

<GDALMetadata>
  <Item name="acquisition_date">2021-10-31T16:56:28</Item>
  <Item name="code">EMSR548-0</Item>
  <Item name="country">Italy</Item>
  <Item name="event_date">2021-10-27T11:31:00</Item>
</GDALMetadata>
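Libraries such as GDAL or rasterio usually expose these tags directly. When only the raw XML block is available, it can also be parsed with the standard library. The sketch below reuses the example block above; `parse_gdal_metadata` is a hypothetical helper, and the delay computation is just one illustration of what the two dates allow.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Metadata block copied from the example above.
GDAL_XML = """<GDALMetadata>
  <Item name="acquisition_date">2021-10-31T16:56:28</Item>
  <Item name="code">EMSR548-0</Item>
  <Item name="country">Italy</Item>
  <Item name="event_date">2021-10-27T11:31:00</Item>
</GDALMetadata>"""


def parse_gdal_metadata(xml_text):
    """Return the <Item> entries of a GDALMetadata block as a dict."""
    root = ET.fromstring(xml_text)
    return {item.get("name"): (item.text or "") for item in root.iter("Item")}


tags = parse_gdal_metadata(GDAL_XML)
event = datetime.fromisoformat(tags["event_date"])
acquired = datetime.fromisoformat(tags["acquisition_date"])
delay = acquired - event  # time elapsed between event date and acquisition date
print(tags["code"], delay)  # prints: EMSR548-0 4 days, 5:25:28
```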

Code and installation

To run this code, clone the repository into a directory of your choice, then create and activate a Python virtual environment:

git clone git@github.com:edornd/mmflood.git && cd mmflood
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Every command goes through the run.py script. Run python run.py --help for more information about the available commands and their arguments.

Data preparation

To prepare the raw data by tiling and preprocessing, you can run:

python run.py prepare --data-source [PATH_TO_ACTIVATIONS] --data-processed [DESTINATION]

Training

Training uses HuggingFace Accelerate to provide single-GPU and multi-GPU support. To launch on a single GPU:

CUDA_VISIBLE_DEVICES=... python run.py train [ARGS]

You can find an example script with parameters in the scripts folder.

Testing

Testing runs on non-tiled images (the preprocessing step formats them without tiling). You can run the test on a single GPU using the test command. At a minimum, you need to point the script to the output directory with --data-root. If no checkpoint is provided through --checkpoint-path, the best one (according to the monitored metric) is selected automatically. To skip storing the predictions, pass --no-store-predictions.

CUDA_VISIBLE_DEVICES=... python run.py test --data-root [PATH_TO_OUTPUT_DIR] [--checkpoint-path [PATH]]

Data Attribution and Licenses

For the realization of this project, the following data sources were used: