Meta-learning for crop mapping

This repository contains the implementation of "Learning to predict crop type from heterogeneous sparse labels using meta-learning", published at the EarthVision workshop at CVPR 2021.

Pipeline

The main entry points into the pipeline are the following scripts:

scripts/export.py exports data (locally, or to Google Drive, depending on what is being exported)
scripts/process.py processes the raw data
scripts/engineer.py combines the earth observation data with the labels to create (x, y) training data
scripts/maml.py trains the MAML model
scripts/test.py tests the trained MAML model by finetuning it on the test datasets
scripts/ensemble.py takes weights saved by test.py and ensembles them to create maps
scripts/pretrain.py trains a model on all data, for a transfer learning baseline
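
The ensembling step can be illustrated with a minimal sketch. It assumes that each finetuned model produces a per-pixel crop probability map and that the ensemble is a simple mean over models; this is one common approach and not necessarily what scripts/ensemble.py does. The function name, the 0.5 threshold, and the toy values are hypothetical.

```python
import numpy as np


def ensemble_predictions(member_probs, threshold=0.5):
    """Average per-pixel crop probabilities from several finetuned models.

    member_probs: array-like of shape (n_models, height, width), values in [0, 1].
    Returns the mean probability map and a boolean crop mask.
    NOTE: mean-averaging and the threshold are assumptions for illustration.
    """
    stacked = np.asarray(member_probs, dtype=float)
    if stacked.ndim != 3:
        raise ValueError("expected shape (n_models, height, width)")
    mean_probs = stacked.mean(axis=0)
    return mean_probs, mean_probs >= threshold


# Toy example: three hypothetical models scoring a 2x2 tile.
probs = np.array([
    [[0.9, 0.2], [0.6, 0.1]],
    [[0.8, 0.4], [0.4, 0.2]],
    [[0.7, 0.3], [0.8, 0.0]],
])
mean_map, crop_mask = ensemble_predictions(probs)
print(mean_map)
print(crop_mask)
```

Averaging probabilities rather than hard labels keeps information about how confident each ensemble member is, which matters when the members were finetuned on different bootstrapped samples.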

Two crop type maps created using few positive labelled points are available on Google Earth Engine:

Replicating experiments in the paper

Note: not all datasets used are public, so results cannot be exactly replicated.

Download the LEM+ dataset, and save it in data/raw/lem_brazil
Export the GeoWiki labels, by running export_geowiki in scripts/export.py
Process all the labels, by running scripts/process.py
Export the Sentinel Earth Engine tif files by running the other functions in scripts/export.py
Combine the labels and raw satellite imagery into (X, y) training data by running scripts/engineer.py
Train the MAML model by running scripts/maml.py. The MAML model and training results will be saved in data/maml_models/version_<VERSION>, where VERSION increments for each MAML run.
Finetune 10 MAML models with the following commands, bootstrapping the training data each run (adding --test_mode {pretrained, random} will train the baseline models instead):

python maml_test.py --version <VERSION> --dataset Togo --many_n --num_cv 10  # Finetune on the Togo data across varying sample sizes
python maml_test.py --version <VERSION> --dataset coffee --num_samples {-1, 40} --num_cv 10  # Finetune on the coffee dataset for all negative samples, or 20 positive and 20 negative samples
python maml_test.py --version <VERSION> --dataset common_beans --num_samples {-1, 64} --num_cv 10  # Finetune on the common beans dataset for all negative samples, or 32 positive and 32 negative samples
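
To make the idea of bootstrapping a balanced training subset for each run concrete, here is a minimal sketch. It assumes binary labels and sampling with replacement within each class; the function name, its parameters, and the toy label counts are hypothetical and are not taken from the repository.

```python
import numpy as np


def balanced_bootstrap(labels, n_per_class, seed):
    """Return indices for one bootstrapped, class-balanced training subset.

    labels: 1D array-like of 0/1 labels.
    n_per_class: number of positive and of negative samples to draw.
    Sampling is done with replacement (an assumption about the bootstrap).
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    positives = np.flatnonzero(labels == 1)
    negatives = np.flatnonzero(labels == 0)
    return np.concatenate([
        rng.choice(positives, size=n_per_class, replace=True),
        rng.choice(negatives, size=n_per_class, replace=True),
    ])


# Ten runs (mirroring --num_cv 10), each with its own seed,
# drawing 20 positive and 20 negative samples (mirroring --num_samples 40).
labels = np.array([1] * 30 + [0] * 200)
runs = [balanced_bootstrap(labels, n_per_class=20, seed=s) for s in range(10)]
print([len(run) for run in runs])
```

Using a different seed per run gives each finetuned model a different view of the training data, which is what makes the resulting models useful as ensemble members.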

Setup

Anaconda running Python 3.6 is used as the package manager. To set up an environment, install Anaconda and (from this directory) run

conda env create -f environment.yml

This will create an environment named landcover-mapping with all the necessary packages to run the code. To activate this environment, run

conda activate landcover-mapping

Earth Engine

Earth Engine is used instead of Sentinel Hub because it is free. To use it, once the conda environment has been activated, run

earthengine authenticate

and follow the instructions. To test that everything has worked, run

python -c "import ee; ee.Initialize()"
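
If a more informative check is useful, the one-liner above can be wrapped in a small helper that reports why initialization failed. This is only a sketch: the function name is hypothetical, and catching every exception is a deliberate simplification, since both a missing package and missing credentials should be reported the same way here.

```python
def earth_engine_ready():
    """Return True if the Earth Engine client imports and initializes."""
    try:
        import ee  # provided by the earthengine-api package

        ee.Initialize()
    except Exception as error:  # missing package, missing credentials, ...
        print(f"Earth Engine is not ready: {error}")
        return False
    return True


if __name__ == "__main__":
    print("Earth Engine ready:", earth_engine_ready())
```

If this prints a credentials error, rerunning earthengine authenticate is the first thing to try.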

Note that Earth Engine exports files to Google Drive by default (to the same Google account used to sign up for Earth Engine).

Running exports can be viewed (and individually cancelled) in the Tasks tab of the Earth Engine Code Editor. For additional support, the Google Earth Engine forum is super helpful.

Tests

The following tests can be run against the pipeline:

pytest  # unit tests, written in the test folder
black .  # code formatting

Reference

If you find this code useful, please cite the following paper:

@InProceedings{Tseng_2021_CVPR,
    author    = {Tseng, Gabriel and Kerner, Hannah and Nakalembe, Catherine and Becker-Reshef, Inbal},
    title     = {Learning To Predict Crop Type From Heterogeneous Sparse Labels Using Meta-Learning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2021},
    pages     = {1111-1120}
}