DFC2021 MSD Baseline

Jump to: Baselines | Running experiments | Results | Visualizations

This repo contains implementations of several baselines for the "Multitemporal Semantic Change Detection" (MSD) track of the 2021 IEEE GRSS Data Fusion Competition (DFC2021). See the CodaLab page for more information about the competition, including the current leaderboard!

If you make use of this implementation in your own project or want to refer to it in a scientific publication, please consider referencing this GitHub repository and citing our paper:

@Article{malkinDFC2021,
  author  = {Kolya Malkin and Caleb Robinson and Nebojsa Jojic},
  title   = {High-resolution land cover change from low-resolution labels: Simple baselines for the 2021 IEEE GRSS Data Fusion Contest},
  year    = {2021},
  journal = {arXiv:2101.01154}
}

Environment setup

The following will set up a conda environment suitable for running the scripts in this repo:

conda create -n dfc2021 "python=3.8"
conda activate dfc2021
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
conda install tifffile matplotlib pandas
pip install rasterio fiona segmentation-models-pytorch

# optional steps to install a jupyter notebook kernel for this environment
pip install ipykernel
python -m ipykernel install --user --name dfc2021
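
After installing, a quick check like the following can confirm that the packages are visible from the new environment. This is only a sketch: the list of import names is our reading of the install commands above (for example, `pytorch` is imported as `torch` and `segmentation-models-pytorch` as `segmentation_models_pytorch`), and the helper `missing_modules` is a hypothetical name, not part of this repo.

```python
import importlib.util

# Import names corresponding to the conda/pip packages installed above.
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "tifffile",
    "matplotlib",
    "pandas",
    "rasterio",
    "fiona",
    "segmentation_models_pytorch",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the `dfc2021` environment activated; it only looks the modules up and does not import them, so it will not catch a broken CUDA setup.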

Baselines

The accompanying arXiv paper compares U-Nets and small fully convolutional neural networks (FCNs) on the task of computing high-resolution land cover change given only low-resolution labels. We provide an implementation that reproduces the main results from this paper, along with supplemental results that compare training a single model on multiple years of data against training individual models for each year. We do not include the steps for reproducing the "FCN / tile" experiment; however, this experiment can be scripted using the same training/inference tools detailed below.
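
To make "land cover change" concrete, the sketch below derives per-class loss and gain masks from two land cover maps of the same area. It is an illustration under assumptions, not the repo's implementation: the class order is taken from the Results table, the integer indices are hypothetical (the real ones are defined in utils.py), and "loss"/"gain" are read as "was class c and no longer is" / "was not class c and now is".

```python
import numpy as np

# Assumed class order, taken from the Results table. The integer indices the
# repo actually uses (see utils.py) may differ.
CLASS_NAMES = ["Water", "Tree Canopy", "Low Vegetation", "Impervious"]


def loss_gain_masks(lc_t1, lc_t2):
    """Return {label: boolean mask} with one loss and one gain mask per class."""
    lc_t1 = np.asarray(lc_t1)
    lc_t2 = np.asarray(lc_t2)
    if lc_t1.shape != lc_t2.shape:
        raise ValueError("both land cover maps must have the same shape")
    masks = {}
    for index, name in enumerate(CLASS_NAMES):
        was_class = lc_t1 == index
        is_class = lc_t2 == index
        masks[f"{name} loss"] = was_class & ~is_class
        masks[f"{name} gain"] = ~was_class & is_class
    return masks


# Toy 2x2 maps: one pixel changes from tree canopy (1) to impervious (3).
lc_2013 = np.array([[0, 1], [2, 3]])
lc_2017 = np.array([[0, 3], [2, 3]])
masks = loss_gain_masks(lc_2013, lc_2017)
print(int(masks["Tree Canopy loss"].sum()), int(masks["Impervious gain"].sum()))
```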

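The 5 baseline methods we describe are:

  1. NLCD difference: uses the NLCD labels only, with no trained model.
  2. U-Net both: a single U-Net trained on both years of data.
  3. U-Net separate: one U-Net trained for each year (2013 and 2017).
  4. FCN both: a single FCN trained on both years of data.
  5. FCN separate: one FCN trained for each year (2013 and 2017).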

Note that there are multiple ways to use a model that predicts NLCD labels to generate land cover change predictions for the competition. We implement two in this training/inference pipeline, but only report results for the second method in the paper:

  1. (Hard assignment) Using a hard mapping between NLCD classes and reduced land cover classes. Here, each NLCD class is mapped to one of the 4 reduced land cover classes; see the competition details or utils.py for this mapping. This is the default behaviour in the inference.py and independent_pairs_to_predictions.py scripts.
  2. (Soft assignment) Using a soft mapping as described in the accompanying arXiv paper. To generate results with this method, use the --save_soft flag when running inference.py to save output files that contain quantized per-class probabilities, then use the --soft_assignment flag when running independent_pairs_to_predictions.py. NOTE: the outputs from this process will be much larger than those of the first method.
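
The difference between the two assignments can be sketched on a toy problem. Everything numeric here is hypothetical: the example uses 3 fine-grained classes and 2 reduced classes rather than the real NLCD classes, the mapping values are made up (the real hard mapping is in utils.py), and treating soft assignment as a matrix product followed by an argmax is our assumption about one reasonable way to do it, not a statement of what the paper or scripts do. The uint8 quantization is likewise only one possible reading of "quantized per-class probabilities".

```python
import numpy as np

# HYPOTHETICAL mappings for 3 fine classes and 2 reduced classes.
HARD_MAP = np.array([0, 0, 1])      # fine class index -> reduced class index
SOFT_MAP = np.array([[0.9, 0.1],    # row i: weight of each reduced class
                     [0.4, 0.6],    # given fine class i (rows sum to 1)
                     [0.0, 1.0]])


def hard_assignment(probs):
    """probs has shape (num_fine, H, W): argmax fine class, then table lookup."""
    return HARD_MAP[np.argmax(probs, axis=0)]


def soft_assignment(probs):
    """Project fine-class probabilities onto reduced classes, then argmax."""
    reduced_probs = np.einsum("chw,cr->rhw", probs, SOFT_MAP)
    return np.argmax(reduced_probs, axis=0)


def quantize(probs):
    """Store probabilities in [0, 1] as uint8 values in [0, 255]."""
    return np.round(probs * 255).astype(np.uint8)


def dequantize(quantized):
    """Recover approximate probabilities from their uint8 representation."""
    return quantized.astype(np.float32) / 255.0


# One pixel whose most likely fine class (index 1) hard-maps to reduced class 0,
# although most of the probability mass falls on reduced class 1.
probs = np.array([0.30, 0.45, 0.25]).reshape(3, 1, 1)
print(hard_assignment(probs)[0, 0], soft_assignment(probs)[0, 0])
```

Storing a full probability vector per pixel instead of a single class index is why the soft outputs are much larger than the hard ones.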

Running experiments

Each of the following subsections gives the set of commands needed to reproduce a CodaLab submission file for the described baseline methods.

NLCD difference baseline

conda activate dfc2021
python create_nlcd_only_baseline.py --output_dir results/nlcd_only_baseline/output/
python independent_pairs_to_predictions.py --input_dir results/nlcd_only_baseline/output/ --output_dir results/nlcd_only_baseline/submission/
cd results/nlcd_only_baseline/submission/
zip -9 -r ../nlcd_only_baseline.zip *.tif
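
On systems without the `zip` command, the packaging step can be approximated with Python's standard library. This is a sketch, not part of the repo: `zip_tifs` is a hypothetical helper, and unlike `zip`, which updates an existing archive, it overwrites the target file.

```python
import zipfile
from pathlib import Path


def zip_tifs(submission_dir, zip_path):
    """Write every *.tif in `submission_dir` to `zip_path` at the archive root."""
    submission_dir = Path(submission_dir)
    tif_paths = sorted(submission_dir.glob("*.tif"))
    if not tif_paths:
        raise FileNotFoundError(f"no .tif files found in {submission_dir}")
    # ZIP_DEFLATED with compresslevel=9 corresponds to `zip -9`.
    with zipfile.ZipFile(
        zip_path, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9
    ) as archive:
        for path in tif_paths:
            # arcname drops the directory prefix, as running zip from inside
            # the submission directory does.
            archive.write(path, arcname=path.name)
    return [path.name for path in tif_paths]
```

For example, `zip_tifs("results/nlcd_only_baseline/submission/", "results/nlcd_only_baseline/nlcd_only_baseline.zip")` would mirror the last two commands above when run from the repo root.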

U-Net both baseline

conda activate dfc2021
python train.py --input_fn data/splits/training_set_naip_nlcd_both.csv --output_dir results/unet_both_baseline/ --save_most_recent --num_epochs 10 2> /dev/null
python inference.py --input_fn data/splits/val_inference_both.csv --model_fn results/unet_both_baseline/most_recent_model.pt --output_dir results/unet_both_baseline/output/
python independent_pairs_to_predictions.py --input_dir results/unet_both_baseline/output/ --output_dir results/unet_both_baseline/submission/
cd results/unet_both_baseline/submission/
zip -9 -r ../unet_both_baseline.zip *.tif

U-Net separate baseline

conda activate dfc2021
python train.py --input_fn data/splits/training_set_naip_nlcd_2013.csv --output_dir results/unet_2013_baseline/ --save_most_recent --num_epochs 10 2> /dev/null
python train.py --input_fn data/splits/training_set_naip_nlcd_2017.csv --output_dir results/unet_2017_baseline/ --save_most_recent --num_epochs 10 2> /dev/null

python inference.py --input_fn data/splits/val_inference_2013.csv --model_fn results/unet_2013_baseline/most_recent_model.pt --output_dir results/unet_2013_baseline/output/
python inference.py --input_fn data/splits/val_inference_2017.csv --model_fn results/unet_2017_baseline/most_recent_model.pt --output_dir results/unet_2017_baseline/output/

mkdir -p results/unet_separate_baseline/output/
mkdir -p results/unet_separate_baseline/submission/
mv results/unet_2013_baseline/output/*.tif results/unet_separate_baseline/output/
mv results/unet_2017_baseline/output/*.tif results/unet_separate_baseline/output/

python independent_pairs_to_predictions.py --input_dir results/unet_separate_baseline/output/ --output_dir results/unet_separate_baseline/submission/
cd results/unet_separate_baseline/submission/
zip -9 -r ../unet_separate_baseline.zip *.tif
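
The two `mv` commands above silently overwrite files that share a name. If you are unsure whether the per-year outputs have distinct file names (we assume they do, but this README does not say), a more defensive merge can be sketched in Python. `merge_outputs` is a hypothetical helper, not part of the repo.

```python
import shutil
from pathlib import Path


def merge_outputs(source_dirs, dest_dir):
    """Move all *.tif files from `source_dirs` into `dest_dir` without overwriting."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    moved = []
    for source_dir in source_dirs:
        for path in sorted(Path(source_dir).glob("*.tif")):
            target = dest_dir / path.name
            if target.exists():
                raise FileExistsError(f"{target} already exists; refusing to overwrite")
            shutil.move(str(path), str(target))
            moved.append(path.name)
    return moved
```

For the U-Net separate baseline this would be called as `merge_outputs(["results/unet_2013_baseline/output/", "results/unet_2017_baseline/output/"], "results/unet_separate_baseline/output/")`.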

FCN both baseline

conda activate dfc2021
python train.py --input_fn data/splits/training_set_naip_nlcd_both.csv --output_dir results/fcn_both_baseline/ --save_most_recent --model fcn --num_epochs 10 2> /dev/null
python inference.py --input_fn data/splits/val_inference_both.csv --model_fn results/fcn_both_baseline/most_recent_model.pt --output_dir results/fcn_both_baseline/output/ --model fcn
python independent_pairs_to_predictions.py --input_dir results/fcn_both_baseline/output/ --output_dir results/fcn_both_baseline/submission/
cd results/fcn_both_baseline/submission/
zip -9 -r ../fcn_both_baseline.zip *.tif

FCN separate baseline

conda activate dfc2021
python train.py --input_fn data/splits/training_set_naip_nlcd_2013.csv --output_dir results/fcn_2013_baseline/ --save_most_recent --model fcn --num_epochs 10 2> /dev/null
python train.py --input_fn data/splits/training_set_naip_nlcd_2017.csv --output_dir results/fcn_2017_baseline/ --save_most_recent --model fcn --num_epochs 10 2> /dev/null

python inference.py --input_fn data/splits/val_inference_2013.csv --model_fn results/fcn_2013_baseline/most_recent_model.pt --output_dir results/fcn_2013_baseline/output/ --model fcn
python inference.py --input_fn data/splits/val_inference_2017.csv --model_fn results/fcn_2017_baseline/most_recent_model.pt --output_dir results/fcn_2017_baseline/output/ --model fcn

mkdir -p results/fcn_separate_baseline/output/
mkdir -p results/fcn_separate_baseline/submission/
mv results/fcn_2013_baseline/output/*.tif results/fcn_separate_baseline/output/
mv results/fcn_2017_baseline/output/*.tif results/fcn_separate_baseline/output/

python independent_pairs_to_predictions.py --input_dir results/fcn_separate_baseline/output/ --output_dir results/fcn_separate_baseline/submission/
cd results/fcn_separate_baseline/submission/
zip -9 -r ../fcn_separate_baseline.zip *.tif
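
The recipes above share one pattern (train, run inference, convert to predictions), so they lend themselves to scripting. The sketch below builds the three command lines for a "both"-style baseline using only the flags shown above; `baseline_commands` is a hypothetical helper, and it only prints the commands rather than running them. The `2> /dev/null` redirect is intentionally left out so that errors stay visible.

```python
import shlex


def baseline_commands(name, train_csv, inference_csv, model=None, num_epochs=10):
    """Build the train / inference / prediction commands for one baseline."""
    result_dir = f"results/{name}/"
    # The U-Net recipes above pass no --model flag; the FCN recipes pass "--model fcn".
    model_args = ["--model", model] if model else []
    train = [
        "python", "train.py",
        "--input_fn", train_csv,
        "--output_dir", result_dir,
        "--save_most_recent",
        *model_args,
        "--num_epochs", str(num_epochs),
    ]
    inference = [
        "python", "inference.py",
        "--input_fn", inference_csv,
        "--model_fn", f"{result_dir}most_recent_model.pt",
        "--output_dir", f"{result_dir}output/",
        *model_args,
    ]
    predictions = [
        "python", "independent_pairs_to_predictions.py",
        "--input_dir", f"{result_dir}output/",
        "--output_dir", f"{result_dir}submission/",
    ]
    return [train, inference, predictions]


commands = baseline_commands(
    "fcn_both_baseline",
    "data/splits/training_set_naip_nlcd_both.csv",
    "data/splits/val_inference_both.csv",
    model="fcn",
)
for command in commands:
    print(shlex.join(command))
```

To execute the commands instead of printing them, each list can be passed to `subprocess.run(command, check=True)` from the repo root.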

Results

NOTE: These results are from runs with --num_epochs 10 and with hard assignment. We have observed that training longer and using soft assignment gives better results.

| Class | NLCD difference | U-Net both | U-Net separate | FCN both | FCN separate |
|---|---|---|---|---|---|
| Water loss | 0.1481 | 0.2751 | 0.3381 | 0.6391 | 0.6712 |
| Tree Canopy loss | 0.1668 | 0.4828 | 0.4731 | 0.6299 | 0.6725 |
| Low Vegetation loss | 0.2818 | 0.4769 | 0.4667 | 0.4595 | 0.5504 |
| Impervious loss | 0.0144 | 0.2914 | 0.2669 | 0.2381 | 0.2627 |
| Water gain | 0.0310 | 0.1577 | 0.2417 | 0.2126 | 0.1534 |
| Tree Canopy gain | 0.0008 | 0.1478 | 0.2411 | 0.1181 | 0.1924 |
| Low Vegetation gain | 0.1058 | 0.3510 | 0.3465 | 0.5078 | 0.5562 |
| Impervious gain | 0.3622 | 0.5163 | 0.5142 | 0.5449 | 0.5651 |
| Average | 0.1389 | 0.3374 | 0.3610 | 0.4188 | 0.4530 |
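
The Average row is the unweighted mean of the eight per-class scores in each column. The short check below recomputes it from the values in the table; the variable names are ours, and the tolerance only allows for rounding to four decimals.

```python
import numpy as np

COLUMNS = ["NLCD difference", "U-Net both", "U-Net separate", "FCN both", "FCN separate"]

# Rows follow the table: the four loss classes, then the four gain classes.
SCORES = np.array([
    [0.1481, 0.2751, 0.3381, 0.6391, 0.6712],  # Water loss
    [0.1668, 0.4828, 0.4731, 0.6299, 0.6725],  # Tree Canopy loss
    [0.2818, 0.4769, 0.4667, 0.4595, 0.5504],  # Low Vegetation loss
    [0.0144, 0.2914, 0.2669, 0.2381, 0.2627],  # Impervious loss
    [0.0310, 0.1577, 0.2417, 0.2126, 0.1534],  # Water gain
    [0.0008, 0.1478, 0.2411, 0.1181, 0.1924],  # Tree Canopy gain
    [0.1058, 0.3510, 0.3465, 0.5078, 0.5562],  # Low Vegetation gain
    [0.3622, 0.5163, 0.5142, 0.5449, 0.5651],  # Impervious gain
])
REPORTED_AVERAGE = np.array([0.1389, 0.3374, 0.3610, 0.4188, 0.4530])

column_means = SCORES.mean(axis=0)
matches_reported = bool(np.allclose(column_means, REPORTED_AVERAGE, atol=1e-4))
best_method = COLUMNS[int(np.argmax(column_means))]
print(matches_reported, best_method)
```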

Visualizations

See the notebook here for examples of how to create the following types of figures:

<p align="center"> <img src="images/fcn_unet.png" width="430"/> </p>