Home

Awesome

<p align="center"> <h1>U-TILISE: A Sequence-to-sequence Model for Cloud Removal in Optical Satellite Time Series</h1> </p> <p align="center"> <h3 align="center"> <strong><sup>1</sup>Corinne Stucker, <sup>2</sup>Vivien Sainte Fare Garnot, <sup>1</sup>Konrad Schindler</strong> </h3> </p> <p align="center"> <strong><sup>1</sup> Chair of Photogrammetry and Remote Sensing, ETH Zurich</strong><br> <strong><sup>2</sup> Institute for Computational Science, University of Zurich</strong> </p> <p align="center"> <h3 align="center">[<a href="https://doi.org/10.1109/TGRS.2023.3333391">Paper</a>] [<a href="https://arxiv.org/abs/2305.13277">ArXiv</a>]</h3> </p>

Satellite image time series in the optical and infrared spectrum suffer from frequent data gaps due to cloud cover, cloud shadows, and temporary sensor outages. It has been a long-standing problem of remote sensing research how to best reconstruct the missing pixel values and obtain complete, cloud‑free image sequences. We approach that problem from the perspective of representation learning and develop U‑TILISE, an efficient neural model that is able to implicitly capture spatio-temporal patterns of the spectral intensities, and that can therefore be trained to map a cloud‑masked input sequence to a cloud‑free output sequence. The model consists of a convolutional spatial encoder that maps each individual frame of the input sequence to a latent encoding; an attention-based temporal encoder that captures dependencies between those per-frame encodings and lets them exchange information along the time dimension; and a convolutional spatial decoder that decodes the latent embeddings back into multi-spectral images. We experimentally evaluate the proposed model on EarthNet2021, a dataset of Sentinel-2 time series acquired all over Europe, and demonstrate its superior ability to reconstruct the missing pixels. Compared to a standard interpolation baseline, it increases the PSNR by 1.8 dB at previously seen locations and by 1.3 dB at unseen locations.

<image src="docs/teaser.png"/>

Setup

Dependencies

This code was developed using Ubuntu 22.04, Python 3.10, PyTorch 1.13, and CUDA 11.6. For an optimal experience, we recommend creating a new conda environment and installing the required dependencies with the following commands:

conda env create -f environment.yml
conda activate u-tilise

After setting up the environment, establish a corresponding IPython kernel named ``, . This is necessary to execute the demo Jupyter Notebook demo.ipynb:

ipython kernel install --user --name=u-tilise

Checkpoints

You can download our pretrained model checkpoints here. If you prefer an automated approach, execute the script below to download and extract all checkpoints to the ./checkpoints/ directory:

bash ./scripts/download_checkpoints.sh

We provide the following model checkpoints:

Sample data

For demonstration purposes, we offer the preprocessed iid test split of the EarthNet2021 dataset. To download this data, execute the following command:

bash ./scripts/download_data_earthnet2021.sh

This will download and unpack the data (~ 14 GB) into the ./data/ directory. The provided data sets are:

Data

EarthNet2021

EarthNet20211 provides Sentinel‑2 satellite image time series collected over Central and Western Europe from November 2016 to May 2020. Each time series comprises 30 images with Level-1C top-of-atmosphere (TOA) reflectances. The images are acquired in a regular temporal interval of five days. Every image is composed of the four spectral bands B2 (blue), B3 (green), B4 (red), and B8 (near-infrared) and covers a spatial extent of 128×128 pixels (2.56×2.56 km in scene space), resampled to the resolution of 20 m. Furthermore, the dataset includes pixel-wise cloud probability maps (training data only) and binary cloud (and cloud shadow) masks.

SEN12MS-CR-TS

SEN12MS-CR-TS2 provides globally sampled Sentinel‑2 satellite image time series from 2018 with a spatial extent of 256×256 pixels (2.56×2.56 km in scene space). Each time series contains 30 images. The images encompass all 13 spectral bands, upsampled to 10 m resolution. Furthermore, every optical image is paired with a spatially co-registered, temporally close (but not synchronous) C-band SAR image with two channels representing the $\sigma_0$ backscatter coefficients in the VV and VH polarizations, in units of decibels (dB). Furthermore, the dataset includes pixel-wise cloud probabilities and binary cloud masks.

Simulation of data gaps

To train and quantitatively assess U‑TILISE's performance, we use gap‑free (cloud‑free) Sentinel‑2 satellite image time series. Our preprocessing steps are as follows:

  1. We first identify all images with partially occluded pixels or images that are occluded/missing entirely by applying a threshold to the cloud probability maps (if available) or the binary cloud masks. We then remove all images with data gaps to produce cloud‑free time series that exhibit a valid observation for every spatio-temporal location.
  2. We discard time series with less than five remaining images, as we deem such sequences too short for learning spatio-temporal patterns.
  3. To generate synthetic data gaps, we randomly sample real cloud masks from other acquisition times and/or locations within the same Sentinel-2 tile and superimpose those masks onto the gap-free time series by setting the reflectance of occluded pixels (according to the masks) to the maximum value~1.

Custom dataset and data loader

Ensure the output of your custom data loader meets the following minimum requirements:

For visualization during training in Weights & Biases:

For visualization in demo.ipynb:

Preprocessing

:warning: Note: Skip this section if you don't intend to retrain U‑TILISE on the EarthNet2021 or the SEN12MS‑CR‑TS dataset.

To train U‑TILISE on the EarthNet2021 or the SEN12MS‑CR‑TS dataset, you will need to complete several steps: downloading the dataset, converting it from its native format to one compatible with our data loaders, and running a simulation to generate cloud‑free sequences with synthetically added data gaps. Here's a step-by-step guide:

EarthNet2021

  1. Dataset download

    Obtain the dataset by following the download instructions available here.

  2. Data conversion

    The EarthNet2021 dataset comprises two .npz files for every time series. We aggregate the data of each data split (i.e., train, iid, ood) into a single HDF5 file for further use. To run the npz-to-HDF5 conversion, execute the commands below:

    python ./toolbox/EarthNet2021_npz2hdf5.py --root_source <data_directory> --root_dest <output_directory> --split train --mode train
    python ./toolbox/EarthNet2021_npz2hdf5.py --root_source <data_directory> --root_dest <output_directory> --split train --mode val
    python ./toolbox/EarthNet2021_npz2hdf5.py --root_source <data_directory> --root_dest <output_directory> --split iid
    python ./toolbox/EarthNet2021_npz2hdf5.py --root_source <data_directory> --root_dest <output_directory> --split ood
    

    Replace data_directory with the root directory where you have saved your downloaded data and output_directory with your preferred destination for the HDF5 files.

    Upon executing the above commands, you should find the following HDF5 files in output_directory:

    • train.hdf5
    • iid_test_split.hdf5
    • ood_test_split.hdf5

    Besides the npz-to-HDF5 conversion, the script also stores the indices of unavailable frames and identifies all frames with partially or fully occluded pixels for each time series in the respective data split.

  3. Preprocessing of the validation split

    To generate time series with artificial data gaps for training, run the following command:

    python ./toolbox/simulate_dataset.py --config_file ./data/configs/config_earthnet2021_simulation_val.yaml --out_dir <output_directory> --out_hdf5_filename earthnet2021_val_simulation.hdf5
    

    Make sure to set root in the config_earthnet2021_simulation_val.yaml configuration file to match the output_directory used in step 2. Optionally, change max_seq_length to modify the fixed maximal temporal length $T$.

    This step creates a new HDF5 file /output_directory/earthnet2021_val_simulation.hdf5. Among others, this file contains the temporally trimmed cloud‑free validation time series, the corresponding time series with synthetically introduced data gaps, the masks used for masking, and the acquisition dates.

  4. Preprocessing of the test splits

    To generate time series with artificial data gaps for testing and evaluation, execute the command below:

    python ./toolbox/simulate_dataset.py --config_file ./data/configs/config_earthnet2021_simulation_test.yaml --out_dir <output_directory> --out_hdf5_filename earthnet2021_iid_test_split_simulation.hdf5
    

    Again, configure root in the config_earthnet2021_simulation_test.yaml configuration file to match the output_directory used in step 2. If you wish to process the ood test split instead, set the split parameter to ood and adjust --out_hdf5_filename accordingly.

    Note that max_seq_length in the config_earthnet2021_simulation_test.yaml configuration file is set to None to skip the temporal trimming of the test time series.

SEN12MS-CR-TS

  1. Dataset download

    Follow the provided instructions to download the dataset.

  2. Data conversion

    To convert and aggregate the .tif files from individual acquisitions into a single HDF5 file for each data split (i.e., train, val, test), use the functionalities provided here.

  3. Detection of real data gaps

    Execute the following script to identify all frames that have partial or complete occlusions:

    python ./toolbox/SEN12MSCRTS_detect_cloudy_frames.py --root <data_directory> --split train
    python ./toolbox/SEN12MSCRTS_detect_cloudy_frames.py --root <data_directory> --split val
    python ./toolbox/SEN12MSCRTS_detect_cloudy_frames.py --root <data_directory> --split test
    

    Upon completion, the HDF5 files created in step 2 will be updated with indices corresponding to images exhibiting data gaps.

  4. Preprocessing of the validation and test splits

    To generate time series with artificial data gaps for training and evaluation, execute the commands below:

    python ./toolbox/simulate_dataset.py --config_file ./data/configs/config_sen12mscrts_simulation_val.yaml --out_dir <output_directory> --out_hdf5_filename sen12mscrts_val_simulation.hdf5
    python ./toolbox/simulate_dataset.py --config_file ./data/configs/config_sen12mscrts_simulation_test.yaml --out_dir <output_directory> --out_hdf5_filename sen12mscrts_test_simulation.hdf5
    

    For detailed instructions on each parameter and step, refer to the instructions provided for the EarthNet2021 dataset above.

Training

To initiate training, execute the following command:

python run_train.py /path/to/config_file.yaml --save_dir <output_directory> 

where:

All training hyperparameters are predefined in default.yaml and set to the values used in the main experiments of the paper. If a parameter is specified in /path/to/config_file.yaml, it will override the default value.

To view all available training options, run:

python run_train.py -h

Example configuration files

We provide a collection of configuration files within the ./configs/ directory:

Upon execution, the run_train.py script combines the default.yaml with the provided --config_file, saving the resultant runtime configuration as a YAML file in the directory defined by --save_dir. As a point of reference, you can find an example runtime configuration at demo_train_config.yaml. We will use this YAML file below to demonstrate the evaluation procedure.

Evaluation

To perform the evaluation, execute the run_eval.py script using the following command:

python run_eval.py /path/to/config.yaml <method_name> --checkpoint <path_to_checkpoint> --test-data.data-dir <data_directory> --test-data.hdf5-file <hdf5_file_name> --test-data.split <data_split>

To view all available input options, run:

python run_eval.py -h

Examples

  1. To evaluate the iid test split of the EarthNet2021 dataset, use the following command:
python run_eval.py ./configs/demo_train_config.yaml utilise --test-data.data-dir ./data/ --test-data.hdf5-file earthnet2021_iid_test_split_simulation.hdf5 --test-data.split iid --checkpoint ./checkpoints/utilise_earthnet2021.pth 
  1. Similarly, to evaluate the linear interpolation baseline, run:
python run_eval.py ./configs/demo_train_config.yaml trivial --mode linear_interpolation --test-data.data-dir ./data/ --test-data.hdf5-file earthnet2021_iid_test_split_simulation.hdf5 --test-data.split iid

Demo

Inference

Check out the Jupyter Notebook demo.ipynb, which provides a step-by-step demonstration of using U‑TILISE to impute a given time series and visualize the associated attention masks.

Citation

@article{stucker2023u,
  title={{U-TILISE}: A Sequence-to-sequence Model for Cloud Removal in Optical Satellite Time Series},
  author={Stucker, Corinne and Garnot, Vivien Sainte Fare and Schindler, Konrad},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  year={2023},
  volume={61}
}

Acknowledgements

U‑TILISE extends the architecture of U-TAE to a full 3D spatio-temporal sequence-to-sequence model that preserves the temporal dimension.

We thank Vivien Sainte Fare Garnot for his efforts in open sourcing and maintaining U-TAE.3

Footnotes

  1. C. Requena-Mesa, V. Benson, M. Reichstein, J. Runge, and J. Denzler, EarthNet2021 :A large-scale dataset and challenge for earth surface forecasting as a guided video prediction task, in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp.1132–1142.

  2. P. Ebel, Y. Xu, M. Schmitt, and X. X. Zhu, SEN12MS-CR-TS: A remote-sensing dataset for multimodal multitemporal cloud removal, IEEE Transactions on Geoscience and Remote Sensing, vol.60, pp. 1-14, 2022.

  3. V.S.F. Garnot and L. Landrieu, Panoptic segmentation of satellite image time series with convolutional temporal attention networks, in IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp.4872–4881.