Proba-V Super Resolution Challenge

Submission Report for Incubit Technical Assignment

The following report outlines the given problem, describes data preprocessing and model selection, and presents results. I'd like to draw special attention to the conclusion, which holds an interesting approach that wasn't tackled in these experiments due to time constraints.

Problem Statement

Proba-V is an Earth observation satellite that actively maps land cover and, more specifically, vegetation growth.

The satellite captures an average of 15 low resolution images of every covered location, which are to be 'super resolved' into high resolution images.

Of these 15 images, many are obscured by artefacts such as clouds, which adds the challenge of working around these obscurations.

Each low resolution image is accompanied by a "Quality Map" in which black pixels indicate the position of an obscuration.

Thus we have two problems that ought to be tackled:

  1. Interpolate obscured pixels and set up training data
  2. Construct and test super resolution neural network architectures

Data Approach

In this section the data preprocessing and training data set up are described.

Filling out obscured pixels

Under normal circumstances, a technique such as bicubic interpolation could be used to fill out obscured pixels of the low resolution images. However, since we have multiple images of the same scene, we can instead fill out obscured pixels with the mean or median of the corresponding pixels across all of that scene's images.

The above image shows interpolation of obscured pixels using the median.
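
As a minimal sketch of this approach (assuming a scene's images and quality maps are loaded as NumPy arrays, with True in a quality map marking a clear pixel; the function name is illustrative rather than taken from the repository):

    import numpy as np

    def fill_obscured_pixels(images, quality_maps):
        """Fill obscured pixels with the per-pixel median of the clear
        observations across all low resolution images of a scene.

        images:       float array of shape (n_images, height, width)
        quality_maps: boolean array of the same shape; True marks a clear pixel
        """
        images = np.asarray(images, dtype=np.float64)
        # Hide obscured pixels so the median only considers clear observations
        masked = np.where(quality_maps, images, np.nan)
        medians = np.nanmedian(masked, axis=0)
        # Note: a pixel obscured in every image stays NaN and needs a fallback
        return np.where(quality_maps, images, medians)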

Setting up training data

Two options were considered for setting up the training data:

  1. Pair every low resolution image of a scene with that scene's high resolution image, giving many near-duplicate training pairs per scene.
  2. Pair a single merged low resolution image per scene (e.g. the median-interpolated image from the previous step) with its high resolution counterpart.

Considering that images within a scene are largely the same, with mostly minor aberrations, we opt for training using option 2. Option 2 is computationally lighter because the resulting data set is smaller, but can make up for it through data augmentation techniques.
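
Because option 2 yields fewer training pairs, augmentation matters. A common form of augmentation for super resolution is to apply the same random flip or rotation to each low/high resolution pair, so that the geometric correspondence between the two is preserved. A small sketch (the function is illustrative, not from the repository):

    import numpy as np

    def augment_pair(lr, hr, rng):
        """Apply one identical random rotation/flip to a LR/HR image pair."""
        k = int(rng.integers(0, 4))            # 0-3 quarter rotations
        lr, hr = np.rot90(lr, k), np.rot90(hr, k)
        if rng.random() < 0.5:                 # random horizontal flip
            lr, hr = np.fliplr(lr), np.fliplr(hr)
        return lr.copy(), hr.copy()

    # Example usage: augment_pair(lr, hr, np.random.default_rng(0))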

Methodology

There have been many exciting neural network based super resolution models published in the past few years, which makes the decision of which to use quite difficult. For this reason, the constraints of the project are introduced first.

Project Constraints

Model selection

The table below presents a few of the models that were considered, together with their papers.

Model Name                                                 | Model Type                          | Comments
Fast Super Resolution CNN (FSRCNN)                         | CNN                                 | Straightforward to train and has seen promising results (2015 Paper)
Channel-Wise and Spatial Feature Modulation network (CSFM) | CNN with feature-modulation memory  | Establishes long-term skip connections (2018 Paper)
Laplacian Pyramid Super-Resolution Network (LapSRN)        | Progressive CNN                     | Improved parameter sharing, which reduces computational complexity (2018 Paper)
Super Resolution Generative Adversarial Network (SRGAN)    | GAN                                 | Has seen impressive results; difficult to train (2017 Paper)

It was decided not to pursue the GAN, as GANs are notoriously unstable to train and carry the risk of taking up a large amount of time that could be better spent elsewhere. Had more time been available, the GAN would likely have been one of the first experiments.

Final Selection

For the experiment we opt for the following two models:

  1. Fast Super Resolution CNN (FSRCNN)

As Occam's Razor suggests, sometimes the simplest solution is the best. Although still widely used, this is a relatively simple and straightforward network consisting of only three parts: a filter, a non-linear mapping and a reconstruction step. It was decided to start off with this simpler model and work our way up to more complex models once the problem is better understood. (A minimal sketch of the architecture is given after this list.)

  2. Channel-Wise and Spatial Feature Modulation network (CSFM)

The second architecture is more complex and makes use of special "FMM" (feature-modulation memory) modules for improved feature extraction, along with skip connections that help carry spatial information to later layers. This information often gets lost when a CNN is deep.

These two models make an interesting comparison, as their relative performance will tell us a lot about the data. From the FSRCNN we get an indication of whether the problem needs larger model capacity; from the CSFM we learn whether carrying long-term information across the network is useful in our case, where spatial features might be vague.
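
To make the FSRCNN's structure concrete, below is a minimal sketch in PyTorch. The framework choice and the single-channel input are assumptions, this is not the repository's implementation, and the hyperparameter defaults (d=56, s=12, m=4) follow the FSRCNN paper; the shrink/expand convolutions around the mapping stage are part of the FSRCNN design.

    import torch.nn as nn

    class FSRCNNSketch(nn.Module):
        """FSRCNN-style network: feature extraction (the 'filter'),
        non-linear mapping layers, and deconvolution reconstruction."""
        def __init__(self, scale=3, channels=1, d=56, s=12, m=4):
            super().__init__()
            layers = [nn.Conv2d(channels, d, 5, padding=2), nn.PReLU(d),  # feature extraction
                      nn.Conv2d(d, s, 1), nn.PReLU(s)]                    # shrink
            for _ in range(m):                                            # non-linear mapping
                layers += [nn.Conv2d(s, s, 3, padding=1), nn.PReLU(s)]
            layers += [nn.Conv2d(s, d, 1), nn.PReLU(d)]                   # expand
            self.features = nn.Sequential(*layers)
            # Reconstruction: learned upsampling by the scale factor
            self.reconstruct = nn.ConvTranspose2d(d, channels, 9, stride=scale,
                                                  padding=4, output_padding=scale - 1)

        def forward(self, x):
            return self.reconstruct(self.features(x))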

Experimental Setup

For our experiments, the two selected architectures are tested against the baseline bicubic interpolation method.

Results

The figure below shows a sample from the testing data, which was used to make predictions with the chosen models and with bicubic interpolation.

Looking closer at the zoomed in section of the image, both the FSRCNN and the CSFM models produced smoother edges compared to bicubic interpolation.

We measure the performance of the different techniques by calculating the mean squared error (MSE) over the test set for every model. The MSE can be interpreted as how close a super resolved image is to its high resolution counterpart.
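
Concretely, the test-set MSE reduces to a one-liner over stacked predictions and targets (a sketch; the repository may compute it differently):

    import numpy as np

    def test_set_mse(predictions, targets):
        """Mean squared error between super resolved outputs and their
        high resolution counterparts, averaged over the whole test set."""
        predictions = np.asarray(predictions, dtype=np.float64)
        targets = np.asarray(targets, dtype=np.float64)
        return float(np.mean((predictions - targets) ** 2))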

We see that both models outperformed bicubic interpolation. This is a very positive result, as beating bicubic interpolation was the aim of the Proba-V challenge.

Conclusion

Both our models performed well in comparison to bicubic interpolation, and the insight gained from these experiments can support future work on improving super resolved Proba-V images.

One approach that was not explored here due to time constraints is the progressive growing of GANs. It essentially involves slowly growing the resolution of a GAN during training. This allows the model to first learn low level features, which eases the learning of high level features later on. This architecture could be translated to the super resolution problem and would make an incredibly exciting research topic.

Usage

  1. Prepare data

    Running 'python3 data_utilities.py' will download and prepare the data for you.

    Any customisation thereof can be done through variables defined at the bottom of the script.

  2. Modelling script

    Run 'python3 main.py' to train your model.

    After training is complete, testing output will automatically be saved. Intermediate graphs will be saved to ./plots/, whilst testing results will be saved to ./result/.

    The following arguments are important:

    • --arch: specifies the model type ["FSRCNN" or "CSFM"]
    • --train: True for training, False for testing
    • --epoch: the number of training iterations
    • --learning_rate: learning rate for the Adam optimizer
    • --batch_size: training batch size (default is 32)
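
    For illustration, a training run might look as follows (the specific values are placeholders, not tuned settings):

    'python3 main.py --arch FSRCNN --train True --epoch 10 --learning_rate 0.0001 --batch_size 32'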
