# Satellite Image Segmentation Using PyTorch
This repo contains a U-Net implementation for satellite image segmentation.
## Data preparation
Because of a severe lack of training data, several pre-processing steps are taken to alleviate it. First, the annotations are converted from their JSON files to image masks. Then, the satellite images are tiled down to tiles of size 512x512, so that they can be fed into a fully convolutional network (FCN) for semantic segmentation.
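A tiling step along these lines might look like the sketch below. This is illustrative only; the function name and exact behavior (non-overlapping tiles, no border padding) are assumptions, not the repo's actual code:

```python
import numpy as np

def tile_image(image: np.ndarray, tile_size: int = 512) -> list:
    """Split an H x W x C image into non-overlapping tile_size x tile_size tiles.

    Illustrative sketch; the repo's own tiling code may pad borders or
    overlap tiles differently. The same tiling would be applied to the masks.
    """
    tiles = []
    h, w = image.shape[:2]
    for y in range(0, h - tile_size + 1, tile_size):
        for x in range(0, w - tile_size + 1, tile_size):
            tiles.append(image[y:y + tile_size, x:x + tile_size])
    return tiles
```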
The augmentations are:
- `blur`: combination of median and bilateral blur
- `bright increase`: increase brightness artificially
- `distort`: elastic deformation of the image
- `gaussian blur`: Gaussian blurring
- `HSV`: convert channels to HSV
- `median blur`: median blur
- `mirror`: mirror the image
- `rotation invariance`: apply rotation invariance (see `data_utils.augment.rotation_invariance()` for details)
- `crop + resize`: crop the image randomly and resize to the size expected by the network (the crop is applied to both image and mask)
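As a rough illustration of two of these, the `blur` and `mirror` augmentations could be implemented with OpenCV along the following lines. This is a sketch only; the actual implementations live in `data_utils.augment`, and the parameter values here are assumptions:

```python
import cv2
import numpy as np

def blur(image: np.ndarray) -> np.ndarray:
    """Combined median + bilateral blur (illustrative kernel/filter values)."""
    out = cv2.medianBlur(image, 5)
    return cv2.bilateralFilter(out, d=9, sigmaColor=75, sigmaSpace=75)

def mirror(image: np.ndarray, mask: np.ndarray):
    """Horizontal mirror, applied to image and mask together so they stay aligned."""
    return cv2.flip(image, 1), cv2.flip(mask, 1)
```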
### sample of applied augmentations

*Image grid showing examples of: blur, bright increase, distort, gaussian blur, HSV shift, median blur, mirror, gaussian blur + rotation, distort + rotation, rotation invariance, and crop + resize.*
## Network & Training
U-Net is used as the base model for segmentation. The original intention was to use U-Net to establish baseline results, and then train PSPNet (Pyramid Scene Parsing Network) from a pretrained satellite segmentation model for comparison, but time did not allow for this.
### training parameters
U-Net was trained for ~~50~~ 43 epochs, with a batch size of 4.
The base parameters for training can be seen, and adjusted, in `run_training.py`:
```python
from easydict import EasyDict

def setup_run_arguments():
    args = EasyDict()
    args.epochs = 50            # number of training epochs
    args.batch = 4              # batch size
    args.val_percent = 0.2      # fraction of the data held out for validation
    args.n_classes = 4          # number of segmentation classes
    args.n_channels = 3         # input channels (RGB)
    args.num_workers = 8        # dataloader worker processes
    args.learning_rate = 0.001
    args.weight_decay = 1e-8
    args.momentum = 0.9
    args.save_cp = True         # save checkpoints during training
    args.loss = "CrossEntropy"
    return args
```
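For context, here is a sketch of how these parameters might be consumed. The `UNet` constructor signature and the SGD optimizer are assumptions (the `momentum` and `weight_decay` fields suggest SGD); check `run_training.py` for the actual setup:

```python
import torch

args = setup_run_arguments()
model = UNet(n_channels=args.n_channels, n_classes=args.n_classes)  # assumed constructor
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=args.learning_rate,
    momentum=args.momentum,
    weight_decay=args.weight_decay,
)
criterion = torch.nn.CrossEntropyLoss()  # matches args.loss = "CrossEntropy"
```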
The trained model is provided in `checkpoints/unet-augment-final.pth`.
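Loading it follows the usual PyTorch pattern (a minimal sketch, assuming the repo's `UNet` class takes the channel/class counts shown above):

```python
import torch

model = UNet(n_channels=3, n_classes=4)  # assumed constructor signature
model.load_state_dict(torch.load("checkpoints/unet-augment-final.pth", map_location="cpu"))
model.eval()  # switch to inference mode
```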
## Generating JSON annotations from U-Net predictions
The expected output is a JSON annotation file containing the vector points corresponding to the classes. A function for generating such a file is provided in `predict.prediction_to_json(...)`. A Python notebook is provided showing how to generate the JSON file, as well as how to generate a color mask image from it.
The example notebook, `example.ipynb`, shows how to load the trained model and use it to create the annotation file. It also shows how to create colored image masks directly from the annotation. (The example notebook was created with U-Net after 35 epochs.)
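Conceptually, turning a predicted class mask into vector points can be done with contour extraction. The sketch below is illustrative only; the real logic lives in `predict.prediction_to_json(...)`, and the JSON structure shown here is an assumption:

```python
import json
import cv2
import numpy as np

def mask_to_json(mask: np.ndarray, n_classes: int = 4) -> str:
    """Convert an H x W class-index mask into per-class polygon points.

    Illustrative sketch; the actual schema is defined by
    predict.prediction_to_json.
    """
    annotations = []
    for cls in range(1, n_classes):  # class 0 assumed to be background
        binary = (mask == cls).astype(np.uint8)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for contour in contours:
            points = contour.squeeze(1).tolist()  # (N, 1, 2) -> [[x, y], ...]
            annotations.append({"class": int(cls), "points": points})
    return json.dumps(annotations)
```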
## Discussion
There are several improvements that could be made. For starters, some of the augmentation methods could be replaced or dropped entirely, such as HSV; a simple color shift could have been used instead. Another major issue that should have been addressed from the start is class imbalance: it would have been better to apply augmentations in proportion to class frequency, oversampling the rarer classes to even out the distribution.
Another obvious issue is that U-Net was designed for medical image segmentation; it is often used as a baseline model because it is small and easy to implement. A more suitable network would have been PSPNet, as mentioned above. Similarly, several pretrained models exist that could have been used for transfer learning. This, coupled with more meaningful augmentations, would likely have yielded a better model and better results.