
Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery

This GitHub repository contains the machine learning models described in Stefan Bachhofner, Ana-Maria Loghin, Michael Hornacek, Johannes Otepka, Andrea Siposova, Niklas Schmidinger, Norbert Pfeifer, Kurt Hornik, Nikolaus Schiller, Olaf Kähler, Ronald Hochreiter: Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery.

@article{remoteSensing2020gscnn,
	title={Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery},
	author={Bachhofner, Stefan and Loghin, Ana-Maria and Otepka, Johannes and Pfeifer, Norbert and Hornacek, Michael and Siposova, Andrea and Schmidinger, Niklas and Hornik, Kurt and Schiller, Nikolaus and K\"ahler, Olaf and Hochreiter, Ronald},
	journal={Remote Sensing},
	volume={12},
	number={8},
	article-number={1289},
	year={2020},
	month={April},
	day={18},
	publisher={Multidisciplinary Digital Publishing Institute},
	url={https://www.mdpi.com/2072-4292/12/8/1289#cite},
	issn={2072-4292},
	doi={10.3390/rs12081289}
}

ToDo List

Add docker
Add python training script for GSCNN
Add R training script for the decision tree
Release Data to public (if possible)

1. Instructions
1.1 Installation Instructions
1.2 Usage Instructions
2. Paper
2.1 Abstract
2.3 Tables and Figures
2.3.1 Segmentation Results
2.3.2 Study Area
3. General Information
3.1 Authors by Institution
3.2 Project Partners
3.3 Funding

Instructions

Installation Instructions

Requirements

With PyTorch Hub (Recommended)

Ubuntu 14.04 or higher
Python 3.6 or higher
CUDA 10.0 or higher
PyTorch 1.3 or higher

Without PyTorch Hub

Ubuntu 14.04 or higher
Python 3.6 or higher
CUDA 10.0 or higher
PyTorch 1.2 or higher

Installation

We recommend using Anaconda to keep the environment isolated.

The following command creates the conda environment py3-mink and installs the necessary Python dependencies.

conda env create -f py3-mink.yml

To install the Minkowski Engine in the created environment, run:

conda activate py3-mink
sh install_minkowski_engine.sh
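After installation, a quick check can confirm that the environment matches the requirements listed above. The following is a minimal sketch and not part of this repository: the helper names `parse_version`, `meets_minimum`, and `missing_modules` are made up for illustration, and the module names are assumed from the imports and calls in the usage example in this README.

```python
import importlib.util
import re


def parse_version(version):
    """Turn a version string such as '1.3.0+cu100' into the tuple (1, 3, 0)."""
    numbers = []
    # Drop a local build suffix like '+cu100', then read the leading digits
    # of each dot-separated component.
    for part in version.split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum`."""
    have, need = parse_version(version), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '1.3' and '1.3.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules used by the usage example in this README (assumed list).
print("Missing modules:", missing_modules(["torch", "MinkowskiEngine", "laspy", "numpy"]))
# The PyTorch Hub route requires PyTorch 1.3 or higher.
print("1.3.0+cu100 is new enough:", meets_minimum("1.3.0+cu100", "1.3"))
```

In a real check, `torch.__version__` would be passed to `meets_minimum` in place of the literal version string.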

Usage Instructions

import numpy as np
import torch
import MinkowskiEngine as ME

# For reading LAS/LAZ point cloud files (laspy 1.x API)
from laspy.file import File


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


def predict(model, features, coordinates):
    '''
        Takes the given model and returns its predictions for the given features,
        and coordinates. Note that only the features are used for making the predictions.

        The predictions are sent back to the cpu and returned as a numpy array.
    '''
    model.eval()
    model.to(device)

    point_cloud = ME.SparseTensor(features, coords=coordinates).to(device)

    with torch.no_grad():
        output = model(point_cloud)

    # output.F holds the per-point class scores; pick the highest-scoring class.
    _, y_pred = torch.max(output.F, dim=1)

    return y_pred.cpu().numpy()


def load_point_cloud(path_to_point_cloud):
    '''
        Opens a point_cloud in read mode.
    '''
    return File(path_to_point_cloud, mode="r")


def load_coordinates_from_point_cloud(path_to_point_cloud):
    '''
        Returns a numpy array of the point cloud's coordinates.
    '''
    point_cloud = load_point_cloud(path_to_point_cloud=path_to_point_cloud)
    coordinates = np.vstack([point_cloud.X, point_cloud.Y, point_cloud.Z]).transpose()
    return coordinates


def normalize_coordinates(coordinates, denominator=10000):
    '''
        Scales the given coordinates by dividing them by denominator. The result
        lies in the range [0, 1] only if every coordinate lies in [0, denominator].
    '''
    return np.divide(coordinates, denominator)


model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates')
coordinates = load_coordinates_from_point_cloud(path_to_point_cloud="./data/my_point_cloud.laz")
features = normalize_coordinates(coordinates=coordinates)
y_pred = predict(model=model, features=features, coordinates=coordinates)
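The array returned by `predict` contains one class index per point. A small summary of how often each class was predicted can help to sanity-check the output. The sketch below is an illustration only: the helper `summarize_predictions` is hypothetical, and the mapping from class index to class name is an assumption that follows the order in which the abstract lists the classes. Please verify the mapping against the paper before relying on it.

```python
from collections import Counter

# ASSUMPTION: index-to-name mapping in the order the abstract lists the classes.
# The label encoding used for training may differ.
CLASS_NAMES = ["clutter", "roads", "buildings", "trees", "vehicles"]


def summarize_predictions(y_pred, class_names=CLASS_NAMES):
    """Return the fraction of points predicted for each class name."""
    counts = Counter(int(label) for label in y_pred)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in class_names}
    return {name: counts.get(index, 0) / total for index, name in enumerate(class_names)}


# Toy predictions; a numpy array of class indices works the same way.
print(summarize_predictions([0, 0, 3, 1]))
```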

Examples

Get a list of all entrypoints we provide

import torch

entrypoints = torch.hub.list('MacOS/ReKlaSat-3D', force_reload=True)

print(entrypoints)

Load the model and weights from the coordinates experiments

import torch

model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates')

import torch

model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_epoch', epoch=40)

Load the model and weights from the coordinates and colors experiments

import torch

model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_colors')

import torch

model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_colors_epoch', epoch=40)

Load the model and weights from the coordinates and colors with median class weights experiments

import torch

model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_colors_weighted')

import torch

model = torch.hub.load('MacOS/ReKlaSat-3D', 'coordinates_colors_weighted_epoch', epoch=149)
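The `*_epoch` entrypoints shown above take an `epoch` argument, so several checkpoints of the same experiment can be loaded for comparison. The following sketch assumes nothing beyond that calling convention. The helper `load_checkpoints` is hypothetical, and it receives the loader as a parameter so that it can be tried out without network access. Which epochs are actually available is not stated here; only 40 and 149 appear in the examples above.

```python
def load_checkpoints(load, repo, entrypoint, epochs):
    """Load one model per epoch and return them in a dict keyed by epoch.

    `load` is any callable with the calling convention used above for
    torch.hub.load, i.e. load(repo, entrypoint, epoch=...).
    """
    return {epoch: load(repo, entrypoint, epoch=epoch) for epoch in epochs}


# Intended use (requires torch and network access, hence commented out):
# import torch
# models = load_checkpoints(torch.hub.load, 'MacOS/ReKlaSat-3D',
#                           'coordinates_epoch', [40])
```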

Only get MinkUNet34C

import torch

model = torch.hub.load('MacOS/ReKlaSat-3D', 'get_minkunet34c')

Paper

Abstract

We studied the applicability of point clouds derived from tri-stereo satellite imagery for semantic segmentation for generalized sparse convolutional neural networks by the example of an Austrian study area. We examined, in particular, if the distorted geometric information, in addition to color, influences the performance of segmenting clutter, roads, buildings, trees, and vehicles. In this regard, we trained a fully convolutional neural network that uses generalized sparse convolution one time solely on 3D geometric information (i.e., 3D point cloud derived by dense image matching), and twice on 3D geometric as well as color information. In the first experiment, we did not use class weights, whereas in the second we did. We compared the results with a fully convolutional neural network that was trained on a 2D orthophoto, and a decision tree that was once trained on hand-crafted 3D geometric features, and once trained on hand-crafted 3D geometric as well as color features. The decision tree using hand-crafted features has been successfully applied to aerial laser scanning data in the literature. Hence, we compared our main interest of study, a representation learning technique, with another representation learning technique, and a non-representation learning technique. Our study area is located in Waldviertel, a region in Lower Austria. The territory is a hilly region covered mainly by forests, agriculture, and grasslands. Our classes of interest are heavily unbalanced. However, we did not use any data augmentation techniques to counter overfitting. For our study area, we reported that geometric and color information only improves the performance of the Generalized Sparse Convolutional Neural Network (GSCNN) on the dominant class, which leads to a higher overall performance in our case. We also found that training the network with median class weighting partially reverts the effects of adding color. 
The network also started to learn the classes with lower occurrences. The fully convolutional neural network that was trained on the 2D orthophoto generally outperforms the other two with a kappa score of over 90% and an average per class accuracy of 61%. However, the decision tree trained on colors and hand-crafted geometric features has a 2% higher accuracy for roads.
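The abstract mentions median class weighting, but this README does not give the formula. One common formulation is median frequency balancing, where each class receives the weight "median class frequency divided by the frequency of that class", so that rare classes get weights above 1. The sketch below implements a simplified variant of that idea with global class frequencies. It is an assumption about the general technique, and the paper's exact definition may differ. The function name and the toy labels are made up for illustration.

```python
from collections import Counter
from statistics import median


def median_frequency_weights(labels):
    """Return {class: median class frequency / frequency of that class}."""
    counts = Counter(labels)
    total = sum(counts.values())
    frequencies = {cls: count / total for cls, count in counts.items()}
    median_frequency = median(frequencies.values())
    return {cls: median_frequency / freq for cls, freq in frequencies.items()}


# Toy labels, not the class distribution of the study area.
labels = ["clutter"] * 6 + ["roads"] * 3 + ["vehicles"] * 1
weights = median_frequency_weights(labels)
print(weights)
```

With such weights, errors on rare classes contribute more to the loss than errors on the dominant class.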

Tables and Figures

Segmentation Results

Quantitative overall comparison of the GSCNN, FCN-8s, and the decision tree, using six commonly used metrics computed from the segmentation results. The best value for each metric (that is, in each column) is highlighted in bold, and the best value among the GSCNN models in italics. Please see the paper for a class-level comparison.

Models	Avg. Precision (%)	Avg. Recall (%)	Avg. F1 (%)	Kappa (%)	OA (%)	Avg. per Class Acc. (%)
baseline A	12.85	20.00	15.64	47.33	64.25	20.00
U-Net based GSCNN (3D)
Coordinates	23.69	24.33	23.30	38.90	56.01	24.32
Coordinates, Colors	19.31	19.98	17.38	45.07	62.14	19.97
Coordinates, Colors, Weighted Loss	21.92	22.24	21.36	34.30	51.07	22.22
FCN-8s (2D)
Colors	62.43	61.15	59.12	90.76	96.11	61.15
Decision Tree (3D)
Coordinates	43.89	38.73	39.54	82.00	89.10	38.73
Coordinates, Colors	61.03	58.72	58.96	86.60	93.18	58.71
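Metrics of the kind listed in the table are usually derived from a confusion matrix. The sketch below uses the standard textbook definitions (macro-averaged precision, recall, and F1, Cohen's kappa, and overall accuracy) on a toy matrix. It is not the evaluation code of the paper, whose exact computation may differ. The average per-class accuracy is taken here to be the mean per-class recall, which is an assumption; it is at least in line with the two columns being nearly identical in the table. The function name and the matrix values are made up for illustration.

```python
def segmentation_metrics(confusion):
    """Compute summary metrics from a square confusion matrix.

    confusion[i][j] is the number of points with true class i that were
    predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]  # points per true class
    col_sums = [sum(row[j] for row in confusion) for j in range(n_classes)]  # points per predicted class
    diagonal = [confusion[i][i] for i in range(n_classes)]  # correct points

    # Per-class scores; a class without points contributes 0.
    recall = [d / r if r else 0.0 for d, r in zip(diagonal, row_sums)]
    precision = [d / c if c else 0.0 for d, c in zip(diagonal, col_sums)]
    f1 = [2 * p * r / (p + r) if p + r else 0.0 for p, r in zip(precision, recall)]

    overall_accuracy = sum(diagonal) / total
    # Agreement expected by chance, from the marginal distributions.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall_accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    return {
        "avg_precision": sum(precision) / n_classes,
        "avg_recall": sum(recall) / n_classes,
        "avg_f1": sum(f1) / n_classes,
        "kappa": kappa,
        "overall_accuracy": overall_accuracy,
    }


# Toy 3-class confusion matrix (rows: true class, columns: predicted class).
toy = [
    [50, 2, 3],
    [5, 20, 5],
    [2, 3, 10],
]
metrics = segmentation_metrics(toy)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f} %")
```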

Overall accuracy progress over epochs for the GSCNN models. For the model that uses the weighted loss, only the first 50 epochs are shown.

<object data="./overall_accuracy_gscnn_models.pdf" type="application/pdf" width="1080px" height="520px"> <embed src="./overall_accuracy_gscnn_models.pdf"> This browser does not support PDFs. Please download the PDF to view it: <a href="./overall_accuracy_gscnn_models.pdf">Download PDF</a>. </embed> </object>

Study Area

Waldviertel, Lower Austria: (a) Overview map of Austria with the location of the study area marked; (b) Pléiades orthophoto of Waldviertel; the selected area used for semantic segmentation is marked in yellow.

Examples of point clouds derived from tri-stereo satellite imagery for each class: (a) Clutter; (b) Roads; (c) Buildings; (d) Trees; (e) Vehicles.

General Information

Authors by Institution

Project Partners

Vienna University of Economics and Business, Research Institute for Computational Methods. (Project Coordinator)

Vienna University of Technology, Department of Geodesy and Geoinformation.

Siemens AG Österreich, Corporate Technology.

Vermessung Schmid ZT GmbH.

Federal Ministry of Defence, Austria.

Funding

This research was funded by the Austrian Research Promotion Agency (FFG) project “3D Reconstruction and Classification from Very High Resolution Satellite Imagery (ReKlaSat 3D)” (grant agreement No. 859792).