Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds

Created by Huan Lei, Naveed Akhtar and Ajmal Mian

Introduction

This work is a significant extension of our original work presented at IEEE CVPR 2019, and was accepted to TPAMI in March 2020.

We propose a spherical kernel for efficient graph convolution of 3D point clouds. Our metric-based kernels systematically quantize the local 3D space to identify distinctive geometric relationships in the data. Similar to the regular grid CNN kernels, the spherical kernel maintains translation-invariance and asymmetry properties, where the former guarantees weight sharing among similar local structures in the data and the latter facilitates fine geometric learning. The proposed kernel is applied to graph neural networks without edge-dependent filter generation, making it computationally attractive for large point clouds. In our graph networks, each vertex is associated with a single point location and edges connect the neighborhood points within a defined range. The graph gets coarsened in the network with farthest point sampling. Analogous to the standard CNNs, we define pooling and unpooling operations for our network. We demonstrate the effectiveness of the proposed spherical kernel with graph neural networks for point cloud classification and semantic segmentation using ModelNet, ShapeNet, RueMonge2014, ScanNet and S3DIS datasets.
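The two geometric ideas in the paragraph above can be illustrated with a minimal pure-Python sketch. It is not the CUDA implementation shipped in this repository. The function names, the uniform splits, the default bin counts (8 x 2 x 2) and the extra bin 0 for the center point are assumptions made for illustration only.

```python
import math

def spherical_bin(dx, dy, dz, radius, n_azimuth=8, n_elevation=2, n_radial=2):
    """Map a neighbor offset (dx, dy, dz) to a kernel bin index.

    Assumed layout: bin 0 is the center point itself, the other bins split
    the sphere uniformly in azimuth, elevation and radius.
    Returns None when the neighbor lies outside the sphere.
    """
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    if dist > radius:
        return None
    if dist == 0.0:
        return 0
    azimuth = math.atan2(dy, dx) % (2.0 * math.pi)               # [0, 2*pi]
    sin_el = max(-1.0, min(1.0, dz / dist))                      # guard rounding
    elevation = math.asin(sin_el) + math.pi / 2.0                # [0, pi]
    # Clamp so that values on the upper boundary fall into the last bin.
    a = min(int(azimuth / (2.0 * math.pi) * n_azimuth), n_azimuth - 1)
    e = min(int(elevation / math.pi * n_elevation), n_elevation - 1)
    r = min(int(dist / radius * n_radial), n_radial - 1)
    return 1 + (r * n_elevation + e) * n_azimuth + a

def farthest_point_sampling(points, n_samples, start=0):
    """Greedily pick indices, each one farthest from the points picked so far."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    chosen = [start]
    min_d = [sq_dist(p, points[start]) for p in points]
    while len(chosen) < min(n_samples, len(points)):
        nxt = max(range(len(points)), key=min_d.__getitem__)
        chosen.append(nxt)
        # Track each point's distance to its nearest chosen point.
        min_d = [min(d, sq_dist(p, points[nxt])) for d, p in zip(min_d, points)]
    return chosen
```

Because the bin index depends only on the offset between a neighbor and the center, the same weight is reused wherever the same local geometry occurs, and no per-edge filter has to be generated.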

In this repository, we release the code and trained models for classification and segmentation.

Citation

If you find our work useful in your research, please consider citing:

@article{lei2020spherical,  
  title={Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds},  
  author={Lei, Huan and Akhtar, Naveed and Mian, Ajmal},  
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},  
  year={2020}  
}
@inproceedings{lei2019octree,
  title={Octree guided CNN with Spherical Kernels for 3D Point Clouds},
  author={Lei, Huan and Akhtar, Naveed and Mian, Ajmal},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

License

Our code is released under the MIT License (see the LICENSE file for details).

Installation

Install TensorFlow. The code was tested with Python 3.5, TensorFlow 1.12.0, CUDA 9.0 and cuDNN 7.1.4 on Ubuntu 16.04, using an NVIDIA Titan Xp GPU.
Note: the new TensorFlow operators were implemented under the assumption that the GPU supports blocks of 1024 threads.

Compile the CUDA-based operations in the tf-ops folder using the command:

(sudo) ./compile.sh

Data Preparation

You will need Matlab to preprocess the datasets, for example for the grid-based downsampling.
We preprocess each segmentation dataset using the corresponding function under the folder preprocessing:

preprocessing/shapenet_removeSingularPoints.m
preprocessing/ruemonge2014_prepare_data.m
preprocessing/scannet_prepare_data.m
preprocessing/s3dis_prepare_data.m

Then convert the *.txt files to TFRecord format for fast data feeding in TensorFlow:

cd io
python make_tfrecord_modelnet.py 
python make_tfrecord_shapenet.py  
python make_tfrecord_ruemonge2014.py   
python make_tfrecord_scannet.py  
python make_tfrecord_s3dis.py    
python make_tfrecord_s3dis_no_split.py 
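For readers without Matlab who want to see what grid-based downsampling does, here is a minimal pure-Python sketch. It is an illustration under assumptions, not a port of the Matlab preprocessing scripts: the function name `grid_downsample`, the `cell_size` parameter and the choice of keeping the centroid of each cell are hypothetical, and the actual scripts may select points differently.

```python
from collections import defaultdict

def grid_downsample(points, cell_size):
    """Keep one point (the centroid) per occupied cubic cell of side cell_size.

    points: iterable of (x, y, z, ...) tuples; only x, y, z are used here.
    """
    cells = defaultdict(list)
    for p in points:
        # Floor division assigns each coordinate to its grid cell,
        # including negative coordinates.
        key = tuple(int(c // cell_size) for c in p[:3])
        cells[key].append(p[:3])
    return [tuple(sum(axis) / len(group) for axis in zip(*group))
            for group in cells.values()]
```

For example, `grid_downsample(points, 0.05)` would reduce a scan to at most one point per 5 cm cell, assuming the coordinates are in meters.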

Usage

All of the trained models and our results on ShapeNet and S3DIS can be downloaded from this link.

Merging

Training and testing are performed on split blocks. We merge the blocks back into complete scenes using the Matlab functions in the post-merging folder:

post-merging/scannet_merge.m
post-merging/s3dis_merge.m
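The idea behind merging can be sketched in a few lines of Python. This is a hypothetical illustration, not a translation of the Matlab scripts above: it assumes that each block stores the scene-level index of every point together with its per-class scores, and that scores of points covered by several blocks are summed before taking the best class. The function name `merge_blocks` and this data layout are assumptions.

```python
def merge_blocks(blocks, n_points):
    """Merge per-block class scores into one label per scene point.

    blocks: list of (indices, scores) pairs, where indices[i] is the position
    of the block's i-th point in the full scene and scores[i] is its list of
    per-class scores. Returns a list of n_points labels, with -1 for points
    that no block covers.
    """
    total = [None] * n_points
    for indices, scores in blocks:
        for idx, s in zip(indices, scores):
            if total[idx] is None:
                total[idx] = list(s)
            else:
                # Point seen in an earlier block: accumulate its scores.
                total[idx] = [a + b for a, b in zip(total[idx], s)]
    return [-1 if t is None else max(range(len(t)), key=t.__getitem__)
            for t in total]
```

Summing scores is only one possible way to resolve overlaps; keeping the prediction from the block whose center is closest to the point is another common choice.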