CleanerS
This repository contains the official PyTorch implementation of the following CVPR 2023 paper:
Title: CleanerS: Semantic Scene Completion with Cleaner Self
Author: Fengyun Wang, Dong Zhang, Hanwang Zhang, Jinhui Tang, Qianru Sun
Affiliation: NJUST, HKUST, NTU, SMU
Abstract
<p align="justify"> Semantic Scene Completion (SSC) transforms an image of single-view depth and/or RGB 2D pixels into 3D voxels, each of whose semantic labels are predicted. SSC is a well-known ill-posed problem as the prediction model has to "imagine" what is behind the visible surface, which is usually represented by Truncated Signed Distance Function (TSDF). Due to the sensory imperfection of the depth camera, most existing methods based on the noisy TSDF estimated from depth values suffer from 1) incomplete volumetric predictions and 2) confused semantic labels. To this end, we use the ground-truth 3D voxels to generate a perfect visible surface, called TSDF-CAD, and then train a "cleaner" SSC model. As the model is noise-free, it is expected to focus more on the "imagination" of unseen voxels. Then, we propose to distill the intermediate "cleaner" knowledge into another model with noisy TSDF input. In particular, we use the 3D occupancy feature and the semantic relations of the "cleaner self" to supervise the counterparts of the "noisy self" to respectively address the above two incorrect predictions. Experimental results validate that our method improves the noisy counterparts with 3.1% IoU and 2.2% mIoU for measuring scene completion and SSC, and also achieves new state-of-the-art accuracy on the popular NYU dataset.</p>
Overall architecture
<p align="justify"> CleanerS mainly consists of two networks: a teacher network and a student network. The two networks share the same architecture but have different weights. The distillation pipeline includes a feature-based cleaner surface distillation (i.e., KD-T) and logit-based cleaner semantic distillations (i.e., KD-SC and KD-SA).</p>
Pre-trained model
Segformer-B2 | Model Zoo | Visual Results |
---|---|---|
Teacher Model | Google Drive / Baidu Netdisk with code:3gew | Google Drive / Baidu Netdisk with code:p9nl |
Student Model | Google Drive / Baidu Netdisk with code:6eja | Google Drive / Baidu Netdisk with code:lktg |
Comparisons with SOTA
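The comparisons are reported as IoU for scene completion and mIoU for SSC. The sketch below shows how these two metrics are commonly computed on flattened voxel labels. It is an illustration only and not the repository's evaluation code: the function names are hypothetical, label `0` is assumed to mean "empty", and any masking of voxels excluded from evaluation is assumed to have been applied beforehand.

```python
import math


def completion_iou(pred, gt, empty=0):
    """Scene-completion IoU: every non-empty label counts as 'occupied'."""
    pairs = list(zip(pred, gt))
    inter = sum(1 for p, g in pairs if p != empty and g != empty)
    union = sum(1 for p, g in pairs if p != empty or g != empty)
    return inter / union if union else math.nan


def semantic_miou(pred, gt, num_classes, empty=0):
    """SSC mIoU: per-class IoU averaged over the non-empty classes that
    appear in either the prediction or the ground truth."""
    pairs = list(zip(pred, gt))
    ious = []
    for c in range(num_classes):
        if c == empty:
            continue
        inter = sum(1 for p, g in pairs if p == c and g == c)
        union = sum(1 for p, g in pairs if p == c or g == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else math.nan


# Toy example with six voxels and classes {0: empty, 1, 2}.
gt = [0, 1, 1, 2, 2, 0]
pred = [0, 1, 2, 2, 2, 1]
print(completion_iou(pred, gt))               # 0.8
print(round(semantic_miou(pred, gt, 3), 3))   # 0.5
```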
Usage
Requirements
- PyTorch 1.10.1
- cudatoolkit 11.1
- mmcv 1.5.0
- mmsegmentation 0.27.0
Suggested installation steps:
conda create -n CleanerS python=3.7 -y
conda activate CleanerS
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install mmcv-full==1.5.0 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10/index.html
pip install mmsegmentation==0.27.0
conda install scikit-learn
pip install pyyaml timm tqdm EasyConfig multimethod easydict termcolor shortuuid imageio
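After installation, it can help to confirm that the pinned versions are the ones actually importable in the environment. The following is a minimal sketch of such a check; the helper names are hypothetical, and it relies only on each package exposing a `__version__` attribute under its import name (`torch`, `torchvision`, `mmcv`, `mmseg`).

```python
import importlib

# Import name -> version pinned in the installation steps above.
PINNED = {
    "torch": "1.10.1",
    "torchvision": "0.11.2",
    "mmcv": "1.5.0",
    "mmseg": "0.27.0",
}


def installed_versions(modules=PINNED):
    """Return {import_name: version string, or None if the import fails}."""
    found = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", None)
        except Exception:  # any import failure is treated as "not installed"
            found[name] = None
    return found


def mismatches(found, pinned=PINNED):
    """Return {import_name: (found, wanted)} for every package that differs.

    A local build suffix such as '+cu111' is ignored in the comparison.
    """
    bad = {}
    for name, wanted in pinned.items():
        have = found.get(name)
        base = have.split("+")[0] if have else None
        if base != wanted:
            bad[name] = (have, wanted)
    return bad
```

Calling `mismatches(installed_versions())` inside the `CleanerS` environment should return an empty dictionary when all four packages match the pinned versions.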
Data preparation
We follow the 3D-Sketch project for dataset preparation.
After preparation, the `your_SSC_Dataset` folder should look like:
-- your_SSC_Dataset
| NYU
|-- TSDF
|-- Mapping
| |-- trainset
| |-- |-- RGB
| |-- |-- depth
| |-- |-- GT
| |-- testset
| |-- |-- RGB
| |-- |-- depth
| |-- |-- GT
| NYUCAD
|-- TSDF
| |-- trainset
| |-- |-- depth
| |-- testset
| |-- |-- depth
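A quick way to catch a misplaced folder before training is to check the layout programmatically. The sketch below is one possible check, not part of the repository. It assumes one reading of the tree above, in which `trainset` and `testset` sit directly under `NYU` and `NYUCAD` next to `TSDF`; if your copy nests them differently, adjust the hypothetical `EXPECTED_DIRS` list accordingly.

```python
from pathlib import Path

# Assumed reading of the directory tree; edit to match your own layout.
EXPECTED_DIRS = [
    "NYU/TSDF",
    "NYU/Mapping",
    "NYU/trainset/RGB",
    "NYU/trainset/depth",
    "NYU/trainset/GT",
    "NYU/testset/RGB",
    "NYU/testset/depth",
    "NYU/testset/GT",
    "NYUCAD/TSDF",
    "NYUCAD/trainset/depth",
    "NYUCAD/testset/depth",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]
```

For example, `missing_dirs("your_SSC_Dataset")` returns an empty list when every expected folder is present, and otherwise lists the relative paths that still need to be created or moved.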
Training
- on Segformer-B2
  - Download the pretrained Segformer-B2 weights, `mit_b2.pth`;
  - (optional) Download the teacher model and put it at `./teacher/Teacher_ckpt.pth`;
  - Run `run.sh` to train CleanerS (if you skip the optional teacher download, it will train both the teacher and student models).
- on ResNet50
  - Download the pretrained ResNet50.
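The effect of the optional teacher download can be summarised in a few lines of Python. This is only a sketch of the behaviour described in the steps above, not the logic of `run.sh`, whose contents are not shown here; the function name `models_to_train` is hypothetical.

```python
from pathlib import Path


def models_to_train(teacher_ckpt="./teacher/Teacher_ckpt.pth"):
    """Mirror the documented behaviour: with a teacher checkpoint in place
    only the student is trained, otherwise both models are trained."""
    if Path(teacher_ckpt).is_file():
        return ["student"]
    return ["teacher", "student"]
```

Running `models_to_train()` from the repository root before launching training gives a quick indication of whether the teacher checkpoint was found at the expected path.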
Testing with our weights
- Download our weights and put them in the `./checkpoint` folder.
- Run `python test_NYU.py --pretrained_path ./checkpoint/CleanerS_ckpt.pth`. The visualized results will be in the `./visual_pred/CleanerS` folder.
- (optional) Run `python test_NYU.py --pretrained_path ./checkpoint/Teacher_ckpt.pth` to get the results of the teacher model.
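To evaluate both checkpoints in one go, the two commands above can be wrapped in a small driver script. The sketch below is a convenience wrapper and an assumption of this example, not something the repository ships; it uses only the `--pretrained_path` flag shown above and the standard `subprocess` module.

```python
import subprocess
import sys


def build_test_command(checkpoint, script="test_NYU.py"):
    """Build the argument list for one evaluation run."""
    return [sys.executable, script, "--pretrained_path", checkpoint]


def run_tests(checkpoints):
    """Run the test script once per checkpoint, stopping on the first failure."""
    for checkpoint in checkpoints:
        subprocess.run(build_test_command(checkpoint), check=True)


# Example (run from the repository root):
# run_tests(["./checkpoint/CleanerS_ckpt.pth", "./checkpoint/Teacher_ckpt.pth"])
```

Using `sys.executable` keeps the evaluation inside the currently active Python environment, which matters when several conda environments are installed side by side.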
Citation
If this work is helpful for your research, please consider citing:
@inproceedings{wang2023semantic,
title={Semantic scene completion with cleaner self},
author={Wang, Fengyun and Zhang, Dong and Zhang, Hanwang and Tang, Jinhui and Sun, Qianru},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={867--877},
year={2023}
}
TODO list
- switchable 2DNet for both Segformer-B2 and ResNet50
Acknowledgement
This code is based on 3D-Sketch.