Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding
Updates
To be Done
- Release the whole training scripts with CAGroup3D+Swin3D
- Upload the models and configs for FCAF3D+Swin3D
- Upload the models and configs for CAGroup3D+Swin3D
26/03/2024
Add Object Detection code:
- Update Object Detection code and configs with FCAF3D+Swin3D
- Update patch for CAGroup3D+Swin3D
27/04/2023
Initial commits:
- The supported code and models for Semantic Segmentation on ScanNet and S3DIS are provided.
Introduction
This repo contains the experiment code for Swin3D.
Overview
Environment
- Install dependencies:
  pip install -r requirements.txt
- Compile the Swin3D operators from the Swin3D repo:
  git clone https://github.com/microsoft/Swin3D
  cd Swin3D
  python setup.py install
If you have problems installing the package, you can use the Docker image we provide:
docker pull yukichiii/torch112_cu113:swin3d
To run the object detection code, please refer to FCAF3D (which is based on mmdetection3d) and CAGroup3D (which is based on OpenPCDet). Install the requirements for mmdetection3d, then run python setup.py install to install mmdetection3d.
Data Preparation
ScanNet Segmentation Data
Please refer to https://github.com/dvlab-research/PointGroup for the ScanNetv2 preprocessing. Then change the data_root entry in the yaml files in SemanticSeg/config/scannetv2.
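For example, if your preprocessed scenes live under /data/scannetv2 (an illustrative path, not a repo default), the corresponding entry in each config would look like:

```yaml
# SemanticSeg/config/scannetv2/swin3D_RGBN_S.yaml (excerpt; path is illustrative)
data_root: /data/scannetv2
```

The same data_root convention applies to the S3DIS segmentation configs and the detection configs below.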
S3DIS Segmentation Data
Please refer to https://github.com/yanx27/Pointnet_Pointnet2_pytorch for S3DIS preprocessing. Then modify the data_root entry in the yaml files in SemanticSeg/config/s3dis.
ScanNet 3D Detection Data
Please refer to https://github.com/SamsungLabs/fcaf3d for ScanNet preprocessing. Then modify the data_root entry in the config files in ObjectDet/FCAF3D/configs/scannet_det.
S3DIS 3D Detection Data
Please refer to https://github.com/SamsungLabs/fcaf3d for S3DIS preprocessing. Then modify the data_root entry in the config files in ObjectDet/FCAF3D/configs/s3dis_det.
Training
ScanNet Segmentation
Change the working directory to SemanticSeg:
cd SemanticSeg
To train a model on the ScanNet segmentation task with Swin3D-S or Swin3D-L from scratch:
python train.py --config config/scannetv2/swin3D_RGBN_S.yaml
or
python train.py --config config/scannetv2/swin3D_RGBN_L.yaml
To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB,Norm) here, and run:
python train.py --config config/scannetv2/swin3D_RGBN_S.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGBN_S
or
python train.py --config config/scannetv2/swin3D_RGBN_L.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGBN_L
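Finetuning restores only the pretrained tensors whose names and shapes match the downstream model. A minimal sketch of such key matching (illustrative only, not the repo's actual loading code; `filter_matching_weights` is a hypothetical helper):

```python
import numpy as np

def filter_matching_weights(pretrained, model_state):
    """Return the subset of pretrained tensors whose key and shape both
    match the target model, plus the sorted list of skipped keys."""
    matched = {
        k: v for k, v in pretrained.items()
        if k in model_state and v.shape == model_state[k].shape
    }
    skipped = sorted(set(pretrained) - set(matched))
    return matched, skipped
```

Backbone weights carry over from pretraining, while task-specific heads (e.g. a segmentation classifier with a different class count) are skipped and trained from scratch.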
S3DIS Segmentation
Change the working directory to SemanticSeg:
cd SemanticSeg
To train a model on S3DIS Area 5 segmentation with Swin3D-S or Swin3D-L from scratch:
python train.py --config config/s3dis/swin3D_RGB_S.yaml
or
python train.py --config config/s3dis/swin3D_RGB_L.yaml
To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB) here, and run:
python train.py --config config/s3dis/swin3D_RGB_S.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGB_S
or
python train.py --config config/s3dis/swin3D_RGB_L.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGB_L
3D Object Detection
To train from scratch with FCAF3D+Swin3D:
python -m tools.train configs/scannet_det/Swin3D_S.py
To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB), and run:
python -m tools.train configs/scannet_det/Swin3D_S.py --load_weights PATH_TO_PRETRAINED_SWIN3D_RGB_S
python -m tools.train configs/scannet_det/Swin3D_L.py --load_weights PATH_TO_PRETRAINED_SWIN3D_RGB_L
Evaluation
To evaluate Swin3D from a given checkpoint with TTA (test-time augmentation: we randomly rotate the input scan and vote over the predictions), you can download a model below and run:
ScanNet Segmentation
python test.py --config config/scannetv2/swin3D_RGBN_S.yaml --vote_num 12 args.weight PATH_TO_CKPT
or
python test.py --config config/scannetv2/swin3D_RGBN_L.yaml --vote_num 12 args.weight PATH_TO_CKPT
S3DIS Area5 Segmentation
python test.py --config config/s3dis/swin3D_RGB_S.yaml --vote_num 12 args.weight PATH_TO_CKPT
or
python test.py --config config/s3dis/swin3D_RGB_L.yaml --vote_num 12 args.weight PATH_TO_CKPT
For faster inference, you can change vote_num to 1.
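The rotation-voting scheme behind TTA can be sketched as follows (a minimal NumPy illustration; the model callable and rotation sampling are placeholders, not the repo's actual implementation):

```python
import numpy as np

def rotate_z(points, angle):
    """Rotate an (N, 3) point cloud around the z-axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T

def tta_vote(points, model, vote_num=12):
    """Average per-point class logits over `vote_num` random z-rotations,
    then take the argmax as the voted prediction."""
    rng = np.random.default_rng(0)
    logits_sum = None
    for _ in range(vote_num):
        rotated = rotate_z(points, rng.uniform(0.0, 2.0 * np.pi))
        logits = model(rotated)  # expected shape: (N, num_classes)
        logits_sum = logits if logits_sum is None else logits_sum + logits
    return (logits_sum / vote_num).argmax(axis=1)  # (N,) predicted labels
```

With vote_num=1 this degenerates to a single rotated forward pass, which is why reducing it speeds up evaluation at a small cost in accuracy.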
3D Object Detection
For Detection task with FCAF3D+Swin3D:
python -m tools.test configs/scannet_det/Swin3D_S.py CHECKPOINT_PATH --eval mAP --show-dir OUTPUT_PATH --out OUTPUT_PATH/result.pkl
Results and models
ScanNet Segmentation
| Method | Pretrained | mIoU (Val) | mIoU (Test) | Model | Train Log | Eval Log |
|---|---|---|---|---|---|---|
| Swin3D-S | ✗ | 75.2 | - | model | log | log |
| Swin3D-S | ✓ | 75.6 (76.8) | - | model | log | log |
| Swin3D-L | ✓ | 76.4 (77.5) | 77.9 | model | log | log |
S3DIS Segmentation
| Method | Pretrained | Area 5 mIoU | 6-fold mIoU | Model | Train Log | Eval Log |
|---|---|---|---|---|---|---|
| Swin3D-S | ✗ | 72.5 | 76.9 | model | log | log |
| Swin3D-S | ✓ | 73.0 | 78.2 | model | log | log |
| Swin3D-L | ✓ | 74.5 | 79.8 | model | log | log |
ScanNet 3D Detection
| Method | Pretrained | mAP@0.25 | mAP@0.50 | Model | Log |
|---|---|---|---|---|---|
| Swin3D-S+FCAF3D | ✓ | 74.2 | 59.5 | model | log |
| Swin3D-L+FCAF3D | ✓ | 74.2 | 58.6 | model | log |
| Swin3D-S+CAGroup3D | ✓ | 76.4 | 62.7 | model | log |
| Swin3D-L+CAGroup3D | ✓ | 76.4 | 63.2 | model | log |
S3DIS 3D Detection
| Method | Pretrained | mAP@0.25 | mAP@0.50 | Model | Log |
|---|---|---|---|---|---|
| Swin3D-S+FCAF3D | ✓ | 69.9 | 50.2 | model | log |
| Swin3D-L+FCAF3D | ✓ | 72.1 | 54.0 | model | log |
Citation
If you find Swin3D useful in your research, please cite our work:
@misc{yang2023swin3d,
title={Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding},
author={Yu-Qi Yang and Yu-Xiao Guo and Jian-Yu Xiong and Yang Liu and Hao Pan and Peng-Shuai Wang and Xin Tong and Baining Guo},
year={2023},
eprint={2304.06906},
archivePrefix={arXiv},
primaryClass={cs.CV}
}