# Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models
Created by Ziyi Wang*, Xumin Yu*, Yongming Rao, Jie Zhou, Jiwen Lu.
This repository is a PyTorch implementation of our ICCV 2023 paper TAP (short for Take-A-Photo).
TAP is a generative pre-training method for arbitrary point cloud models. Given point cloud features extracted from a backbone model, we generate view images from different instructed poses and compute a pixel-wise loss on the image pixels as the pre-training objective. Our pre-training method outperforms other generative pre-training methods based on masked modeling on ScanObjectNN classification and ShapeNetPart segmentation.
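The sketch below illustrates this objective in PyTorch. `backbone` and `photo_generator` are illustrative placeholders rather than this repository's actual module names, and MSE stands in for the pixel-wise loss:

```
import torch
import torch.nn.functional as F

# Minimal sketch of the TAP pre-training objective (names are illustrative,
# not the repository's actual API): point cloud features from a backbone are
# decoded into a view image for an instructed camera pose, and a pixel-wise
# loss against the ground-truth view image drives pre-training.
def pretrain_step(backbone, photo_generator, points, pose, gt_view):
    # points:  (B, N, 3) input point cloud
    # pose:    (B, P) camera pose condition for the instructed view
    # gt_view: (B, 3, H, W) ground-truth view image for that pose
    point_feats = backbone(points)                  # (B, N, C) per-point features
    pred_view = photo_generator(point_feats, pose)  # (B, 3, H, W) predicted image
    loss = F.mse_loss(pred_view, gt_view)           # pixel-wise loss (MSE here for concreteness)
    return loss
```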
## Preparation

### Installation Prerequisites
- Python 3.7
- CUDA 11.3
- PyTorch 1.10.1
- torch_scatter
- open3d, einops, cv2
- timm 0.5.4
```
conda create -n tap python=3.7 numpy=1.20 numba
conda activate tap
conda install -y pytorch=1.10.1 torchvision cudatoolkit=11.3 -c pytorch -c nvidia
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.1+cu113.html
pip install -r requirements_openpoints.txt
pip install open3d einops opencv-python
pip install timm==0.5.4
cd openpoints/cpp/pointnet2_batch
python setup.py install
cd ../pointops
python setup.py install
cd ../subsampling
python setup.py install
cd ../../..
```
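After the compiled extensions are installed, a quick sanity check (a minimal sketch, assuming the version pins above) can confirm that the environment imports cleanly:

```
# Optional sanity check after installation: confirm the expected versions
# and that the key dependencies import without errors.
import torch
import torch_scatter
import timm

print(torch.__version__)          # expect 1.10.1
print(torch.version.cuda)         # expect 11.3
print(torch.cuda.is_available())  # expect True on a GPU machine
print(timm.__version__)           # expect 0.5.4
```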
### Data Preparation

#### ShapeNet Dataset
- Download the 12-view image dataset of ShapeNet from here. The images are rendered by MVCNN.
- Download the point cloud dataset corresponding to the above ShapeNet image dataset from here (Google) or here (Tsinghua Cloud). The point clouds are sampled by us.
- (Optional) Alternatively, you can download the 12-view image dataset together with the object mesh files of ShapeNet from here, then sample point clouds yourself from the `.off` files via `openpoints/dataset/sample_pc.py` (a sketch of this step is shown below).
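If you take the optional route, the repository's `openpoints/dataset/sample_pc.py` handles the sampling; the snippet below is only an illustrative stand-in using open3d, with placeholder file names and point count:

```
import open3d as o3d

# Illustrative stand-in for openpoints/dataset/sample_pc.py: uniformly sample
# a point cloud from a mesh file and save it as .ply. The file names and the
# point count are placeholders, not values taken from the repository.
mesh = o3d.io.read_triangle_mesh("model_000003.off")
pcd = mesh.sample_points_uniformly(number_of_points=8192)
o3d.io.write_point_cloud("model_000003.ply", pcd)
```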
#### ScanObjectNN Dataset

- Download the official ScanObjectNN dataset from here.

#### ShapeNetPart Dataset

- Download the official ShapeNetPart dataset from here.

### Data File Structure
```
TAP/
|-- data/
    |-- ShapeNet55/
        |-- pointclouds/
            |-- train/
                |-- model_000003.ply
                |-- ...
            |-- val/
                |-- model_000009.ply
                |-- ...
            |-- test/
                |-- model_000001.ply
                |-- ...
        |-- shapenet55v1/
            |-- train/
                |-- model_000003_001.jpg
                |-- ...
            |-- val/
                |-- model_000009_001.jpg
                |-- ...
            |-- test/
                |-- model_000001_001.jpg
                |-- ...
    |-- ScanObjectNN/
        |-- main_split/
            |-- training_objectdataset.h5
            |-- test_objectdataset.h5
            |-- training_objectdataset_augmentedrot_scale75.h5
            |-- test_objectdataset_augmentedrot_scale75.h5
        |-- main_split_nobg/
            |-- training_objectdataset.h5
            |-- test_objectdataset.h5
    |-- ShapeNetPart/
        |-- shapenetcore_partanno_segmentation_benchmark_v0_normal/
            |-- 02691156/
                |-- 1a04e3eab45ca15dd86060f189eb133.txt
                |-- ...
            |-- ...
            |-- train_test_split/
            |-- synsetoffset2category.txt
```
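Before launching any run, you can optionally verify that the datasets sit where the tree above expects them. This is a minimal sketch rather than part of the repository; the paths mirror the layout shown, so adjust `data_root` if yours differs:

```
from pathlib import Path

# Pre-flight check that the expected data layout is in place.
data_root = Path("data")
for rel in [
    "ShapeNet55/pointclouds/train",
    "ShapeNet55/shapenet55v1/train",
    "ScanObjectNN/main_split/training_objectdataset.h5",
    "ShapeNetPart/shapenetcore_partanno_segmentation_benchmark_v0_normal",
]:
    path = data_root / rel
    print(("OK      " if path.exists() else "MISSING ") + str(path))
```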
## Usage

### Pre-train on ShapeNet Dataset

```
python examples/classification/main.py --cfg cfgs/shapenet/BASEMODEL_pretrain.yaml
```

For example, to pre-train the PointMLP model, replace `BASEMODEL` with `pointmlp`:

```
python examples/classification/main.py --cfg cfgs/shapenet/pointmlp_pretrain.yaml
```
### Finetune on Downstream Tasks

First, modify `pretrained_path` in the finetune configs (one way to do this is sketched after the commands below). Then run the following command:

```
python examples/classification/main.py --cfg cfgs/DATASET/BASEMODEL_finetune.yaml
```

For example, to finetune the PointMLP model on the ScanObjectNN Hardest dataset, replace `DATASET` with `scanobjectnn` and `BASEMODEL` with `pointmlp`:

```
python examples/classification/main.py --cfg cfgs/scanobjectnn/pointmlp_finetune.yaml
```
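As one way to set `pretrained_path`, the sketch below rewrites a finetune config with PyYAML. It assumes `pretrained_path` is a top-level key in the YAML file, and the checkpoint path is a placeholder:

```
import yaml

# Point a finetune config at your pre-trained checkpoint. The config path
# matches the example above; the checkpoint path is a placeholder, and we
# assume pretrained_path sits at the top level of the YAML file.
cfg_path = "cfgs/scanobjectnn/pointmlp_finetune.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)
cfg["pretrained_path"] = "checkpoints/pointmlp_tap_pretrained.pth"
with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```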
## Results

### Quantitative Results

#### Classification on ScanObjectNN
| Point Model | TAP Pre-trained | OBJ_BG | OBJ_ONLY | PB_T50_RS |
| --- | --- | --- | --- | --- |
| DGCNN | [Google / Tsinghua Cloud] | -- | -- | 86.6 [Google / Tsinghua Cloud] |
| PointNet++ | [Google / Tsinghua Cloud] | -- | -- | 86.8 [Google / Tsinghua Cloud] |
| PointMLP | [Google / Tsinghua Cloud] | -- | -- | 88.5 [Google / Tsinghua Cloud] |
| Transformer | [Google / Tsinghua Cloud] | 90.4 [Google / Tsinghua Cloud] | 89.5 [Google / Tsinghua Cloud] | 85.7 [Google / Tsinghua Cloud] |
#### Part Segmentation on ShapeNetPart

| Point Model | TAP Pre-trained | mIoU_C / mIoU_I |
| --- | --- | --- |
| PointMLP | [Google / Tsinghua Cloud] | 85.2 / 86.9 [Google / Tsinghua Cloud] |
## Citation

If you find our work useful in your research, please consider citing:
```
@article{wang2023tap,
  title={Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models},
  author={Wang, Ziyi and Yu, Xumin and Rao, Yongming and Zhou, Jie and Lu, Jiwen},
  journal={arXiv preprint arXiv:2307.14971},
  year={2023}
}
```
## Acknowledgements
Our code is inspired by PointNeXt.