# Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models
Created by Ziyi Wang*, Xumin Yu*, Yongming Rao, Jie Zhou, Jiwen Lu.
This repository is a PyTorch implementation of our ICCV 2023 paper TAP (short for Take-A-Photo).
TAP is a generative pre-training method for arbitrary point cloud models. Given point cloud features extracted from a backbone model, we generate view images from different instructed poses and compute a pixel-wise loss on the image pixels as the pre-training objective. Our pre-training method outperforms other generative pre-training methods based on masked modeling on ScanObjectNN classification and ShapeNetPart segmentation.
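The sketch below illustrates this objective in PyTorch. `backbone` and `photo_generator` are illustrative placeholders rather than this repository's actual module names, and MSE stands in for the pixel-wise loss:

```
import torch
import torch.nn.functional as F

# Minimal sketch of the TAP pre-training objective (names are illustrative,
# not the repository's actual API): point cloud features from a backbone are
# decoded into a view image for an instructed camera pose, and a pixel-wise
# loss against the ground-truth view image drives pre-training.
def pretrain_step(backbone, photo_generator, points, pose, gt_view):
    # points:  (B, N, 3) input point cloud
    # pose:    (B, P) camera pose condition for the instructed view
    # gt_view: (B, 3, H, W) ground-truth view image for that pose
    point_feats = backbone(points)                  # (B, N, C) per-point features
    pred_view = photo_generator(point_feats, pose)  # (B, 3, H, W) predicted image
    loss = F.mse_loss(pred_view, gt_view)           # pixel-wise loss (MSE here for concreteness)
    return loss
```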
## Preparation

### Installation Prerequisites
- Python 3.7
- CUDA 11.3
- PyTorch 1.10.1
- torch_scatter
- open3d, einops, cv2
- timm 0.5.4
```
conda create -n tap python=3.7 numpy=1.20 numba
conda activate tap
conda install -y pytorch=1.10.1 torchvision cudatoolkit=11.3 -c pytorch -c nvidia
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.1+cu113.html
pip install -r requirements_openpoints.txt
pip install open3d einops opencv-python
pip install timm==0.5.4
cd openpoints/cpp/pointnet2_batch
python setup.py install
cd ../pointops
python setup.py install
cd ../subsampling
python setup.py install
cd ../../..
```
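After the compiled extensions are installed, a quick sanity check (a minimal sketch, assuming the version pins above) can confirm that the environment imports cleanly:

```
# Optional sanity check after installation: confirm the expected versions
# and that the key dependencies import without errors.
import torch
import torch_scatter
import timm

print(torch.__version__)          # expect 1.10.1
print(torch.version.cuda)         # expect 11.3
print(torch.cuda.is_available())  # expect True on a GPU machine
print(timm.__version__)           # expect 0.5.4
```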
### Data Preparation

#### ShapeNet Dataset
- Download the 12-view image dataset of ShapeNet from here. The images are rendered by MVCNN.
- Download the point cloud dataset corresponding to the above ShapeNet image dataset from here (Google) or here (Tsinghua Cloud). The point clouds are sampled by us.
- (Optional) Alternatively, you can download the 12-view image dataset together with the object mesh files of ShapeNet from here, then sample point clouds yourself from the `.off` files via `openpoints/dataset/sample_pc.py` (a sketch of this step is shown below).
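If you take the optional route, the repository's `openpoints/dataset/sample_pc.py` handles the sampling; the snippet below is only an illustrative stand-in using open3d, with placeholder file names and point count:

```
import open3d as o3d

# Illustrative stand-in for openpoints/dataset/sample_pc.py: uniformly sample
# a point cloud from a mesh file and save it as .ply. The file names and the
# point count are placeholders, not values taken from the repository.
mesh = o3d.io.read_triangle_mesh("model_000003.off")
pcd = mesh.sample_points_uniformly(number_of_points=8192)
o3d.io.write_point_cloud("model_000003.ply", pcd)
```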
#### ScanObjectNN Dataset

- Download the official ScanObjectNN dataset from here.

#### ShapeNetPart Dataset

- Download the official ShapeNetPart dataset from here.

### Data File Structure
```
TAP/
|-- data/
    |-- ShapeNet55/
        |-- pointclouds/
            |-- train/
                |-- model_000003.ply
                |-- ...
            |-- val/
                |-- model_000009.ply
                |-- ...
            |-- test/
                |-- model_000001.ply
                |-- ...
        |-- shapenet55v1/
            |-- train/
                |-- model_000003_001.jpg
                |-- ...
            |-- val/
                |-- model_000009_001.jpg
                |-- ...
            |-- test/
                |-- model_000001_001.jpg
                |-- ...
    |-- ScanObjectNN/
        |-- main_split/
            |-- training_objectdataset.h5
            |-- test_objectdataset.h5
            |-- training_objectdataset_augmentedrot_scale75.h5
            |-- test_objectdataset_augmentedrot_scale75.h5
        |-- main_split_nobg/
            |-- training_objectdataset.h5
            |-- test_objectdataset.h5
    |-- ShapeNetPart/
        |-- shapenetcore_partanno_segmentation_benchmark_v0_normal/
            |-- 02691156/
                |-- 1a04e3eab45ca15dd86060f189eb133.txt
                |-- ...
            |-- ...
            |-- train_test_split/
            |-- synsetoffset2category.txt
```
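Before launching any run, you can optionally verify that the datasets sit where the tree above expects them. This is a minimal sketch rather than part of the repository; the paths mirror the layout shown, so adjust `data_root` if yours differs:

```
from pathlib import Path

# Pre-flight check that the expected data layout is in place.
data_root = Path("data")
for rel in [
    "ShapeNet55/pointclouds/train",
    "ShapeNet55/shapenet55v1/train",
    "ScanObjectNN/main_split/training_objectdataset.h5",
    "ShapeNetPart/shapenetcore_partanno_segmentation_benchmark_v0_normal",
]:
    path = data_root / rel
    print(("OK      " if path.exists() else "MISSING ") + str(path))
```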
## Usage

### Pre-train on ShapeNet Dataset

```
python examples/classification/main.py --cfg cfgs/shapenet/BASEMODEL_pretrain.yaml
```

For example, to pre-train the PointMLP model, replace `BASEMODEL` with `pointmlp`:

```
python examples/classification/main.py --cfg cfgs/shapenet/pointmlp_pretrain.yaml
```
### Finetune on Downstream Tasks

First, modify `pretrained_path` in the finetune configs (one way to do this is sketched after the commands below). Then run the following command:

```
python examples/classification/main.py --cfg cfgs/DATASET/BASEMODEL_finetune.yaml
```

For example, to finetune the PointMLP model on the ScanObjectNN Hardest dataset, replace `DATASET` with `scanobjectnn` and `BASEMODEL` with `pointmlp`:

```
python examples/classification/main.py --cfg cfgs/scanobjectnn/pointmlp_finetune.yaml
```
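As one way to set `pretrained_path`, the sketch below rewrites a finetune config with PyYAML. It assumes `pretrained_path` is a top-level key in the YAML file, and the checkpoint path is a placeholder:

```
import yaml

# Point a finetune config at your pre-trained checkpoint. The config path
# matches the example above; the checkpoint path is a placeholder, and we
# assume pretrained_path sits at the top level of the YAML file.
cfg_path = "cfgs/scanobjectnn/pointmlp_finetune.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)
cfg["pretrained_path"] = "checkpoints/pointmlp_tap_pretrained.pth"
with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```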
## Results

### Quantitative Results

#### Classification on ScanObjectNN
| Point Model | TAP Pre-trained | OBJ_BG | OBJ_ONLY | PB_T50_RS |
| --- | --- | --- | --- | --- |
| DGCNN | [Google / Tsinghua Cloud] | -- | -- | 86.6 [Google / Tsinghua Cloud] |
| PointNet++ | [Google / Tsinghua Cloud] | -- | -- | 86.8 [Google / Tsinghua Cloud] |
| PointMLP | [Google / Tsinghua Cloud] | -- | -- | 88.5 [Google / Tsinghua Cloud] |
| Transformer | [Google / Tsinghua Cloud] | 90.4 [Google / Tsinghua Cloud] | 89.5 [Google / Tsinghua Cloud] | 85.7 [Google / Tsinghua Cloud] |
#### Part Segmentation on ShapeNetPart

| Point Model | TAP Pre-trained | mIoU_C / mIoU_I |
| --- | --- | --- |
| PointMLP | [Google / Tsinghua Cloud] | 85.2 / 86.9 [Google / Tsinghua Cloud] |
## Citation

If you find our work useful in your research, please consider citing:
```
@article{wang2023tap,
  title={Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models},
  author={Wang, Ziyi and Yu, Xumin and Rao, Yongming and Zhou, Jie and Lu, Jiwen},
  journal={arXiv preprint arXiv:2307.14971},
  year={2023}
}
```
## Acknowledgements
Our code is inspired by PointNeXt.