Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models

Created by Ziyi Wang*, Xumin Yu*, Yongming Rao, Jie Zhou, Jiwen Lu.

This repository is a PyTorch implementation of our ICCV 2023 paper TAP (short for Take-A-Photo).

TAP is a generative pre-training method for any point cloud model. Given point cloud features extracted from a backbone model, we generate view images from different instructed camera poses and compute a pixel-wise loss on the generated images as the pre-training objective. Our pre-training method outperforms other generative pre-training methods based on masked modeling on ScanObjectNN classification and ShapeNetPart segmentation.
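To make the pre-training objective concrete, below is a minimal PyTorch sketch of the idea. It is not the repository's API: `ToyPointBackbone` and `PoseConditionedDecoder` are hypothetical stand-ins, and the actual backbones, view-image generator, and losses are defined by the configs under `cfgs/` and the code in this repository.

```python
# Minimal sketch of the TAP pre-training idea, NOT the repository's actual API.
# ToyPointBackbone and PoseConditionedDecoder are hypothetical stand-ins.
import torch
import torch.nn as nn

class ToyPointBackbone(nn.Module):
    """Hypothetical point cloud backbone: per-point MLP + max pooling."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, feat_dim))

    def forward(self, points):                       # points: (B, N, 3)
        return self.mlp(points).max(dim=1).values    # global feature: (B, feat_dim)

class PoseConditionedDecoder(nn.Module):
    """Hypothetical decoder: fuses point features with a camera pose and predicts a view image."""
    def __init__(self, feat_dim=256, pose_dim=7, img_size=32):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(feat_dim + pose_dim, 512), nn.ReLU(),
            nn.Linear(512, img_size * img_size * 3))

    def forward(self, feat, pose):                   # pose: (B, pose_dim) instructed view
        x = self.net(torch.cat([feat, pose], dim=-1))
        return x.view(-1, 3, self.img_size, self.img_size)

# One pre-training step: predict the view image for an instructed pose and
# supervise it with a pixel-wise loss against the ground-truth rendering.
backbone, decoder = ToyPointBackbone(), PoseConditionedDecoder()
points = torch.rand(4, 1024, 3)                      # dummy point clouds
pose = torch.rand(4, 7)                              # dummy camera poses
gt_images = torch.rand(4, 3, 32, 32)                 # dummy rendered views

pred = decoder(backbone(points), pose)
loss = nn.functional.mse_loss(pred, gt_images)       # pixel-wise loss
loss.backward()
```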

[arXiv][Project Page]

Preparation

Installation Prerequisites

```
conda create -n tap python=3.7 numpy=1.20 numba
conda activate tap
conda install -y pytorch=1.10.1 torchvision cudatoolkit=11.3 -c pytorch -c nvidia
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.1+cu113.html
pip install -r requirements_openpoints.txt
pip install open3d einops opencv-python
pip install timm==0.5.4
```

```
cd openpoints/cpp/pointnet2_batch
python setup.py install
cd ../pointops
python setup.py install
cd ../subsampling
python setup.py install
cd ../../..
```
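After installation, a quick import check like the sketch below (not part of the repository) can confirm that the pip dependencies and the CUDA build of PyTorch are visible; the compiled point operators are exercised the first time you launch training.

```python
# Optional post-install sanity check: verify the pip dependencies import and
# that PyTorch sees CUDA (versions follow the commands above).
import torch
import torch_scatter, open3d, timm, cv2, einops

print("torch:", torch.__version__, "built for CUDA:", torch.version.cuda,
      "CUDA available:", torch.cuda.is_available())
print("timm:", timm.__version__, "open3d:", open3d.__version__)
```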

Data Preparation

ShapeNet Dataset

ScanObjectNN Dataset

ShapeNetPart Dataset

Data File Structure

```
TAP/
|-- data/
    |-- ShapeNet55/
        |-- pointclouds/
            |-- train/
                |-- model_000003.ply
                |-- ...
            |-- val/
                |-- model_000009.ply
                |-- ...
            |-- test/
                |-- model_000001.ply
                |-- ...
        |-- shapenet55v1/
            |-- train/
                |-- model_000003_001.jpg
                |-- ...
            |-- val/
                |-- model_000009_001.jpg
                |-- ...
            |-- test/
                |-- model_000001_001.jpg
                |-- ...
    |-- ScanObjectNN/
        |-- main_split/
            |-- training_objectdataset.h5
            |-- test_objectdataset.h5
            |-- training_objectdataset_augmentedrot_scale75.h5
            |-- test_objectdataset_augmentedrot_scale75.h5
        |-- main_split_nobg/
            |-- training_objectdataset.h5
            |-- test_objectdataset.h5
    |-- ShapeNetPart/
        |-- shapenetcore_partanno_segmentation_benchmark_v0_normal/
            |-- 02691156/
                |-- 1a04e3eab45ca15dd86060f189eb133.txt
                |-- ...
            |-- ...
            |-- train_test_split/
            |-- synsetoffset2category.txt

```
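Once the files are in place, a short check like the following sketch can verify that the data loads as expected. It assumes h5py and numpy are installed (the requirements above typically provide them); the `data`/`label` keys and the 7-column (xyz, normal, part label) txt rows are the standard layouts for these benchmarks.

```python
# Quick sanity check on the prepared data. Paths follow the tree above.
import h5py
import numpy as np

# ScanObjectNN hardest variant (PB_T50_RS).
with h5py.File("data/ScanObjectNN/main_split/"
               "training_objectdataset_augmentedrot_scale75.h5", "r") as f:
    print("ScanObjectNN:", f["data"].shape, f["label"].shape)

# One ShapeNetPart sample: x, y, z, nx, ny, nz, part label per row.
sample = np.loadtxt("data/ShapeNetPart/"
                    "shapenetcore_partanno_segmentation_benchmark_v0_normal/"
                    "02691156/1a04e3eab45ca15dd86060f189eb133.txt")
print("ShapeNetPart sample:", sample.shape)
```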

Usage

Pre-train on ShapeNet Dataset

```
python examples/classification/main.py --cfg cfgs/shapenet/BASEMODEL_pretrain.yaml
```

For example, to pre-train the PointMLP model, replace BASEMODEL with pointmlp:

```
python examples/classification/main.py --cfg cfgs/shapenet/pointmlp_pretrain.yaml
```
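The BASEMODEL values you can substitute correspond to the pre-training configs shipped under cfgs/shapenet/. A quick way to list them (a sketch, run from the repository root):

```python
# List the available pre-training configs, i.e. the BASEMODEL values that can
# be substituted in the command above.
from glob import glob

for cfg in sorted(glob("cfgs/shapenet/*_pretrain.yaml")):
    print(cfg)
```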

Finetune on Downstream Tasks

First modify the pretrained_path in the finetune configs (a scripted way to set it is sketched after the example below). Then run the following command:

```
python examples/classification/main.py --cfg cfgs/DATASET/BASEMODEL_finetune.yaml
```

For example, to finetune the PointMLP model on the ScanObjectNN Hardest dataset, replace DATASET with scanobjectnn and BASEMODEL with pointmlp:

```
python examples/classification/main.py --cfg cfgs/scanobjectnn/pointmlp_finetune.yaml
```
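If you prefer not to edit the YAML by hand, the pretrained_path mentioned above can also be set with a small script such as this sketch. It assumes PyYAML is available; the checkpoint path is a placeholder, and the key may be nested differently in some configs, so inspect the file first.

```python
# Sketch: point a finetune config at a pre-trained checkpoint programmatically.
# Note: round-tripping with safe_load/safe_dump drops YAML comments.
import yaml

cfg_path = "cfgs/scanobjectnn/pointmlp_finetune.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["pretrained_path"] = "pretrained/pointmlp_tap_pretrained.pth"  # hypothetical path

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)
```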

Results

Quantitative Results

Classification on ScanObjectNN

| Point Model | TAP Pre-trained | OBJ_BG | OBJ_ONLY | PB_T50_RS |
| :--- | :--- | :--- | :--- | :--- |
| DGCNN | [Google / Tsinghua Cloud] | -- | -- | 86.6 [Google / Tsinghua Cloud] |
| PointNet++ | [Google / Tsinghua Cloud] | -- | -- | 86.8 [Google / Tsinghua Cloud] |
| PointMLP | [Google / Tsinghua Cloud] | -- | -- | 88.5 [Google / Tsinghua Cloud] |
| Transformer | [Google / Tsinghua Cloud] | 90.4 [Google / Tsinghua Cloud] | 89.5 [Google / Tsinghua Cloud] | 85.7 [Google / Tsinghua Cloud] |

Part Segmentation on ShapeNetPart

| Point Model | TAP Pre-trained | mIoU_C / mIoU_I |
| :--- | :--- | :--- |
| PointMLP | [Google / Tsinghua Cloud] | 85.2 / 86.9 [Google / Tsinghua Cloud] |

Citation

If you find our work useful in your research, please consider citing:

```
@article{wang2023tap,
  title={Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models},
  author={Wang, Ziyi and Yu, Xumin and Rao, Yongming and Zhou, Jie and Lu, Jiwen},
  journal={arXiv preprint arXiv:2307.14971},
  year={2023}
}
```

Acknowledgements

Our code is inspired by PointNeXt.