P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting

Created by Ziyi Wang*, Xumin Yu*, Yongming Rao*, Jie Zhou, Jiwen Lu

This repository is an official implementation of P2P (NeurIPS 2022 Spotlight).

P2P is a framework that leverages pre-trained image models for 3D analysis. We transform point clouds into colorful images using the newly proposed geometry-preserved projection and geometry-aware coloring, so that they can be fed to pre-trained image models. The weights of the image models are kept frozen during the end-to-end optimization of point cloud understanding tasks.
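To make the point-to-pixel idea concrete, here is a minimal toy sketch in plain Python. It is not the geometry-preserved projection or the geometry-aware coloring of P2P, which are learned modules; it is a fixed orthographic projection that drops the z axis and keeps the normalized depth as the pixel value. The function name `project_to_image` and the `size` parameter are hypothetical and used only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def project_to_image(points: List[Point], size: int = 8) -> List[List[float]]:
    """Rasterize points onto a size x size grid by dropping the z axis.

    Each pixel stores the largest normalized depth (z) among the points that
    fall into it, so part of the geometry survives the projection.
    Pixels that receive no point stay at 0.0.
    """
    xs, ys, zs = zip(*points)
    x_min, y_min, z_min = min(xs), min(ys), min(zs)
    # Fall back to 1.0 when an axis has zero extent, so the divisions are defined.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    z_span = (max(zs) - z_min) or 1.0

    image = [[0.0] * size for _ in range(size)]
    for x, y, z in points:
        # Map x and y to pixel indices, clamping the upper border into the grid.
        col = min(int((x - x_min) / x_span * size), size - 1)
        row = min(int((y - y_min) / y_span * size), size - 1)
        depth = (z - z_min) / z_span
        image[row][col] = max(image[row][col], depth)
    return image


cloud = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.5, 0.5, 0.5)]
image = project_to_image(cloud, size=4)
```

In this toy version the single channel plays the role of the "color". In P2P the resulting image is passed to a frozen pre-trained image model, which this sketch does not cover.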

[arXiv][Project Page][Models]

intro

Preparation

Installation Prerequisites

Python 3.9
CUDA 11.3
PyTorch 1.11.0
timm 0.5.4
torch_scatter
pointnet2_ops
cv2, sklearn, yaml, h5py

conda create -n p2p python=3.9
conda activate p2p
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

mkdir lib
cd lib
git clone https://github.com/erikwijmans/Pointnet2_PyTorch.git
cd Pointnet2_PyTorch
pip install pointnet2_ops_lib/.
cd ../..

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
pip install timm==0.5.4
pip install opencv-python
pip install scikit-learn h5py
conda install pyyaml tqdm
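After installation, a quick check that the prerequisites are importable can save a failed training run. The sketch below is not part of the repository; it only uses the standard library and the import names listed in the prerequisites above. The helper name `check_modules` is hypothetical.

```python
import importlib.util
from typing import Dict, Iterable

# Import names taken from the prerequisites list above.
REQUIRED_MODULES = [
    "torch", "timm", "torch_scatter", "pointnet2_ops",
    "cv2", "sklearn", "yaml", "h5py",
]


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Return, for each top-level module name, whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name:15s} {'ok' if found else 'MISSING'}")
```

Note that this only checks that each package can be located. It does not verify versions or that the CUDA extensions were built correctly.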

Data Preparation

Download the processed ModelNet40 dataset from [Google Drive][Tsinghua Cloud][BaiDuYun](code:4u1e). Alternatively, download the official ModelNet from here and process it yourself.
Download the official ScanObjectNN dataset from here.

The data is expected to be in the following file structure:

P2P/
|-- config/
|-- data/
    |-- ModelNet40/
        |-- modelnet40_shape_names.txt
        |-- modelnet_train.txt
        |-- modelnet_test.txt
        |-- modelnet40_train_8192pts_fps.dat
        |-- modelnet40_test_8192pts_fps.dat
    |-- ScanObjectNN/
        |-- main_split/
            |-- training_objectdataset_augmentedrot_scale75.h5
            |-- test_objectdataset_augmentedrot_scale75.h5
|-- dataset/
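The layout above can be verified before training with a small script. This is a sketch that is not part of the repository: the file names are copied from the tree above, while the helper name `missing_data_files` and the default `data` root are assumptions for illustration.

```python
from pathlib import Path
from typing import List

# Relative paths copied from the expected file structure above.
EXPECTED_FILES = [
    "ModelNet40/modelnet40_shape_names.txt",
    "ModelNet40/modelnet_train.txt",
    "ModelNet40/modelnet_test.txt",
    "ModelNet40/modelnet40_train_8192pts_fps.dat",
    "ModelNet40/modelnet40_test_8192pts_fps.dat",
    "ScanObjectNN/main_split/training_objectdataset_augmentedrot_scale75.h5",
    "ScanObjectNN/main_split/test_objectdataset_augmentedrot_scale75.h5",
]


def missing_data_files(data_root: str = "data") -> List[str]:
    """Return the expected files that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_data_files("data")
    print("all files found" if not missing else f"missing: {missing}")
```

Run it from the `P2P/` directory so that the relative `data` path resolves as in the tree.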

Usage

Train

bash tool/train.sh EXP_NAME CONFIG_PATH DATASET

For example, to train the P2P model with ConvNeXt-B-1k as the base model on the ModelNet40 dataset:

bash tool/train.sh p2p_ConvNeXt-B-1k config/ModelNet40/p2p_ConvNeXt-B-1k.yaml ModelNet40

Test

bash tool/test.sh EXP_NAME CONFIG_PATH DATASET

For example, to test the P2P model with ConvNeXt-B-1k as the base model on the ModelNet40 dataset:

bash tool/test.sh p2p_ConvNeXt-B-1k config/ModelNet40/p2p_ConvNeXt-B-1k.yaml ModelNet40

Reproduce

bash tool/reproduce.sh DATASET MODEL

For example, to reproduce the results of the P2P model with ConvNeXt-B-1k as the base model on the ModelNet40 dataset with our provided checkpoint:

bash tool/reproduce.sh ModelNet40 ConvNeXt-B-1k
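To reproduce several models in one go, the command line above can be generated programmatically. The sketch below only builds the command strings in the `bash tool/reproduce.sh DATASET MODEL` form shown above. The helper `reproduce_commands` is hypothetical, and whether `tool/reproduce.sh` accepts every model name from the results table is an assumption.

```python
import shlex
from typing import List


def reproduce_commands(dataset: str, models: List[str]) -> List[str]:
    """Build one `bash tool/reproduce.sh DATASET MODEL` command per model."""
    return [
        shlex.join(["bash", "tool/reproduce.sh", dataset, model])
        for model in models
    ]


commands = reproduce_commands("ModelNet40", ["ConvNeXt-B-1k", "ResNet-50"])
for command in commands:
    # Printing keeps this a dry run; nothing is executed here.
    print(command)
```

To actually run a command, pass it to `subprocess.run(shlex.split(command), check=True)` from the `P2P/` directory, with the corresponding checkpoint in place.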

Results

Quantitative Results

We provide pretrained P2P models:

Image Model	ModelNet40 Acc. (%)	ScanObjectNN Acc. (%)
HorNet-L-22k-mlp	94.0 [Google / Tsinghua Cloud]	89.3 [Google / Tsinghua Cloud]
ResNet-18	91.6 [Google / Tsinghua Cloud]	82.6 [Google / Tsinghua Cloud]
ResNet-50	92.5 [Google / Tsinghua Cloud]	85.8 [Google / Tsinghua Cloud]
ResNet-101	93.1 [Google / Tsinghua Cloud]	87.4 [Google / Tsinghua Cloud]
ConvNeXt-T-1k	92.6 [Google / Tsinghua Cloud]	84.9 [Google / Tsinghua Cloud]
ConvNeXt-S-1k	92.8 [Google / Tsinghua Cloud]	85.3 [Google / Tsinghua Cloud]
ConvNeXt-B-1k	93.0 [Google / Tsinghua Cloud]	85.7 [Google / Tsinghua Cloud]
ConvNeXt-L-1k	93.2 [Google / Tsinghua Cloud]	86.2 [Google / Tsinghua Cloud]
ViT-T-1k	91.5 [Google / Tsinghua Cloud]	79.7 [Google / Tsinghua Cloud]
ViT-S-1k	91.8 [Google / Tsinghua Cloud]	81.6 [Google / Tsinghua Cloud]
ViT-B-1k	92.7 [Google / Tsinghua Cloud]	83.4 [Google / Tsinghua Cloud]
Swin-T-1k	92.1 [Google / Tsinghua Cloud]	82.9 [Google / Tsinghua Cloud]
Swin-S-1k	92.5 [Google / Tsinghua Cloud]	83.8 [Google / Tsinghua Cloud]
Swin-B-1k	92.6 [Google / Tsinghua Cloud]	84.6 [Google / Tsinghua Cloud]

To train the P2P framework with HorNet-L-22k as the image model, download the pre-trained weights from the official HorNet repo. Organize the directory as follows:

    P2P/
    |-- pretrained/
        |-- hornet_large_gf_in22k.pth
        |-- reproduce/
            |-- ckpt/
                |-- ModelNet40/
                    |-- ConvNeXt-B-1k-ModelNet40.pth
                |-- ScanObjectNN/
                    |-- ConvNeXt-B-1k-ScanObjectNN.pth
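The checkpoint file names in the tree above appear to follow a `MODEL-DATASET.pth` pattern inside a per-dataset folder. Assuming that pattern holds for the other models as well (the tree only shows ConvNeXt-B-1k), a small helper can build the expected location of a checkpoint. The function name `checkpoint_path` is hypothetical.

```python
from pathlib import Path


def checkpoint_path(model: str, dataset: str,
                    root: str = "pretrained/reproduce/ckpt") -> Path:
    """Expected checkpoint location, assuming the MODEL-DATASET.pth pattern."""
    return Path(root) / dataset / f"{model}-{dataset}.pth"


path = checkpoint_path("ConvNeXt-B-1k", "ModelNet40")
# Report whether the checkpoint is in place relative to the current directory.
print(path, "found" if path.is_file() else "not found")
```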

Visualization Results

intro

Citation

If you find our work useful in your research, please consider citing:

@article{wang2022p2p,
  title={P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting},
  author={Wang, Ziyi and Yu, Xumin and Rao, Yongming and Zhou, Jie and Lu, Jiwen},
  journal={arXiv preprint arXiv:2208.02812},
  year={2022}
}

Acknowledgements

Our code is inspired by BPNet and Point-BERT.