P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting

Created by Ziyi Wang*, Xumin Yu*, Yongming Rao*, Jie Zhou, Jiwen Lu

This repository is the official implementation of P2P (NeurIPS 2022 Spotlight).

<div align=left> <img src='https://github.com/wangzy22/P2P/blob/master/fig/p2p.gif' width=340 height=200> </div>

P2P is a framework for leveraging pre-trained image models in 3D analysis. We transform point clouds into colorful images using a newly proposed geometry-preserved projection and geometry-aware coloring, so that they can be fed to pre-trained image models. The weights of the image models are kept frozen during the end-to-end optimization of point cloud understanding tasks.
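To make the point-to-pixel idea concrete, here is a toy sketch in pure Python. It is not the geometry-preserved projection or geometry-aware coloring from the paper: it simply drops points onto a grid along the z-axis and uses normalized depth as a single-channel intensity. The function name `project_points` and the `size` parameter are hypothetical and do not come from this repository.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def _normalize(value: float, low: float, high: float) -> float:
    """Scale value into [0, 1]; a degenerate range maps to 0.5."""
    return 0.5 if high == low else (value - low) / (high - low)


def project_points(points: Sequence[Point], size: int = 8) -> List[List[float]]:
    """Toy orthographic projection of a point cloud onto a size x size grid.

    Illustrative stand-in only, NOT the projection or coloring used by P2P.
    Empty pixels hold 0.0; occupied pixels hold a depth-based value in [0.1, 1.0].
    """
    image = [[0.0] * size for _ in range(size)]
    if not points:
        return image

    xs, ys, zs = zip(*points)
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    z_lo, z_hi = min(zs), max(zs)

    for x, y, z in points:
        # Map x and y to pixel indices, clamping the upper boundary.
        col = min(int(_normalize(x, x_lo, x_hi) * size), size - 1)
        row = min(int(_normalize(y, y_lo, y_hi) * size), size - 1)
        # Larger z gives a brighter pixel; if several points share a pixel,
        # keep the brightest one.
        intensity = 0.1 + 0.9 * _normalize(z, z_lo, z_hi)
        image[row][col] = max(image[row][col], intensity)
    return image
```

In the actual framework the resulting image has color channels and is passed to a frozen pre-trained image model; this sketch only shows the projection step under the simplifying assumptions stated above.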

[arXiv] [Project Page] [Models]

Preparation

Installation Prerequisites

conda create -n p2p python=3.9
conda activate p2p
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3

mkdir lib
cd lib
git clone https://github.com/erikwijmans/Pointnet2_PyTorch.git
cd Pointnet2_PyTorch
pip install pointnet2_ops_lib/.
cd ../..

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
pip install timm==0.5.4
pip install opencv-python
pip install scikit-learn h5py
conda install pyyaml tqdm
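
After installation, a quick import check can catch missing packages before training starts. The sketch below is not part of the repository; `find_missing` and `REQUIRED_MODULES` are hypothetical names, and the import names listed are assumptions based on how these packages are usually imported (for example, `opencv-python` is imported as `cv2` and `scikit-learn` as `sklearn`).

```python
import importlib.util
from typing import Iterable, List

# Import names assumed for the packages installed above. Several differ from
# the pip/conda package names (opencv-python -> cv2, pyyaml -> yaml, ...).
REQUIRED_MODULES = (
    "torch", "torchvision", "pointnet2_ops", "torch_scatter",
    "timm", "cv2", "sklearn", "h5py", "yaml", "tqdm",
)


def find_missing(modules: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

Calling `find_missing()` inside the activated `p2p` environment should return an empty list if every package is visible to the interpreter. It only checks that the modules can be located, not that their versions match the ones pinned above.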

Data Preparation

Usage

Train

bash tool/train.sh EXP_NAME CONFIG_PATH DATASET

For example, to train a P2P model with ConvNeXt-B-1k as the base model on the ModelNet40 dataset:

bash tool/train.sh p2p_ConvNeXt-B-1k config/ModelNet40/p2p_ConvNeXt-B-1k.yaml ModelNet40

Test

bash tool/test.sh EXP_NAME CONFIG_PATH DATASET

For example, to test a P2P model with ConvNeXt-B-1k as the base model on the ModelNet40 dataset:

bash tool/test.sh p2p_ConvNeXt-B-1k config/ModelNet40/p2p_ConvNeXt-B-1k.yaml ModelNet40
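
The train and test scripts take the same three positional arguments, so the command line can be assembled programmatically when sweeping over several base models. The helper below is a hypothetical sketch, not part of the repository. It assumes the naming seen in the ConvNeXt-B-1k example (experiment name `p2p_<MODEL>`, config path `config/<DATASET>/p2p_<MODEL>.yaml`); other configs may not follow this pattern.

```python
import shlex
from typing import List


def build_command(script: str, model: str, dataset: str) -> List[str]:
    """Build the argument list for tool/train.sh or tool/test.sh.

    Assumes experiment name "p2p_<MODEL>" and config path
    "config/<DATASET>/p2p_<MODEL>.yaml", as in the example above.
    """
    exp_name = f"p2p_{model}"
    config_path = f"config/{dataset}/{exp_name}.yaml"
    return ["bash", script, exp_name, config_path, dataset]


cmd = build_command("tool/train.sh", "ConvNeXt-B-1k", "ModelNet40")
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Check that the config file exists first, since the path is only inferred from the naming pattern.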

Reproduce

bash tool/reproduce.sh DATASET MODEL

For example, to reproduce the results of the P2P model with ConvNeXt-B-1k as the base model on the ModelNet40 dataset using our provided checkpoint:

bash tool/reproduce.sh ModelNet40 ConvNeXt-B-1k

Results

Quantitative Results

We provide pretrained P2P models:

| Image Model | ModelNet Acc. (%) | ScanObjectNN Acc. (%) |
| --- | --- | --- |
| HorNet-L-22k-mlp | 94.0 [Google / Tsinghua Cloud] | 89.3 [Google / Tsinghua Cloud] |
| ResNet-18 | 91.6 [Google / Tsinghua Cloud] | 82.6 [Google / Tsinghua Cloud] |
| ResNet-50 | 92.5 [Google / Tsinghua Cloud] | 85.8 [Google / Tsinghua Cloud] |
| ResNet-101 | 93.1 [Google / Tsinghua Cloud] | 87.4 [Google / Tsinghua Cloud] |
| ConvNeXt-T-1k | 92.6 [Google / Tsinghua Cloud] | 84.9 [Google / Tsinghua Cloud] |
| ConvNeXt-S-1k | 92.8 [Google / Tsinghua Cloud] | 85.3 [Google / Tsinghua Cloud] |
| ConvNeXt-B-1k | 93.0 [Google / Tsinghua Cloud] | 85.7 [Google / Tsinghua Cloud] |
| ConvNeXt-L-1k | 93.2 [Google / Tsinghua Cloud] | 86.2 [Google / Tsinghua Cloud] |
| ViT-T-1k | 91.5 [Google / Tsinghua Cloud] | 79.7 [Google / Tsinghua Cloud] |
| ViT-S-1k | 91.8 [Google / Tsinghua Cloud] | 81.6 [Google / Tsinghua Cloud] |
| ViT-B-1k | 92.7 [Google / Tsinghua Cloud] | 83.4 [Google / Tsinghua Cloud] |
| Swin-T-1k | 92.1 [Google / Tsinghua Cloud] | 82.9 [Google / Tsinghua Cloud] |
| Swin-S-1k | 92.5 [Google / Tsinghua Cloud] | 83.8 [Google / Tsinghua Cloud] |
| Swin-B-1k | 92.6 [Google / Tsinghua Cloud] | 84.6 [Google / Tsinghua Cloud] |

To train the P2P framework with HorNet-L-22k as the image model, download the pre-trained weights from the official HorNet repository. Place them, together with any downloaded P2P checkpoints, in the following directory structure:

    P2P/
    |-- pretrained/
        |-- hornet_large_gf_in22k.pth
        |-- reproduce/
            |-- ckpt/
                |-- ModelNet40/
                    |-- ConvNeXt-B-1k-ModelNet40.pth
                |-- ScanObjectNN/
                    |-- ConvNeXt-B-1k-ScanObjectNN.pth
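
The layout above can be checked with a few lines of Python before running the training or reproduce scripts. This is a hypothetical helper, not part of the repository. It assumes that checkpoints are named `<MODEL>-<DATASET>.pth`, which is inferred from the two ConvNeXt-B-1k entries in the tree and may not hold for every model.

```python
from pathlib import Path
from typing import List, Sequence


def expected_files(
    root: Path,
    model: str,
    datasets: Sequence[str] = ("ModelNet40", "ScanObjectNN"),
) -> List[Path]:
    """List the files the directory tree above implies for one model.

    The "<MODEL>-<DATASET>.pth" naming is an assumption based on the
    ConvNeXt-B-1k example.
    """
    pretrained = root / "pretrained"
    # HorNet weights are only needed when HorNet-L-22k is the image model.
    files = [pretrained / "hornet_large_gf_in22k.pth"]
    for dataset in datasets:
        ckpt_dir = pretrained / "reproduce" / "ckpt" / dataset
        files.append(ckpt_dir / f"{model}-{dataset}.pth")
    return files


def missing_files(root: str, model: str) -> List[Path]:
    """Return the expected files that do not exist under root."""
    return [p for p in expected_files(Path(root), model) if not p.is_file()]
```

For example, `missing_files("P2P", "ConvNeXt-B-1k")` returns the paths that still need to be downloaded; an empty list means the layout matches the tree shown above.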

Visualization Results

Citation

If you find our work useful in your research, please consider citing:

@article{wang2022p2p,
  title={P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting},
  author={Wang, Ziyi and Yu, Xumin and Rao, Yongming and Zhou, Jie and Lu, Jiwen},
  journal={arXiv preprint arXiv:2208.02812},
  year={2022}
}

Acknowledgements

Our code is inspired by BPNet and Point-BERT.