MVP-N: A Dataset and Benchmark for Real-World Multi-View Object Classification (NeurIPS 2022) [Paper] [Reviews]
Towards Real-World Multi-View Object Classification: Dataset, Benchmark, and Analysis (TCSVT 2024) [Paper] [Reviews]
This is the official PyTorch implementation.
Create issues from this repository
Please contact us at wangren@snu.ac.kr. We will reply to issues within 3 days.
Notice Board
- Related research work published in 2024.01 ~ 2024.06 is summarized. Five papers are added to the list below.
- The usage of FG3D dataset (TIP 2021) is supported in this repository.
- VSFormer (TVCG 2024) is added to this benchmark.
- Related research work published in 2023 is summarized. One paper is added to the list below.
- Related multi-view-based feature aggregation methods for biomedical tasks will not be summarized here.
- New hypergraph-based methods and soft label methods published from 2023 will no longer be summarized here unless explicitly designed for multi-view object classification.
Clarification
- Discussion on the usage of allow_tf32.
  torch.backends.cuda.matmul.allow_tf32 = False
  torch.backends.cudnn.allow_tf32 = False
- Except for CVR (ICCV 2021), the performance of other methods is almost unaffected.
- The performance change for CVR cannot be neglected and may affect its best configuration.
- We recommend trying the following configurations to get the best one when doing custom training on the MVP-N dataset.
  -CVR_K=2 -CVR_LAMBDA=1
  -CVR_K=3 -CVR_LAMBDA=0.5
  -CVR_K=3 -CVR_LAMBDA=1
- Modifications in the summary file compared to the paper content.
- The code for MVT (BMVC 2021) and iMHL (TIP 2018) has been released.
- An analysis of its open-source implementation shows that MVT (BMVC 2021) satisfies P1.
- There is a typo in the caption of Table 4 (NeurIPS 2022), which should be corrected to 'Backbone (ResNet-18): 11.20 M, 10.91 G, and 6.19 ± 0.05 ms'.
- There is a typo in the second row of TABLE VII (TCSVT 2024), which should be corrected to '97.97'.
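The three CVR configurations recommended in the allow_tf32 discussion can be tried one after another. The following is a minimal sketch, assuming the flags are passed to main_multi_view.py in the same way as the training commands in this README. The helper name `cvr_commands` is hypothetical and not part of the repository. It only builds and prints the command lines, so nothing is trained until they are run manually.

```python
# Recommended (CVR_K, CVR_LAMBDA) pairs from the allow_tf32 discussion.
RECOMMENDED = [(2, 1), (3, 0.5), (3, 1)]


def cvr_commands(configs=RECOMMENDED):
    """Return one training command string per (K, lambda) pair (hypothetical helper)."""
    commands = []
    for k, lam in configs:
        commands.append(
            "python3 main_multi_view.py -MV_FLAG=TRAIN -MV_TYPE=CVR "
            f"-CVR_K={k} -CVR_LAMBDA={lam}"
        )
    return commands


if __name__ == "__main__":
    for command in cvr_commands():
        print(command)
```

Comparing the results of the three runs is left to the user, since the sketch makes no assumption about where the repository writes its logs.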
Summary of 56 multi-view-based feature aggregation methods [Details]
Period: 2015.01 ~ 2024.06
Conferences: NeurIPS, ICLR, ICML, CVPR, ICCV, ECCV, AAAI, IJCAI, MM, WACV, BMVC, ACCV
Journals: TPAMI, IJCV, TIP, TNNLS, TMM, TCSVT, TVCG, PR
Workshops: NeurIPS, ICLR, ICML, CVPR, ICCV, ECCV
Year | Conferences | Journals | Workshops |
---|---|---|---|
2015 | 1 | 0 | 0 |
2016 | 2 | 0 | 0 |
2017 | 2 | 1 | 0 |
2018 | 5 | 5 | 1 |
2019 | 7 | 4 | 0 |
2020 | 4 | 2 | 0 |
2021 | 4 | 6 | 0 |
2022 | 2 | 5 | 2 |
2023 | 0 | 1 | 0 |
2024 | 1 | 4 | 0 |
Environment
Ubuntu 20.04.3 LTS
Python 3.8.10
CUDA 11.1
cuDNN 8
NVIDIA GeForce RTX 3090 GPU
Intel(R) Core(TM) i9-10900X CPU @ 3.70GHz
Setup
Step 1: Get repository
git clone https://github.com/SMNUResearch/MVP-N.git
cd MVP-N
Step 2: Install dependencies
sh install.sh
Dataset Preparation
Step 1: Get dataset documentation from [Google Drive]
Step 2: Download data.zip from [Google Drive]
Step 3: Place data.zip in this repository
Step 4: Unzip data.zip
unzip data.zip
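On systems without the unzip utility, the extraction step can also be done from Python. This is a minimal sketch using the standard-library zipfile module. The helper name `extract_archive` is hypothetical, and it assumes the archive sits in the repository root as described in Step 3. Unlike unzip, it overwrites existing files without prompting.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest="."):
    """Extract zip_path into dest and return the archive's member names."""
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        raise FileNotFoundError(f"{zip_path} not found; download it first")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


if __name__ == "__main__":
    # Roughly equivalent to running `unzip data.zip` in the repository root.
    if Path("data.zip").is_file():
        print(len(extract_archive("data.zip")), "entries extracted")
```

The same helper can mirror `unzip weights.zip -d weights` by calling `extract_archive("weights.zip", "weights")`.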
Quick Test
Step 1: Download pretrained weights from [Google Drive]
Step 2: Place weights.zip in this repository
Step 3: Unzip weights.zip
unzip weights.zip -d weights
Step 4: Evaluation
# feature aggregation performance
python3 main_multi_view.py -MV_FLAG=TEST -MV_TYPE=DAN -MV_TEST_WEIGHT=./weights/DAN.pt
# confusion matrix
python3 main_multi_view.py -MV_FLAG=CM -MV_TYPE=DAN -MV_TEST_WEIGHT=./weights/DAN.pt
# computational efficiency
python3 main_multi_view.py -MV_FLAG=COMPUTATION -MV_TYPE=DAN
# soft label performance
python3 main_single_view.py -SV_FLAG=TEST -SV_TYPE=SAT -SV_TEST_WEIGHT=./weights/SAT.pt
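Before launching the evaluation commands, it can help to confirm that the weight files they reference are present. The following is a minimal sketch, assuming weights.zip was extracted into ./weights as in Step 3. The helper name `missing_weights` is hypothetical, and only the two weight files named in the commands above are checked.

```python
from pathlib import Path

# Weight files referenced by the quick-test commands above.
QUICK_TEST_WEIGHTS = ["./weights/DAN.pt", "./weights/SAT.pt"]


def missing_weights(paths=QUICK_TEST_WEIGHTS, root="."):
    """Return the entries of paths that do not exist as files under root."""
    return [p for p in paths if not (Path(root) / p).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All quick-test weights found.")
```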
Training
Training with default configurations
# feature aggregation
python3 main_multi_view.py -MV_FLAG=TRAIN -MV_TYPE=DAN
python3 main_multi_view.py -MV_FLAG=TRAIN -MV_TYPE=CVR
python3 main_multi_view.py -MV_FLAG=TRAIN -MV_TYPE=SMVCNN -SMVCNN_USE_EMBED
# soft label
python3 main_single_view.py -SV_FLAG=TRAIN -SV_TYPE=KD
python3 main_single_view.py -SV_FLAG=TRAIN -SV_TYPE=HPIQ
python3 main_single_view.py -SV_FLAG=TRAIN -SV_TYPE=HS
Training with other configurations
# feature aggregation
python3 main_multi_view.py -MV_FLAG=TRAIN -MV_TYPE=CVR -CVR_LAMBDA=0.5 -CVR_K=3
# soft label
python3 main_single_view.py -SV_FLAG=TRAIN -SV_TYPE=KD -KD_T=3
Details of configurations are provided in config/base.yaml
Training (FG3D)
Step 1: Download FG3D.zip from [Google Drive]
Step 2: Place FG3D.zip in this repository
Step 3: Unzip FG3D.zip
unzip FG3D.zip
Step 4: Training with default configurations. Details are provided in config/FG3D.yaml
python3 main_multi_view_FG3D.py -MV_FLAG=TRAIN -MV_TYPE=DAN -NUM_CLASSES=13 -CLASSES=Airplane
python3 main_multi_view_FG3D.py -MV_FLAG=TRAIN -MV_TYPE=SMVCNN -NUM_CLASSES=20 -CLASSES=Car
python3 main_multi_view_FG3D.py -MV_FLAG=TRAIN -MV_TYPE=VSF -NUM_CLASSES=33 -CLASSES=Chair
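The three FG3D commands above pair each category with its number of classes. The following is a minimal sketch that keeps this pairing in one place, using only the values shown above (Airplane: 13, Car: 20, Chair: 33). The helper name `fg3d_command` is hypothetical, and it is an assumption that any supported MV_TYPE can be combined with any of the three categories.

```python
# Number of classes per FG3D category, taken from the commands above.
FG3D_NUM_CLASSES = {"Airplane": 13, "Car": 20, "Chair": 33}


def fg3d_command(category, mv_type):
    """Build the FG3D training command for a category and aggregation method."""
    if category not in FG3D_NUM_CLASSES:
        raise ValueError(f"unknown FG3D category: {category}")
    return (
        "python3 main_multi_view_FG3D.py -MV_FLAG=TRAIN "
        f"-MV_TYPE={mv_type} -NUM_CLASSES={FG3D_NUM_CLASSES[category]} "
        f"-CLASSES={category}"
    )


if __name__ == "__main__":
    print(fg3d_command("Chair", "VSF"))
```

Looking up NUM_CLASSES from the category avoids passing a mismatched pair by hand.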