[ECCV 2024] Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection

Arxiv, Project Page

Dataset

Prepare the datasets by following the process described in UPT.

The downloaded files should be placed as follows. Otherwise, replace the default paths with your custom locations.

|- CMMP
|   |- hicodet
|   |   |- hico_20160224_det
|   |       |- annotations
|   |       |- images
|   |- vcoco
|   |   |- mscoco2014
|   |       |- train2014
|   |       |- val2014
:   :      
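Before training, it can help to confirm that the datasets landed where the layout above expects them. The following is a minimal sketch, not part of the CMMP code base: the helper name `missing_dataset_dirs` and the constant `EXPECTED_DATASET_DIRS` are hypothetical, and the directory list is copied from the tree above.

```python
from pathlib import Path

# Relative directories copied from the layout above.
EXPECTED_DATASET_DIRS = [
    "hicodet/hico_20160224_det/annotations",
    "hicodet/hico_20160224_det/images",
    "vcoco/mscoco2014/train2014",
    "vcoco/mscoco2014/val2014",
]


def missing_dataset_dirs(root):
    """Return the expected dataset directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DATASET_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Run from the CMMP repository root (assumed to be the current directory).
    missing = missing_dataset_dirs(".")
    if missing:
        print("Missing dataset directories:")
        for d in missing:
            print("  " + d)
    else:
        print("Dataset layout looks complete.")
```

If you keep the datasets elsewhere, pass that location as `root` instead of `"."`.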

Dependencies

  1. Follow the environment setup in UPT.

  2. Our code is built upon CLIP. Install the local package of CLIP:

cd CLIP && python setup.py develop && cd ..
  3. Download the CLIP weights to checkpoints/pretrained_clip.
|- CMMP
|   |- checkpoints
|   |   |- pretrained_clip
|   |       |- ViT-B-16.pt
|   |       |- ViT-L-14-336px.pt
:   :      
  4. Download the weights of DETR and put them in checkpoints/.
| Dataset | DETR weights |
| --- | --- |
| HICO-DET | weights |
| V-COCO | weights |
|- CMMP
|   |- checkpoints
|   |   |- detr-r50-hicodet.pth
|   |   |- detr-r50-vcoco.pth
:   :   :
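A missing checkpoint usually only surfaces once a run starts, so a quick pre-flight check can save time. The sketch below is an illustration under assumptions: `resolve_checkpoints` and the dictionary keys `"hicodet"` and `"vcoco"` are hypothetical names, while the file names are the ones listed in the layouts above.

```python
from pathlib import Path

# File names come from the layouts above; the dictionary keys are hypothetical labels.
DETR_CHECKPOINTS = {
    "hicodet": "detr-r50-hicodet.pth",
    "vcoco": "detr-r50-vcoco.pth",
}
CLIP_CHECKPOINTS = ["ViT-B-16.pt", "ViT-L-14-336px.pt"]


def resolve_checkpoints(dataset, checkpoint_dir="checkpoints"):
    """Return (detr_path, clip_paths) for a dataset, raising if any file is missing."""
    ckpt = Path(checkpoint_dir)
    detr = ckpt / DETR_CHECKPOINTS[dataset]
    clips = [ckpt / "pretrained_clip" / name for name in CLIP_CHECKPOINTS]
    absent = [str(p) for p in [detr, *clips] if not p.is_file()]
    if absent:
        raise FileNotFoundError("missing checkpoint files: " + ", ".join(absent))
    return detr, clips
```

Calling `resolve_checkpoints("hicodet")` from the repository root either returns the resolved paths or names every file that still needs to be downloaded.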

Pre-extracted Features

Download the pre-extracted features from HERE and the pre-extracted bboxes from HERE. The downloaded files have to be placed as follows.

|- CMMP
|   |- hicodet_pkl_files
|   |   |- union_embeddings_cachemodel_crop_padding_zeros_vitb16.p
|   |   |- hicodet_union_embeddings_cachemodel_crop_padding_zeros_vit336.p
|   |- vcoco_pkl_files
|   |   |- vcoco_union_embeddings_cachemodel_crop_padding_zeros_vit16.p
|   |   |- vcoco_union_embeddings_cachemodel_crop_padding_zeros_vit336.p
:   :      
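The internal structure of the `.p` files is not documented here, so a small inspection helper can be useful after downloading. This is a generic sketch that assumes the files are standard Python pickles; `summarize_pickle` is a hypothetical helper, not a function from this repository. Unpickling can execute arbitrary code, so only load files from a source you trust. If the pickles contain framework objects such as tensors, the corresponding library has to be installed for loading to succeed.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and return a short description of its top-level object."""
    with Path(path).open("rb") as f:
        obj = pickle.load(f)
    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["length"] = len(obj)
    if isinstance(obj, dict):
        # Show only a few keys; feature caches can be very large.
        summary["sample_keys"] = list(obj)[:5]
    return summary
```

For example, `summarize_pickle("vcoco_pkl_files/vcoco_union_embeddings_cachemodel_crop_padding_zeros_vit336.p")` reports the container type and size without printing the features themselves.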

Train/Test

Please follow the commands in ./scripts.

Model Zoo

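| Method | Type | Unseen↑ | Seen↑ | Full↑ | HM↑ |
| --- | --- | --- | --- | --- | --- |
| CMMP (Ours) | RF-UC | 29.45 | 32.87 | 32.18 | 31.07 |
| CMMP† (Ours) | RF-UC | 35.98 | 37.42 | 37.13 | 36.69 |
| CMMP (Ours) | NF-UC | 32.09 | 29.71 | 30.18 | 30.85 |
| CMMP† (Ours) | NF-UC | 33.52 | 35.53 | 35.13 | 34.50 |
| CMMP (Ours) | UO | 33.76 | 31.15 | 31.59 | 32.40 |
| CMMP† (Ours) | UO | 39.67 | 36.15 | 36.74 | 37.83 |
| CMMP (Ours) | UV | 26.23 | 32.75 | 31.84 | 29.13 |
| CMMP† (Ours) | UV | 30.84 | 37.28 | 36.38 | 33.75 |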
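The HM column is not defined in this README. Assuming it denotes the harmonic mean of the Unseen and Seen scores, which is the usual convention in zero-shot HOI detection, it can be recomputed from the other columns as sketched below. The function name `harmonic_mean` and the `rows` dictionary are illustrative; the numbers are copied from the table above. Because the inputs are already rounded, a recomputed value may differ from the reported one in the last digit.

```python
def harmonic_mean(unseen, seen):
    """Harmonic mean of the unseen and seen scores."""
    return 2 * unseen * seen / (unseen + seen)


# (Unseen, Seen, reported HM) for the CMMP rows without the dagger.
rows = {
    "RF-UC": (29.45, 32.87, 31.07),
    "NF-UC": (32.09, 29.71, 30.85),
    "UO": (33.76, 31.15, 32.40),
    "UV": (26.23, 32.75, 29.13),
}

for setting, (unseen, seen, reported) in rows.items():
    computed = harmonic_mean(unseen, seen)
    print(f"{setting}: computed {computed:.2f}, reported {reported:.2f}")
```

The harmonic mean penalizes imbalance, so a model that does well on seen categories but poorly on unseen ones scores lower than the arithmetic mean would suggest.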

Model Weights

You can download the model weights from:

Link: https://pan.baidu.com/s/1XyWG2qjEXWghEYcc4-PGFA?pwd=zkh5
Password: zkh5

Or you can download the CMMP weights from huggingface:

https://huggingface.co/lttt/CMMP/tree/main

Citation

If you find our paper and/or code helpful, please consider citing:

@inproceedings{ting2024CMMP,
  title={Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection},
  author={Ting Lei and Shaofeng Yin and Yuxin Peng and Yang Liu},
  year={2024},
  booktitle={ECCV},
  organization={IEEE},
}