Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations
Overview
This repository contains the implementation of Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations.
In this work, we develop a compositional model to recognize unseen human-object interactions based on spatial relations between humans and objects.
Prerequisites
To install all the dependency packages, please run:
pip install -r requirements.txt
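After installing, a quick optional sanity check (this assumes PyTorch is among the listed dependencies, since the experiment scripts below select a GPU via `--idx_GPU`):

```python
# Optional sanity check: confirm that PyTorch imports and a GPU is visible.
# Assumes PyTorch is listed in requirements.txt (the training scripts rely on it).
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```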
Data Preparation
- Please download and extract the data into the `./data` folder. Each subfolder within `./data` includes details about the download links and what the files are used for.
- Please run the feature extraction scripts in the `./extract_feature` folder to extract features from the last convolutional layer of ResNet as region features for the attention mechanism (a hedged sketch of both extraction steps follows the commands below):
python ./extract_feature/hico_det/hico_det_extract_feature_map_ResNet_152_padding.py # HICO-DET region feature maps
python ./extract_feature/visual_genome/visual_genome_extract_feature_map_ResNet_152_padding.py # Visual Genome region feature maps
as well as word embeddings for zero-shot learning:
python ./extract_feature/hico_det/hico_extract_action_object_w2v.py # HICO-DET action/object word embeddings
python ./extract_feature/visual_genome/visual_genome_extract_action_object_w2v.py # Visual Genome action/object word embeddings
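The exact output layout of these scripts is not documented above, so the following is only a minimal, hedged sketch of what the feature-map extraction step typically looks like: take the last convolutional feature map of a torchvision ResNet-152 and store it in HDF5. The image list, output file name, HDF5 layout, and the resize step (the repository's scripts use padding instead) are illustrative assumptions, not the repository's code.

```python
# Hedged sketch: extract last-conv-layer ResNet-152 feature maps and save to HDF5.
# File names and dataset layout below are placeholders, not the repo's actual format.
import h5py
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Keep everything up to (and including) the last conv block; drop avgpool and fc.
resnet = models.resnet152(pretrained=True)
backbone = torch.nn.Sequential(*list(resnet.children())[:-2]).to(device).eval()

preprocess = T.Compose([
    T.Resize((224, 224)),   # the actual scripts pad the image instead of resizing
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image_paths = ['example.jpg']  # placeholder list of dataset images

with h5py.File('feature_map_ResNet_152.hdf5', 'w') as f, torch.no_grad():
    for i, path in enumerate(image_paths):
        x = preprocess(Image.open(path).convert('RGB')).unsqueeze(0).to(device)
        feat = backbone(x)  # (1, 2048, 7, 7): spatial grid of region features
        f.create_dataset(str(i), data=feat.squeeze(0).cpu().numpy())
```

Likewise, a hedged sketch of the word-embedding step, assuming gensim with pretrained word2vec vectors; the vector file, the action/object lists, and the output names are placeholders:

```python
# Hedged sketch: look up word2vec embeddings for action and object names.
# The vector file and word lists are placeholders for illustration only.
import numpy as np
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
actions, objects = ['ride', 'hold'], ['bicycle', 'cup']
action_emb = np.stack([wv[a] for a in actions])  # (num_actions, 300)
object_emb = np.stack([wv[o] for o in objects])  # (num_objects, 300)
np.save('action_w2v.npy', action_emb)
np.save('object_w2v.npy', object_emb)
```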
Experiments
- To train the cross-attention model on the HICO-DET and Visual Genome datasets under different training splits (1A/1A2B), please run the following commands (a brief note on the `--mll_k_*` flags follows the commands):
# HICO experiments
python ./experiments/hico_det/hico_det_pad_CrossAttention.py --partition train_1A --idx_GPU 1 --save_folder ./results/HICO_1A --mll_k_3 7 --mll_k_5 10 --loc_k 10 #1A setting
python ./experiments/hico_det/hico_det_pad_CrossAttention.py --partition train_1A2B --idx_GPU 2 --save_folder ./results/HICO_1A2B --mll_k_3 7 --mll_k_5 10 --loc_k 10 #1A2B setting
# Visual Genome experiments
python ./experiments/visual_genome_pad/VG_pad_CrossAttention.py --partition train_1A --idx_GPU 4 --save_folder ./results/VG_1A --mll_k_3 7 --mll_k_5 10 #1A setting
python ./experiments/visual_genome_pad/VG_pad_CrossAttention.py --partition train_1A2B --idx_GPU 5 --save_folder ./results/VG_1A2B --mll_k_3 7 --mll_k_5 10 #1A2B setting
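The `--mll_k_3` and `--mll_k_5` flags are not documented above; they appear to set the k values used by top-k multi-label metrics. Purely as a hedged illustration (this is not the repository's evaluation code), a generic precision/recall at k for multi-label predictions looks like:

```python
# Generic top-k multi-label precision/recall, shown only to illustrate the kind of
# metric the --mll_k_* flags appear to parameterize; not the repository's code.
import numpy as np

def precision_recall_at_k(scores, labels, k):
    """scores: (N, C) predicted scores; labels: (N, C) binary ground truth."""
    topk = np.argsort(-scores, axis=1)[:, :k]                # indices of the k highest scores
    hits = np.take_along_axis(labels, topk, axis=1).sum(1)   # correct labels among the top k
    precision = (hits / k).mean()
    recall = (hits / np.maximum(labels.sum(1), 1)).mean()
    return precision, recall
```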
Pretrained Models
For ease of reproducing the results, we provide pretrained models:
| Dataset | Setting | Model |
|---|---|---|
| HICO | 1A2B | download |
| HICO | 1A | download |
| Visual Genome | 1A2B | download |
| Visual Genome | 1A | download |
To evaluate a pretrained model, please run:
python ./experiments/hico_det/hico_det_pad_CrossAttention.py --partition train_1A2B --idx_GPU 0 --save_folder ./results/HICO_1A2B --mll_k_3 7 --mll_k_5 10 --loc_k 10 --load_model ./pretrained_model/model_final_HICO_1A2B.pt
where `./pretrained_model/model_final_HICO_1A2B.pt` is the path to the corresponding pretrained model.
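The `--load_model` flag restores the checkpoint inside the experiment script. For reference, a minimal hedged sketch of loading such a file directly with PyTorch (the checkpoint's internal layout is an assumption):

```python
# Hedged sketch: inspect a pretrained checkpoint with PyTorch.
# Whether the file holds a full model object or a state_dict is an assumption;
# a state_dict would be restored with model.load_state_dict(checkpoint).
import torch

checkpoint = torch.load('./pretrained_model/model_final_HICO_1A2B.pt', map_location='cpu')
print(type(checkpoint))
```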
Citation
If you find this project helpful, we would appreciate it if you cite our work:
@inproceedings{Huynh:ICCV21,
  author = {D.~Huynh and E.~Elhamifar},
  title = {Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations},
  booktitle = {International Conference on Computer Vision},
  year = {2021}}