SymNet

Part of the HAKE project (HAKE-Object).

News: (2022.12.19) HAKE 2.0 is accepted by TPAMI!

(2022.12.7) We release a new project OCL (paper). Data and code are coming soon.

(2022.11.19) We release the interactive object bounding boxes & classes for the interactions in the AVA dataset (2.1 & 2.2)! HAKE-AVA, [Paper].

(2022.03.28) We release the code for the multiple attribute recognition described in the PAMI version.

(2022.02.14) We release the human body part state labels based on AVA: HAKE-AVA.

(2021.10.06) Our extended version of SymNet is accepted by TPAMI! Paper and code are coming soon.

(2021.2.7) Upgraded HAKE-Activity2Vec is released! Images/Videos --> human box + ID + skeleton + part states + action + representation. [Description], Full demo: [YouTube], [bilibili]


(2020.6.16) Our larger version HAKE-Large (>120K images, activity and part state labels) is released!

This is the code accompanying our CVPR'20 and TPAMI'21 papers: Symmetry and Group in Attribute-Object Compositions and Learning Single/Multi-Attribute of Object with Symmetry and Group.


Overview

If you find this repository useful, please consider citing our papers:

```
---SymNet-PAMI
@article{li2021learning,
  title={Learning Single/Multi-Attribute of Object with Symmetry and Group},
  author={Li, Yong-Lu and Xu, Yue and Xu, Xinyu and Mao, Xiaohan and Lu, Cewu},
  journal={TPAMI},
  year={2021}
}

---SymNet-CVPR
@inproceedings{li2020symmetry,
  title={Symmetry and Group in Attribute-Object Compositions},
  author={Li, Yong-Lu and Xu, Yue and Mao, Xiaohan and Lu, Cewu},
  booktitle={CVPR},
  year={2020}
}
```

Prerequisites

Packages: install with `pip install -r requirements.txt`

Datasets: Download and re-arrange with:

```bash
cd data; bash download_data.sh
```

Features and pretrained models: Features for the compositional ZSL (CZSL) setting<sup>[1]</sup> will be downloaded together with the datasets. Features for the generalized compositional ZSL (GCZSL) setting<sup>[2]</sup> can be extracted with:

```bash
python utils/dataset/GCZSL_dataset.py [MIT/UT]
```

For multiple attribute recognition, we re-organize the metadata of the aPY/SUN datasets, together with pre-extracted ResNet-50 features, into four files `{APY/SUN}_{train/test}.pkl`. You can download them from Link and put them into the ./data folder.
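
To sanity-check these downloads, here is a minimal sketch for inspecting one of the pickle files; the key layout inside the pickles is not documented here, so treat the dictionary access as an assumption and adapt it to what you find:

```python
import pickle

# Load one of the downloaded metadata/feature files and print its
# top-level structure. The exact contents are not documented here,
# so inspect before relying on specific keys.
with open("data/APY_train.pkl", "rb") as f:
    data = pickle.load(f)

print(type(data))
if isinstance(data, dict):
    for key, value in data.items():
        print(key, type(value))
```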

Pretrained models and intermediate results can be downloaded from here: Link. Please unzip obj_scores.zip to ./data/obj_scores and weights.zip to ./weights.

Compositional Zero-shot Learning (CZSL)

These are commands for the split and evaluation metrics introduced by [1].

Training an object classifier

Before training a SymNet model, train an object classifier by running:

```bash
python run_symnet.py --network fc_obj --name MIT_obj_lr3e-3 --data MIT --epoch 1500 --batchnorm --lr 3e-3
python run_symnet.py --network fc_obj --name UT_obj_lr1e-3 --data UT --epoch 300 --batchnorm --lr 1e-3
```

Then store the intermediate object results:

```bash
python test_obj.py --network fc_obj --name MIT_obj_lr3e-3 --data MIT --epoch 1120 --batchnorm
python test_obj.py --network fc_obj --name UT_obj_lr1e-3 --data UT --epoch 140 --batchnorm
```

The result files will be stored in `./data/obj_scores`, named `MIT_obj_lr3e-3_ep1120.pkl` and `UT_obj_lr1e-3_ep140.pkl` in the examples above.

Training a SymNet

To train a SymNet with the hyper-parameters in our paper, run:

```bash
python run_symnet.py --name MIT_best --data MIT --epoch 400 --obj_pred MIT_obj_lr3e-3_ep1120.pkl --batchnorm --lr 5e-4 --bz 512 --lambda_cls_attr 1 --lambda_cls_obj 0.01 --lambda_trip 0.03 --lambda_sym 0.05 --lambda_axiom 0.01
python run_symnet.py --name UT_best --data UT --epoch 700 --obj_pred UT_obj_lr1e-3_ep140.pkl --batchnorm --wordvec onehot --lr 1e-4 --bz 256 --lambda_cls_attr 1 --lambda_cls_obj 0.5 --lambda_trip 0.5 --lambda_sym 0.01 --lambda_axiom 0.03
```
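
The `--lambda_*` flags weight the individual loss terms. As a hedged reading based purely on the flag names (the exact formulation is given in the papers), the total training objective is presumably a weighted sum of the attribute/object classification, triplet, symmetry, and axiom losses:

```latex
% Presumed combination of the loss terms, inferred from the flag names
\mathcal{L} = \lambda_{cls\_attr}\,\mathcal{L}_{cls}^{attr}
            + \lambda_{cls\_obj}\,\mathcal{L}_{cls}^{obj}
            + \lambda_{trip}\,\mathcal{L}_{trip}
            + \lambda_{sym}\,\mathcal{L}_{sym}
            + \lambda_{axiom}\,\mathcal{L}_{axiom}
```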

Model Evaluation

```bash
python test_symnet.py --name MIT_best --data MIT --epoch 320 --obj_pred MIT_obj_lr3e-3_ep1120.pkl --batchnorm
python test_symnet.py --name UT_best --data UT --epoch 600 --obj_pred UT_obj_lr1e-3_ep140.pkl --wordvec onehot --batchnorm
```

| Method | MIT (top-1) | MIT (top-2) | MIT (top-3) | UT (top-1) | UT (top-2) | UT (top-3) |
|---|---|---|---|---|---|---|
| Visual Product | 9.8 / 13.9 | 16.1 | 20.6 | 49.9 | / | / |
| LabelEmbed (LE) | 11.2 / 13.4 | 17.6 | 22.4 | 25.8 | / | / |
| – LEOR | 4.5 | 6.2 | 11.8 | / | / | / |
| – LE + R | 9.3 | 16.3 | 20.8 | / | / | / |
| – LabelEmbed+ | 14.8* | / | / | 37.4 | / | / |
| AnalogousAttr | 1.4 | / | / | 18.3 | / | / |
| Red Wine | 13.1 | 21.2 | 27.6 | 40.3 | / | / |
| AttOperator | 14.2 | 19.6 | 25.1 | 46.2 | 56.6 | 69.2 |
| TAFE-Net | 16.4 | 26.4 | 33.0 | 33.2 | / | / |
| GenModel | 17.8 | / | / | 48.3 | / | / |
| SymNet (Ours) | 19.9 | 28.2 | 33.8 | 52.1 | 67.8 | 76.0 |

Generalized Compositional Zero-shot Learning (GCZSL)

These are commands for the split and evaluation metrics introduced by [2].

Training an object classifier

```bash
python run_symnet.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048 --lr 3e-3 --epoch 1000 --batchnorm --fc_cls 1024
python run_symnet.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --lr 1e-3 --epoch 700 --batchnorm --fc_cls 1024
```

To store the object classification results on both the validation and test sets, run:

```bash
python test_obj.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048 --epoch 980 --batchnorm --fc_cls 1024 --test_set val
python test_obj.py --network fc_obj --data MITg --name MITg_obj_lr3e-3 --bz 2048 --test_bz 2048 --epoch 980 --batchnorm --fc_cls 1024 --test_set test

python test_obj.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --epoch 660 --batchnorm --fc_cls 1024 --test_set val
python test_obj.py --network fc_obj --data UTg --name UTg_obj_lr1e-3 --bz 2048 --test_bz 2048 --epoch 660 --batchnorm --fc_cls 1024 --test_set test
```

Training a SymNet

To train a SymNet for GCZSL, run:

```bash
python run_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_val_ep980.pkl --test_set val --lr 3e-4 --bz 512 --test_bz 512 --batchnorm --lambda_cls_attr 1 --lambda_cls_obj 0.01 --lambda_trip 1 --lambda_sym 0.02 --lambda_axiom 0.02 --triplet_margin 0.3
python run_symnet_gczsl.py --data UTg --name UTg_best --epoch 300 --obj_pred UTg_obj_lr1e-3_val_ep660.pkl --test_set val --lr 1e-3 --bz 512 --test_bz 512 --wordvec onehot --batchnorm --lambda_cls_attr 1 --lambda_cls_obj 0.01 --fc_compress 512 --lambda_trip 1 --lambda_sym 0.02 --lambda_axiom 0.01
```

Model Evaluation

```bash
python test_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_test_ep980.pkl --bz 512 --test_bz 512 --batchnorm --triplet_margin 0.3 --test_set test --topk 1
python test_symnet_gczsl.py --data MITg --name MITg_best --epoch 1000 --obj_pred MITg_obj_lr3e-3_val_ep980.pkl --bz 512 --test_bz 512 --batchnorm --triplet_margin 0.3 --test_set val --topk 1

python test_symnet_gczsl.py --data UTg --name UTg_best --epoch 290 --obj_pred UTg_obj_lr1e-3_test_ep660.pkl --bz 512 --test_bz 512 --batchnorm --wordvec onehot --fc_compress 512 --test_set test --topk 1
python test_symnet_gczsl.py --data UTg --name UTg_best --epoch 290 --obj_pred UTg_obj_lr1e-3_val_ep660.pkl --bz 512 --test_bz 512 --batchnorm --wordvec onehot --fc_compress 512 --test_set val --topk 1
```

MIT-States evaluation results (with metrics of TMN<sup>[2]</sup>)

| Model | Val Top-1 AUC | Val Top-2 AUC | Val Top-3 AUC | Test Top-1 AUC | Test Top-2 AUC | Test Top-3 AUC | Seen | Unseen | HM |
|---|---|---|---|---|---|---|---|---|---|
| AttOperator | 2.5 | 6.2 | 10.1 | 1.6 | 4.7 | 7.6 | 14.3 | 17.4 | 9.9 |
| Red Wine | 2.9 | 7.3 | 11.8 | 2.4 | 5.7 | 9.3 | 20.7 | 17.9 | 11.6 |
| LabelEmbed+ | 3.0 | 7.6 | 12.2 | 2.0 | 5.6 | 9.4 | 15.0 | 20.1 | 10.7 |
| GenModel | 3.1 | 6.9 | 10.5 | 2.3 | 5.7 | 8.8 | 24.8 | 13.4 | 11.2 |
| TMN | 3.5 | 8.1 | 12.4 | 2.9 | 7.1 | 11.5 | 20.2 | 20.1 | 13.0 |
| SymNet (CVPR) | 4.3 | 9.8 | 14.8 | 3.0 | 7.6 | 12.3 | 24.4 | 25.2 | 16.1 |
| SymNet (TPAMI) | 5.4 | 11.6 | 16.6 | 4.5 | 10.1 | 15.0 | 26.2 | 26.3 | 16.8 |
| SymNet (Latest Update) | 5.8 | 12.2 | 17.8 | 5.3 | 11.3 | 16.5 | 29.5 | 26.1 | 17.4 |

UT-Zappos evaluation results (with metrics of CAUSAL<sup>[3]</sup>)

| Model | Unseen | Seen | Harmonic | Closed | AUC |
|---|---|---|---|---|---|
| LabelEmbed | 16.2 | 53.0 | 24.7 | 59.3 | 22.9 |
| AttOperator | 25.5 | 37.9 | 27.9 | 54.0 | 22.1 |
| TMN | 10.3 | 54.3 | 17.4 | 62.0 | 25.4 |
| CAUSAL | 28.0 | 37.0 | 30.6 | 58.6 | 26.4 |
| SymNet (Ours) | 10.3 | 56.3 | 24.1 | 58.7 | 26.8 |

Multiple Attribute Recognition

Training a SymNet

To train a SymNet for multiple attribute recognition, run:

```bash
python run_symnet_multi.py --name APY_best --data APY --rmd_metric sigmoid --fc_compress 256 --rep_dim 128 --test_freq 1 --epoch 100 --batchnorm --lr 3e-3 --bz 128 --lambda_cls_attr 1 --lambda_trip 1 --lambda_sym 5e-2 --bce_neg_weight 0.05 --lambda_cls_obj 5e-2 --lambda_axiom 1e-3 --lambda_multi_rmd 5e-2 --lambda_atten 1
python run_symnet_multi.py --name SUN_best --data SUN --rmd_metric rmd --fc_compress 1536 --rep_dim 128 --test_freq 5 --epoch 150 --batchnorm --lr 5e-3 --bz 128 --lambda_cls_attr 1 --lambda_trip 5e-2 --lambda_sym 8e-3 --bce_neg_weight 0.4 --lambda_cls_obj 3e-1 --lambda_axiom 1e-3 --lambda_multi_rmd 6e-2 --lambda_atten 6e-1
```

Model Evaluation

```bash
python test_symnet_multi.py --data APY --name APY_best --epoch 78 --batchnorm --rep_dim 128 --fc_compress 256
python test_symnet_multi.py --data SUN --name SUN_best --epoch 95 --batchnorm --rep_dim 128 --fc_compress 1536
```

Evaluation results on aPY and SUN (with metrics of mAUC)

| Model | aPY | SUN |
|---|---|---|
| ALE | 69.2 | 74.5 |
| HAP | 58.2 | 76.7 |
| UDICA | 82.3 | 85.8 |
| KDICA | 84.7 | / |
| UMF | 79.7 | 80.5 |
| AMT | 84.5 | 82.5 |
| FMT | 70.5 | 75.5 |
| GALM | 84.2 | 86.5 |
| SymNet (Ours) | 86.1 | 88.4 |
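
For reference, the sketch below shows how a mean-AUC metric of this kind is commonly computed; it assumes mAUC here denotes the per-attribute ROC AUC averaged over all attributes (the arrays are hypothetical placeholders, not this repo's actual evaluation code):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def mean_auc(y_true, y_score):
    """Average per-attribute ROC AUC.

    y_true:  (num_samples, num_attrs) binary attribute labels
    y_score: (num_samples, num_attrs) predicted attribute scores
    """
    aucs = []
    for a in range(y_true.shape[1]):
        # AUC is undefined for attributes with only one class present
        if y_true[:, a].min() == y_true[:, a].max():
            continue
        aucs.append(roc_auc_score(y_true[:, a], y_score[:, a]))
    return float(np.mean(aucs))
```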

Tips

Using a Customized Dataset

Take UT as an example: besides reorganizing the images into `data/ut-zap50k-original/images/[attribute]_[object]/` (see the sketch below), you will likely also need to prepare the corresponding metadata used by the dataloader.
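
Here is a hedged sketch of the reorganizing step, assuming you have a metadata file mapping each image to its attribute and object labels; the `raw_images` folder and `pairs.csv` file (rows of `filename,attribute,object`) are hypothetical placeholders, not files shipped with this repo:

```python
import csv
import shutil
from pathlib import Path

SRC = Path("raw_images")                      # hypothetical folder of unsorted images
DST = Path("data/ut-zap50k-original/images")  # layout expected by this repo

# pairs.csv is a hypothetical metadata file with rows: filename,attribute,object
with open("pairs.csv") as f:
    for filename, attr, obj in csv.reader(f):
        target_dir = DST / f"{attr}_{obj}"
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy(SRC / filename, target_dir / filename)
```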


Acknowledgement

The dataloader and evaluation code are based on Attributes as Operators<sup>[1]</sup> and Task-Driven Modular Networks<sup>[2]</sup>.

Reference

[1] Attributes as Operators: Factorizing Unseen Attribute-Object Compositions

[2] Task-Driven Modular Networks for Zero-Shot Compositional Learning

[3] A Causal View of Compositional Zero-Shot Recognition