Awesome

Consistency-Aware Graph Network for Human Interaction Understanding

Update (2023.3.16)

Add experiments on synthetic dataset which has a greater set of action classes (100 action classes).See synthetic directory for details.
Construct Temporal Factor Graph Neural Network(TFGN) to leverage the temporal information in the videos. See track directory for details.

Introduction

Compared with the progress made on human activity classification, much less success has been achieved on human interaction understanding (HIU). Apart from the latter task is much more challenging, the main cause is that recent approaches learn human interactive relations via shallow graphical models, which is inadequate to model complicated human interactions. In this paper, we propose a consistency-aware graph network, which combines the representative ability of graph network and the consistencyaware reasoning to facilitate HIU. Our network consists of three components, a backbone CNN to extract image features, a factor graph network to learn third-order interactive relations among participants, and a consistency-aware reasoning module to enforce labeling and grouping consistencies. Our key observation is that the consistency-awarereasoning bias for HIU can be embedded into an energy, minimizing which delivers consistent predictions. An efficient mean-field inference algorithm is proposed, such that all modules of our network could be trained jointly in an end-to-end manner.

Requirements

python >= 3.6

pip install -r requirements.txt

Download

Here we provide two datasets including BIT and TVHI (their copyrights belong to the original authors). Along with the datasets, we also provide the pretrained basemodel weights and the final CAGNet model weights respectly. You can download them from

BIT dataset BaiduYun password: 4huw
TVHI dataset BaiduYun password: 0oia
CAGNet_bit BaiduYun password: 06j0
CAGNet_tvhi BaiduYun password: 3ii2
Basemodel_bit BaiduYun password: 6j9g
tracklets BaiduYun password: s5en

After downloading these assets, put the model weights in CAGNet/data and extract the datasets to CAGNet/data. The default filenames should work properly. The directory CAGNet/data looks like this

.
├── BIT
│   ├── BIT-anno
│   └── Bit-frames
├── bit.py
├── build_dataset.py
├── CAGNet_bit.pth
├── CAGNet_tvhi.pth
├── highfive
│   ├── frm
│   ├── readme.txt
│   └── tv_human_interaction_annotations
└── tvhi.py

Training

We offer the training code on BIT dataset. To train the model, you should download the pretrained basemodel and put it into data/ fold. Then run the program as follow:

cd cmd/
./train_bit.sh

Note that the default code is run on three GPUs, and you can adjust it in the scripts.

Evaluation

Here we provide evaluation results same as in the paper.

The Evaluation bash scripts are in cmd/.

You can validate the CAGNet model of BIT by

cd cmd/ 
./eval_bit

You can validate the CAGNet model of TVHI by

cd cmd/
./eval_tvhi

Acknowledgement

We implement the factor graph based on FGNN. We would like to express our sincere thanks to the contributors.

Citation

If you find the code useful, please consider citing

@InProceedings{Wang_2021_ICCV,
author = {Wang, Zhenhua and Meng, Jiajun and Guo, Dongyan and Zhang, Jianhua and Javen Shi and Chen, Shengyong},
title = {Consistency-Aware Graph Network for Human Interaction Understanding},
booktitle = {ICCV},
month = {Oct},
year = {2021}
}