[WACV 2024] MotionAGFormer: Enhancing 3D Human Pose Estimation With a Transformer-GCNFormer Network

<a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch&logoColor=white"></a> <a href="https://youtu.be/iyLhxPjwBuQ?si=yoG-wlz7N1fq-PmY"><img alt="Paper Explanation" src="https://img.shields.io/badge/-Paper Explanation in 9 Minutes-ea3323?logo=youtube"></a>

This is the official PyTorch implementation of the paper "MotionAGFormer: Enhancing 3D Human Pose Estimation With a Transformer-GCNFormer Network" (WACV 2024).

Environment

The project is developed under the following environment:

For installation of the project dependencies, please run:

pip install -r requirements.txt
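To sanity-check the install afterwards, a quick check along these lines (not part of the repository) confirms that PyTorch imports and sees the GPU:

```python
# Quick environment sanity check (not part of the repository).
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```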

Dataset

Human3.6M

Preprocessing

  1. Download the fine-tuned Stacked Hourglass detections of MotionBERT's preprocessed H3.6M data here and unzip it to 'data/motion3d'.
  2. Slice the motion clips by running one of the following commands in the data/preprocess directory (see the illustrative sketch after these commands):

For MotionAGFormer-Base and MotionAGFormer-Large:

python h36m.py --n-frames 243

For MotionAGFormer-Small:

python h36m.py --n-frames 81

For MotionAGFormer-XSmall:

python h36m.py --n-frames 27
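As a rough illustration of what the slicing step does (a simplified sketch, not the repository's h36m.py; the array shapes and non-overlapping windows are assumptions), a long pose sequence is cut into fixed-length clips:

```python
# Illustrative sketch only -- not the repository's h36m.py.
# Cuts a long pose sequence into non-overlapping fixed-length clips.
import numpy as np

def slice_clips(sequence: np.ndarray, n_frames: int) -> np.ndarray:
    """sequence: (T, J, C) per-frame joints; returns (N, n_frames, J, C)."""
    n_clips = len(sequence) // n_frames
    return sequence[: n_clips * n_frames].reshape(n_clips, n_frames, *sequence.shape[1:])

poses = np.random.randn(1000, 17, 3)  # dummy stand-in for one H3.6M sequence
print(slice_clips(poses, 243).shape)  # (4, 243, 17, 3)
```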

Visualization

Run the following command in the data/preprocess directory (it expects 243 frames):

python visualize.py --dataset h36m --sequence-number <AN ARBITRARY NUMBER>

This should create a GIF file named h36m_pose<SEQ_NUMBER>.gif in the data directory.

MPI-INF-3DHP

Preprocessing

Please refer to P-STMO for dataset setup. After preprocessing, the generated .npz files (data_train_3dhp.npz and data_test_3dhp.npz) should be placed in the data/motion3d directory.
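Once both files are in place, a quick check like the one below can verify they load (a minimal sketch; the key names inside the archives depend on the P-STMO preprocessing):

```python
# Minimal check that the preprocessed MPI-INF-3DHP archives are readable.
import numpy as np

for split in ("train", "test"):
    path = f"data/motion3d/data_{split}_3dhp.npz"
    archive = np.load(path, allow_pickle=True)
    print(path, "->", list(archive.keys()))  # key names depend on the P-STMO preprocessing
```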

Visualization

Run the same command as in the Human3.6M visualization, but set --dataset to mpi.
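For example:

python visualize.py --dataset mpi --sequence-number <AN ARBITRARY NUMBER>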

Training

After dataset preparation, you can train the model as follows:

Human3.6M

You can train on Human3.6M with the following command:

python train.py --config <PATH-TO-CONFIG>

where the config files are located in configs/h36m. You can also use Weights & Biases to log the training and validation error by adding --use-wandb at the end. When using it, you can set the run name with --wandb-name, e.g.:

python train.py --config configs/h36m/MotionAGFormer-base.yaml --use-wandb --wandb-name MotionAGFormer-base

MPI-INF-3DHP

You can train on MPI-INF-3DHP with the following command:

python train_3dhp.py --config <PATH-TO-CONFIG>

where the config files are located in configs/mpi. As with Human3.6M, Weights & Biases can be used for logging.

Evaluation

| Method | # frames | # Params | # MACs | H3.6M weights | MPI-INF-3DHP weights |
|--------|----------|----------|--------|---------------|----------------------|
| MotionAGFormer-XS | 27 | 2.2M | 1.0G | download | download |
| MotionAGFormer-S | 81 | 4.8M | 6.6G | download | download |
| MotionAGFormer-B | 243 / 81 | 11.7M | 48.3G / 16G | download | download |
| MotionAGFormer-L | 243 / 81 | 19.0M | 78.3G / 26G | download | download |

After downloading the weights from the table above, you can evaluate the Human3.6M models with:

python train.py --eval-only --checkpoint <CHECKPOINT-DIRECTORY> --checkpoint-file <CHECKPOINT-FILE-NAME> --config <PATH-TO-CONFIG>

For example, if the MotionAGFormer-L weights for H3.6M are downloaded and placed in the checkpoint directory, we can run:

python train.py --eval-only --checkpoint checkpoint --checkpoint-file motionagformer-l-h36m.pth.tr --config configs/h36m/MotionAGFormer-large.yaml
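To cross-check the "# Params" column against a downloaded checkpoint, a rough sketch like this can help; note that the checkpoint's key layout is an assumption (weights are often nested under a "model" key):

```python
# Rough sketch: count parameters in a downloaded checkpoint.
# Assumption: the file is a dict, possibly nesting weights under "model".
import torch

state = torch.load("checkpoint/motionagformer-l-h36m.pth.tr", map_location="cpu")
weights = state.get("model", state) if isinstance(state, dict) else state
n_params = sum(v.numel() for v in weights.values() if torch.is_tensor(v))
print(f"{n_params / 1e6:.1f}M parameters")
```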

Similarly, MPI-INF-3DHP can be evaluated as follows:

python train_3dhp.py --eval-only --checkpoint <CHECKPOINT-DIRECTORY> --checkpoint-file <CHECKPOINT-FILE-NAME> --config <PATH-TO-CONFIG>

Demo

Our demo is a modified version of the one provided by the MHFormer repository. First, download the YOLOv3 and HRNet pretrained models here and put them in the './demo/lib/checkpoint' directory. Next, download our base model checkpoint from here and put it in the './checkpoint' directory. Then, put your in-the-wild videos in the './demo/video' directory.

Run the command below:

python demo/vis.py --video sample_video.mp4
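To run the demo over several videos at once, a small driver like the following works (a convenience sketch; it assumes vis.py accepts bare file names, as in the sample command above):

```python
# Convenience sketch: run the demo on every video in ./demo/video.
import pathlib
import subprocess

for video in sorted(pathlib.Path("demo/video").glob("*.mp4")):
    subprocess.run(["python", "demo/vis.py", "--video", video.name], check=True)
```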

Sample demo output:

<p align="center"><img src="figure/sample_video.gif" width="60%" alt="" /></p>

Acknowledgement

Our code refers to the following repositories:

We thank the authors for releasing their code.

Citation

If you find our work useful for your project, please consider citing the paper:

@inproceedings{motionagformer2024,
  title     =   {MotionAGFormer: Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network}, 
  author    =   {Soroush Mehraban and Vida Adeli and Babak Taati},
  booktitle =   {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      =   {2024}
}