[WACV 2024] MotionAGFormer: Enhancing 3D Human Pose Estimation With a Transformer-GCNFormer Network
<a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch&logoColor=white"></a> <a href="https://youtu.be/iyLhxPjwBuQ?si=yoG-wlz7N1fq-PmY"><img alt="Paper Explanation" src="https://img.shields.io/badge/-Paper Explanation in 9 Minutes-ea3323?logo=youtube"></a>
This is the official PyTorch implementation of the paper "MotionAGFormer: Enhancing 3D Human Pose Estimation With a Transformer-GCNFormer Network" (WACV 2024).
Environment
The project is developed under the following environment:
- Python 3.8.10
- PyTorch 2.0.0
- CUDA 12.2
To install the project dependencies, run:
pip install -r requirements.txt
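If you want to verify that your local setup matches the versions listed above before training, a minimal check along these lines can help (`check_env.py` is a hypothetical helper, not part of the repository):

```python
# check_env.py -- hypothetical sanity check; compares the local setup
# against the versions the project was developed with.
import sys
import torch

print(f"Python : {sys.version.split()[0]}  (developed with 3.8.10)")
print(f"PyTorch: {torch.__version__}  (developed with 2.0.0)")
print(f"CUDA   : {torch.version.cuda}  (developed with 12.2)")
print(f"CUDA available: {torch.cuda.is_available()}")
```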
Dataset
Human3.6M
Preprocessing
- Download the fine-tuned Stacked Hourglass detections of MotionBERT's preprocessed H3.6M data here and unzip it to 'data/motion3d'.
- Slice the motion clips by running the following Python script in the `data/preprocess` directory (a batch sketch that runs all three settings follows the commands below):
For MotionAGFormer-Base and MotionAGFormer-Large:
python h36m.py --n-frames 243
For MotionAGFormer-Small:
python h36m.py --n-frames 81
For MotionAGFormer-XSmall:
python h36m.py --n-frames 27
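If you need clips for all three model sizes, a small wrapper that runs `h36m.py` once per clip length can be used (`slice_all.py` is a hypothetical convenience script, to be run from `data/preprocess`):

```python
# slice_all.py -- hypothetical wrapper; run from data/preprocess.
# Invokes h36m.py once per clip length used by the different model sizes.
import subprocess
import sys

FRAME_COUNTS = [243, 81, 27]  # Base/Large, Small, XSmall

for n_frames in FRAME_COUNTS:
    print(f"Slicing Human3.6M clips with --n-frames {n_frames} ...")
    subprocess.run([sys.executable, "h36m.py", "--n-frames", str(n_frames)], check=True)
```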
Visualization
Run the following command in the `data/preprocess` directory (it expects 243 frames):
python visualize.py --dataset h36m --sequence-number <AN ARBITRARY NUMBER>
This should create a GIF file named `h36m_pose<SEQ_NUMBER>.gif` within the `data` directory.
MPI-INF-3DHP
Preprocessing
Please refer to P-STMO for dataset setup. After preprocessing, the generated .npz files (`data_train_3dhp.npz` and `data_test_3dhp.npz`) should be located in the `data/motion3d` directory.
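To confirm the files ended up where the training code expects them, a quick check like the following can be used (a hypothetical `check_3dhp.py`; the array names inside the archives depend on the P-STMO preprocessing):

```python
# check_3dhp.py -- hypothetical sanity check; run from the repository root.
# Verifies the P-STMO-generated .npz files exist and lists the arrays they contain.
from pathlib import Path
import numpy as np

for name in ["data_train_3dhp.npz", "data_test_3dhp.npz"]:
    path = Path("data/motion3d") / name
    if not path.exists():
        raise FileNotFoundError(f"Expected {path}; rerun the P-STMO preprocessing.")
    with np.load(path, allow_pickle=True) as data:
        print(f"{path}: arrays = {data.files}")
```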
Visualization
Run it the same way as the Human3.6M visualization, but set `--dataset` to `mpi`.
Training
After dataset preparation, you can train the model as follows:
Human3.6M
You can train on Human3.6M with the following command:
python train.py --config <PATH-TO-CONFIG>
where the config files are located in `configs/h36m`. You can also use Weights & Biases (wandb) to log the training and validation error by adding `--use-wandb` at the end. When using it, you can set the run name with `--wandb-name`, e.g.:
python train.py --config configs/h36m/MotionAGFormer-base.yaml --use-wandb --wandb-name MotionAGFormer-base
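If you want to sweep every Human3.6M config, a small launcher along these lines works (a hypothetical `train_all_h36m.py`; it assumes you want Weights & Biases logging and names each run after its config file):

```python
# train_all_h36m.py -- hypothetical launcher; runs train.py once per config in configs/h36m.
import subprocess
import sys
from pathlib import Path

for cfg in sorted(Path("configs/h36m").glob("*.yaml")):
    run_name = cfg.stem  # e.g. "MotionAGFormer-base"
    subprocess.run(
        [sys.executable, "train.py", "--config", str(cfg),
         "--use-wandb", "--wandb-name", run_name],
        check=True,
    )
```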
MPI-INF-3DHP
You can train on MPI-INF-3DHP with the following command:
python train_3dhp.py --config <PATH-TO-CONFIG>
where the config files are located in `configs/mpi`. As with Human3.6M, Weights & Biases can be used for logging.
Evaluation
Method | # frames | # Params | # MACs | H3.6M weights | MPI-INF-3DHP weights |
---|---|---|---|---|---|
MotionAGFormer-XS | 27 | 2.2M | 1.0G | download | download |
MotionAGFormer-S | 81 | 4.8M | 6.6G | download | download |
MotionAGFormer-B | 243 / 81 | 11.7M | 48.3G / 16G | download | download |
MotionAGFormer-L | 243 / 81 | 19.0M | 78.3G / 26G | download | download |

For MotionAGFormer-B and MotionAGFormer-L, the paired values refer to the Human3.6M (243 frames) and MPI-INF-3DHP (81 frames) settings, respectively.
After downloading the weights from the table above, you can evaluate the Human3.6M models with:
python train.py --eval-only --checkpoint <CHECKPOINT-DIRECTORY> --checkpoint-file <CHECKPOINT-FILE-NAME> --config <PATH-TO-CONFIG>
For example, if the MotionAGFormer-L checkpoint for Human3.6M is downloaded and placed in the `checkpoint` directory, we can run:
python train.py --eval-only --checkpoint checkpoint --checkpoint-file motionagformer-l-h36m.pth.tr --config configs/h36m/MotionAGFormer-large.yaml
Similarly, MPI-INF-3DHP can be evaluated as follows:
python train_3dhp.py --eval-only --checkpoint <CHECKPOINT-DIRECTORY> --checkpoint-file <CHECKPOINT-FILE-NAME> --config <PATH-TO-CONFIG>
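As a quick cross-check against the # Params column above, you can count the parameters in a downloaded checkpoint. The sketch below is hypothetical and assumes the file is either a bare state dict or wraps one under a key such as `model`; adjust the unwrapping if the keys differ:

```python
# count_params.py -- hypothetical check; usage: python count_params.py <checkpoint-file>
# The checkpoint layout is an assumption: adjust the unwrapping if the keys differ.
import sys
import torch

ckpt = torch.load(sys.argv[1], map_location="cpu")
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
n_params = sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))
print(f"{sys.argv[1]}: {n_params / 1e6:.1f}M parameters")
```

For example, `python count_params.py checkpoint/motionagformer-l-h36m.pth.tr` should report a value close to the 19.0M listed above, give or take buffers and any auxiliary tensors stored in the file.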
Demo
Our demo is a modified version of the one provided by the MHFormer repository. First, download the YOLOv3 and HRNet pretrained models here and put them in the './demo/lib/checkpoint' directory. Next, download our base model checkpoint from here and put it in the './checkpoint' directory. Then, put your in-the-wild videos in the './demo/video' directory.
Run the command below:
python demo/vis.py --video sample_video.mp4
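To process every clip in './demo/video' in one go, a loop like this can be used (a hypothetical `run_demo_batch.py`, run from the repository root; it calls `vis.py` exactly as in the command above, once per file):

```python
# run_demo_batch.py -- hypothetical wrapper; runs demo/vis.py for each video in demo/video.
import subprocess
import sys
from pathlib import Path

for video in sorted(Path("demo/video").glob("*.mp4")):
    print(f"Running demo on {video.name} ...")
    subprocess.run([sys.executable, "demo/vis.py", "--video", video.name], check=True)
```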
Sample demo output:
<p align="center"><img src="figure/sample_video.gif" width="60%" alt="" /></p>

Acknowledgement
Our code refers to the following repositories:
We thank the authors for releasing their code.
Citation
If you find our work useful for your project, please consider citing the paper:
@inproceedings{motionagformer2024,
title = {MotionAGFormer: Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network},
author = {Soroush Mehraban and Vida Adeli and Babak Taati},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2024}
}