
headpose-fsanet-pytorch

PyTorch implementation of FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation from a Single Image<sup>2</sup>.

Demo

A video file or a camera index can be passed to the demo script. If no argument is given, the default camera index is used.

Video File Usage

For any video format that OpenCV supports (mp4, avi, etc.):

python3 demo.py --video /path/to/video.mp4

Camera Usage

python3 demo.py --cam 0
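The argument handling described above could be sketched as follows. This is a minimal illustration, not the actual contents of demo.py: the function name `resolve_source`, the constant `DEFAULT_CAM_INDEX = 0`, and the choice to make the two flags mutually exclusive are all assumptions. The returned value is the kind of argument that `cv2.VideoCapture` accepts (a file path or an integer device index).

```python
import argparse
from typing import List, Optional, Union

# Assumption: the README only says "default camera index"; 0 is a guess.
DEFAULT_CAM_INDEX = 0


def resolve_source(argv: Optional[List[str]] = None) -> Union[str, int]:
    """Return a video path (str) or a camera index (int) from CLI arguments."""
    parser = argparse.ArgumentParser(description="Head pose demo (sketch)")
    # Assumption: a video file and a camera are not used at the same time.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--video", type=str, default=None,
                       help="path to a video file readable by OpenCV")
    group.add_argument("--cam", type=int, default=None,
                       help="camera device index")
    args = parser.parse_args(argv)

    if args.video is not None:
        return args.video
    if args.cam is not None:
        return args.cam
    # Neither flag given: fall back to the default camera.
    return DEFAULT_CAM_INDEX


if __name__ == "__main__":
    source = resolve_source()
    # The result could then be handed to OpenCV, e.g. cv2.VideoCapture(source).
    print("Capture source:", source)
```

Returning a single value keeps the capture code independent of which flag was used.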

Results

Model                   Dataset Type   Yaw (MAE)   Pitch (MAE)   Roll (MAE)
FSA-Caps (1x1)          1              4.85        6.27          4.96
FSA-Caps (Var)          1              5.06        6.46          5.00
FSA-Caps (1x1 + Var)    1              4.64        6.10          4.79

Note: My results are slightly worse than the original author's. For the best results, please refer to the official repository<sup>1</sup>.
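For readers unfamiliar with the metric, MAE in the table above is the mean absolute error computed separately for yaw, pitch, and roll. The sketch below shows that computation in plain Python. The function name `per_angle_mae` and the (yaw, pitch, roll) tuple layout are assumptions made for illustration, and the sketch does no angle wrap-around handling; it is not the evaluation code of this repository.

```python
from typing import Sequence, Tuple

Angles = Tuple[float, float, float]  # assumed order: (yaw, pitch, roll)


def per_angle_mae(pred: Sequence[Angles], true: Sequence[Angles]) -> Angles:
    """Mean absolute error per angle over a set of samples."""
    if len(pred) != len(true) or len(pred) == 0:
        raise ValueError("pred and true must be non-empty and of equal length")

    sums = [0.0, 0.0, 0.0]
    for p, t in zip(pred, true):
        for i in range(3):
            # Accumulate the absolute difference for yaw, pitch, roll in turn.
            sums[i] += abs(p[i] - t[i])

    n = len(pred)
    return (sums[0] / n, sums[1] / n, sums[2] / n)


if __name__ == "__main__":
    predictions = [(10.0, 0.0, -5.0), (0.0, 4.0, 5.0)]
    ground_truth = [(8.0, 0.0, -5.0), (0.0, 0.0, 1.0)]
    yaw, pitch, roll = per_angle_mae(predictions, ground_truth)
    print(f"Yaw MAE: {yaw:.2f}  Pitch MAE: {pitch:.2f}  Roll MAE: {roll:.2f}")
```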

Dependencies

Name                      Version 
python                    3.7.6
numpy                     1.18.5
opencv                    4.2.0
scipy                     1.5.0
matplotlib-base           3.2.2
pytorch                   1.5.1
torchvision               0.6.1
onnx                      1.7.0
onnxruntime               1.2.0

Installation with pip

pip3 install -r requirements.txt

You may also need to install Jupyter to open the notebooks (.ipynb). Using Anaconda to install the packages is recommended.

The code has been tested on Ubuntu 18.04.

Important Files Overview

Download Dataset

For model training and testing, download the preprocessed dataset from the author's official Git repository<sup>1</sup> and place it inside the data/ directory. I use only the type1 data for training and testing. Your dataset hierarchy should look like this:

data/
  type1/
    test/
      AFLW2000.npz
    train/
      AFW.npz
      AFW_Flip.npz
      HELEN.npz
      HELEN_Flip.npz
      IBUG.npz
      IBUG_Flip.npz
      LFPW.npz
      LFPW_Flip.npz
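Before training, it can help to confirm that the files are where the hierarchy above expects them. The following is a small sketch using only the standard library. The helper name `missing_dataset_files` is hypothetical and not part of this repository; the file list is copied from the layout above.

```python
from pathlib import Path
from typing import List

# Relative paths under data/type1/, taken from the hierarchy shown above.
EXPECTED_FILES = [
    "test/AFLW2000.npz",
    "train/AFW.npz",
    "train/AFW_Flip.npz",
    "train/HELEN.npz",
    "train/HELEN_Flip.npz",
    "train/IBUG.npz",
    "train/IBUG_Flip.npz",
    "train/LFPW.npz",
    "train/LFPW_Flip.npz",
]


def missing_dataset_files(root: str = "data") -> List[str]:
    """Return the expected type1 files that are absent under `root`."""
    base = Path(root) / "type1"
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files("data")
    if missing:
        print("Missing dataset files:")
        for rel in missing:
            print("  data/type1/" + rel)
    else:
        print("All expected type1 files are present.")
```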

License

Copyright (c) 2020, Omar Hassan. (MIT License)

Acknowledgements

Special thanks to Mr. Tsun-Yi Yang for providing excellent code for his paper. Please refer to the official repository for detailed information and the best results for the model:

[1] T. Yang, FSA-Net, (2019), GitHub repository

The models are trained and tested on various public datasets that have their own licenses. Please review those licenses before using the code.

References

[2] T. Yang, Y. Chen, Y. Lin and Y. Chuang, "FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation From a Single Image," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 1087-1096, doi: 10.1109/CVPR.2019.00118.

[3] Tal Hassner, Shai Harel, Eran Paz, and Roee Enbar. Effective face frontalization in unconstrained images. In CVPR, 2015.

[4] Xiangyu Zhu, Zhen Lei, Junjie Yan, Dong Yi, and Stan Z. Li. High-fidelity pose and expression normalization for face recognition in the wild. In CVPR, 2015.