Robust 2D and 3D Face Alignment Implemented in MXNet

This repository contains several heatmap-based approaches, such as stacked Hourglass and stacked Scale Aggregation Topology (SAT), for robust 2D and 3D face alignment. Popular building blocks, including the bottleneck residual block, the inception residual block, the parallel and multi-scale (HPM) residual block and the channel aggregation block (CAB), are also provided for constructing the topology of the deep face alignment network. All code in this repo is implemented in Python and MXNet.
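To illustrate what "heatmap-based" means here, the sketch below decodes landmark coordinates from a stack of per-landmark heatmaps by taking the peak of each channel. It is a generic NumPy illustration, not the repository's own decoding code; the function name `decode_heatmaps` and the `(K, H, W)` layout are assumptions made for this example.

```python
import numpy as np


def decode_heatmaps(heatmaps, image_size):
    """Return a (K, 2) array of (x, y) landmark positions in image pixels.

    heatmaps:   array of shape (K, H, W), one channel per landmark (assumed layout).
    image_size: (width, height) of the image the heatmaps were predicted for.
    """
    num_points, height, width = heatmaps.shape
    # Index of the strongest response in each flattened channel.
    flat_index = heatmaps.reshape(num_points, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_index, (height, width))
    # Map heatmap cells back to image pixels.
    scale = np.array([image_size[0] / width, image_size[1] / height])
    return np.stack([xs, ys], axis=1) * scale
```

A network that downsamples its input produces heatmaps smaller than the image, which is why the peak location is rescaled before use.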

The 2D face alignment models are evaluated on the IBUG, COFW and 300W test sets using the normalised mean error (NME). For 3D face alignment, the pre-trained 3D models are compared on AFLW2000-3D with the most recent state-of-the-art methods.

The training/validation datasets and test sets are listed in the table below:

| Data | Download Link | Description |
| --- | --- | --- |
| train_testset2d.zip | BaiduCloud or GoogleDrive, 490M | 2D training/validation dataset and IBUG, COFW, 300W testset |
| train_testset3d.zip | BaiduCloud or GoogleDrive, 1.54G | 3D training/validation dataset and AFLW2000-3D testset |

The performance of the 2D pre-trained models is shown below. Accuracy is reported as the Normalised Mean Error (NME). To facilitate comparison with other methods on these datasets, we report the mean error normalised by the eye centre distance. Each trained model is denoted by Topology^StackBlock (d = DownSampling Steps) - BlockType - OtherParameters.

| Model | Model Size | IBUG | COFW | 300W | Download Link |
| --- | --- | --- | --- | --- | --- |
| Hourglass2(d=4)-Resnet | 26MB | 7.719 | 6.776 | 6.482 | BaiduCloud or GoogleDrive |
| Hourglass2(d=3)-HPM | 38MB | 7.249 | 6.378 | 6.049 | BaiduCloud or GoogleDrive |
| Hourglass2(d=4)-CAB | 46MB | 7.168 | 6.123 | 5.684 | BaiduCloud or GoogleDrive |
| SAT2(d=3)-CAB | 40MB | 7.052 | 5.999 | 5.618 | BaiduCloud or GoogleDrive |
| Hourglass2(d=3)-CAB | 37MB | 6.974 | 5.983 | 5.647 | BaiduCloud or GoogleDrive |
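The eye-centre normalisation used for the 2D results can be sketched as follows. This is a minimal illustration, not the code in test_rec_nme.py: it assumes the standard 68-point markup in which indices 36-41 and 42-47 (zero-based) are the two eyes, and the function name `nme_eye_centre` is hypothetical.

```python
import numpy as np

# Assumed 68-point markup: zero-based indices of the two eye contours.
LEFT_EYE = slice(36, 42)
RIGHT_EYE = slice(42, 48)


def nme_eye_centre(pred, gt):
    """Mean point-to-point error divided by the ground-truth eye centre distance.

    pred, gt: arrays of shape (68, 2) holding (x, y) landmark coordinates.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    per_point_error = np.linalg.norm(pred - gt, axis=1)
    left_centre = gt[LEFT_EYE].mean(axis=0)
    right_centre = gt[RIGHT_EYE].mean(axis=0)
    eye_distance = np.linalg.norm(left_centre - right_centre)
    return per_point_error.mean() / eye_distance
```

The function returns a fraction; multiply by 100 if a percentage is wanted.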

The performances of 3D pre-trained models are shown below. Accuracy is reported as the Normalised Mean Error (NME). The mean error is normalised by the square root of the ground truth bounding box size.

| Model | Model Size | AFLW2000-3D | Download Link |
| --- | --- | --- | --- |
| SAT2(d=3)-CAB-3D | 40MB | 3.072 | BaiduCloud or GoogleDrive |
| Hourglass2(d=3)-CAB-3D | 37MB | 3.005 | BaiduCloud or GoogleDrive |
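The bounding-box normalisation used for the 3D results can be sketched in the same spirit. Two assumptions are made here and may differ from the repository's evaluation script: the ground-truth box is taken as the tight box around the ground-truth landmarks, and only the (x, y) coordinates enter the error. The name `nme_bbox` is hypothetical.

```python
import numpy as np


def nme_bbox(pred, gt):
    """Mean (x, y) error divided by sqrt(width * height) of the ground-truth box.

    pred, gt: arrays of shape (K, 2) or (K, 3); any z column is ignored (assumption).
    """
    pred_xy = np.asarray(pred, dtype=float)[:, :2]
    gt_xy = np.asarray(gt, dtype=float)[:, :2]
    per_point_error = np.linalg.norm(pred_xy - gt_xy, axis=1)
    # Tight box around the ground-truth landmarks (assumed box definition).
    width, height = gt_xy.max(axis=0) - gt_xy.min(axis=0)
    return per_point_error.mean() / np.sqrt(width * height)
```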

Note: More pre-trained models will be added soon.

Environment

This repository has been tested under the following environment:

Installation

  1. Prepare the environment.

  2. Clone the repository.

  3. Run 'make' to build the necessary C++ libraries.

Training

(1) Train stacked Scale Aggregation Topology (SAT) networks with channel aggregation block (CAB).

CUDA_VISIBLE_DEVICES='0' python train.py --network satnet --prefix ./model/model-sat2d3-cab/model --per-batch-size 16 --lr 1e-4 --lr-epoch-step '20,35,45'

(2) Train stacked Hourglass models with parallel and multi-scale (HPM) residual block.

CUDA_VISIBLE_DEVICES='0' python train.py --network hourglass --prefix ./model/model-hg2d3-hpm/model --per-batch-size 16 --lr 1e-4 --lr-epoch-step '20,35,45'
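The '--lr' and '--lr-epoch-step' flags in the commands above suggest a step learning-rate schedule. The sketch below shows how such a schedule is commonly computed. The decay factor of 0.1 and the rule that the rate drops at the start of each listed epoch are assumptions for illustration, not values taken from train.py; both function names are hypothetical.

```python
def parse_epoch_steps(arg):
    """Turn a string such as '20,35,45' into a list of integer epochs."""
    return [int(part) for part in arg.split(",") if part.strip()]


def lr_at_epoch(epoch, base_lr=1e-4, steps=(20, 35, 45), factor=0.1):
    """Learning rate in effect at a given epoch under a step schedule.

    factor is an assumed decay multiplier; check train.py for the real one.
    """
    drops = sum(1 for step in steps if epoch >= step)
    return base_lr * factor ** drops


if __name__ == "__main__":
    steps = parse_epoch_steps("20,35,45")
    for epoch in (0, 19, 20, 35, 45):
        print(epoch, lr_at_epoch(epoch, steps=steps))
```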

Testing

Evaluation

To evaluate pre-trained models on the IBUG, COFW, 300W and AFLW2000-3D test sets, run 'python test_rec_nme.py' to obtain the Normalised Mean Error (NME) on each test set. Some examples are given below.

  1. Evaluate model Hourglass2(d=3)-CAB with 2D landmarks on IBUG testset.
python test_rec_nme.py --dataset ibug --prefix ./models/model-hg2d3-cab/model --epoch 0 --gpu 0 --landmark-type 2d
  2. Evaluate model SAT2(d=3)-CAB with 2D landmarks on COFW testset.
python test_rec_nme.py --dataset cofw_testset --prefix ./models/model-sat2d3-cab/model --epoch 0 --gpu 0 --landmark-type 2d
  3. Evaluate model Hourglass2(d=3)-HPM with 2D landmarks on 300W testset.
python test_rec_nme.py --dataset 300W --prefix ./models/model-hg2d3-hpm/model --epoch 0 --gpu 0 --landmark-type 2d
  4. Evaluate model Hourglass2(d=3)-CAB-3D with 3D landmarks on AFLW2000-3D testset.
python test_rec_nme.py --dataset AFLW2000-3D --prefix ./models/model-hg2d3-cab-3d/model --epoch 0 --gpu 0 --landmark-type 3d
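The example commands above differ only in dataset, model prefix and landmark type, so they can be generated from a small table. The helper below is a convenience sketch, not part of the repository; it uses only the flags and values that appear in the commands above, and the names `EVAL_RUNS` and `build_command` are hypothetical.

```python
import shlex

# (dataset, model prefix, landmark type), copied from the example commands.
EVAL_RUNS = [
    ("ibug", "./models/model-hg2d3-cab/model", "2d"),
    ("cofw_testset", "./models/model-sat2d3-cab/model", "2d"),
    ("300W", "./models/model-hg2d3-hpm/model", "2d"),
    ("AFLW2000-3D", "./models/model-hg2d3-cab-3d/model", "3d"),
]


def build_command(dataset, prefix, landmark_type, epoch=0, gpu=0):
    """Build the argument list for one test_rec_nme.py evaluation run."""
    return [
        "python", "test_rec_nme.py",
        "--dataset", dataset,
        "--prefix", prefix,
        "--epoch", str(epoch),
        "--gpu", str(gpu),
        "--landmark-type", landmark_type,
    ]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run to execute it.
    for run in EVAL_RUNS:
        print(shlex.join(build_command(*run)))
```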

Results

Results of 2D face alignment (predicted by model Hourglass2(d=3)-CAB) are shown below.

<div align=center><img src="https://raw.githubusercontent.com/deepinx/sdu-face-alignment/master/sample-images/landmark_test_2d.png" width="700"/></div>

Results on the AFLW2000-3D dataset (predicted by model Hourglass2(d=3)-CAB-3D) are shown below.

<div align=center><img src="https://raw.githubusercontent.com/deepinx/sdu-face-alignment/master/sample-images/landmark_test_3d.png" width="720"/></div>

License

MIT LICENSE

Reference

@article{guo2018stacked,
  title={Stacked Dense U-Nets with Dual Transformers for Robust Face Alignment},
  author={Guo, Jia and Deng, Jiankang and Xue, Niannan and Zafeiriou, Stefanos},
  journal={arXiv preprint arXiv:1812.01936},
  year={2018}
}

@inproceedings{Deng2018Cascade,
  title={Cascade Multi-View Hourglass Model for Robust 3D Face Alignment},
  author={Deng, Jiankang and Zhou, Yuxiang and Cheng, Shiyang and Zafeiriou, Stefanos},
  booktitle={2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018)},
  pages={399-403},
  year={2018},
}

@article{Bulat2018Hierarchical,
  title={Hierarchical binary CNNs for landmark localization with limited resources},
  author={Bulat, Adrian and Tzimiropoulos, Yorgos},
  journal={IEEE Transactions on Pattern Analysis & Machine Intelligence},
  year={2018},
}

@inproceedings{Jing2017Stacked,
  title={Stacked Hourglass Network for Robust Facial Landmark Localisation},
  author={Jing, Yang and Liu, Qingshan and Zhang, Kaihua},
  booktitle={IEEE Conference on Computer Vision & Pattern Recognition Workshops},
  year={2017},
}

Acknowledgment

The code is adapted from an initial fork of the insightface repository.