Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild

Introduction

This is the code for the paper Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild. We propose a novel facial landmark detector, PIPNet, that is fast, accurate, and robust. PIPNet can be trained under two settings: (1) supervised learning and (2) generalizable semi-supervised learning (GSSL). With GSSL, PIPNet achieves better cross-domain generalization by utilizing massive amounts of unlabeled data across domains.

<img src="images/speed.png" alt="speed" width="640px">

Figure 1. Comparison with existing methods on the speed-accuracy tradeoff, tested on the WFLW full test set (closer to the bottom-right corner is better).

<img src="images/detection_heads.png" alt="det_heads" width="512px">

Figure 2. Comparison of different detection heads.
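The pixel-in-pixel head in Figure 2 predicts, for each landmark, a score on a low-resolution grid plus within-cell x/y offsets, so decoding is just an argmax over the grid followed by an offset correction. A minimal NumPy sketch of this decoding step (function and argument names are ours, not the repo's, and the net stride of 32 is an assumption):

```python
import numpy as np

def decode_pip(score_map, offset_x, offset_y, stride=32):
    """Decode one landmark from a PIP-style head output.

    score_map, offset_x, offset_y: (H, W) arrays for a single landmark.
    Returns (x, y) in input-image pixel coordinates.
    """
    h, w = score_map.shape
    idx = np.argmax(score_map)
    gy, gx = divmod(idx, w)                # grid cell with the highest score
    x = (gx + offset_x[gy, gx]) * stride   # add sub-cell offset, map to pixels
    y = (gy + offset_y[gy, gx]) * stride
    return x, y
```

Because the offsets refine the coarse cell location, the head avoids both the upsampling layers of heatmap regression and the weak spatial grounding of direct regression.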

Installation

  1. Install Python 3 and PyTorch (>= v1.1).
  2. Clone this repository:
     ```
     git clone https://github.com/jhb86253817/PIPNet.git
     ```
  3. Install the dependencies listed in requirements.txt:
     ```
     pip install -r requirements.txt
     ```

Demo

  1. We use a modified version of FaceBoxes as the face detector. Go to the folder FaceBoxesV2/utils and run `sh make.sh` to build the NMS extension.
  2. Back in the folder PIPNet, create two empty folders, logs and snapshots. For PIPNets, you can download our trained models from here and put them under the folder snapshots/DATA_NAME/EXPERIMENT_NAME/.
  3. Edit run_demo.sh to choose the config file and input source you want, then run `sh run_demo.sh`. Image, video, and camera inputs are supported. Some sample predictions can be seen as follows.
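Under the hood, the demo crops each face detected by FaceBoxes (with some margin) and resizes it to the network input size before running PIPNet. A dependency-light sketch of that cropping step (not the repo's code, which uses OpenCV; box format (x1, y1, x2, y2) and nearest-neighbor resizing are our assumptions):

```python
import numpy as np

def crop_face(image, box, margin=0.1, size=256):
    """Crop a detected face with a relative margin, clamp to image bounds,
    and resize to (size, size) via nearest-neighbor index sampling.

    image: (H, W, C) array; box: (x1, y1, x2, y2) in pixels.
    """
    h, w = image.shape[:2]
    x1, y1, x2, y2 = box
    mx = int((x2 - x1) * margin)           # horizontal margin in pixels
    my = int((y2 - y1) * margin)           # vertical margin in pixels
    x1, y1 = max(x1 - mx, 0), max(y1 - my, 0)
    x2, y2 = min(x2 + mx, w), min(y2 + my, h)
    crop = image[y1:y2, x1:x2]
    ys = np.arange(size) * crop.shape[0] // size   # nearest source rows
    xs = np.arange(size) * crop.shape[1] // size   # nearest source cols
    return crop[ys][:, xs]
```

The predicted landmarks are then mapped back to the original image by undoing this crop-and-resize transform.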

Training

Supervised Learning

Datasets: 300W, COFW, WFLW, AFLW, LaPa

  1. Download the datasets from official sources, then put them under folder data. The folder structure should look like this:
```
PIPNet
-- FaceBoxesV2
-- lib
-- experiments
-- logs
-- snapshots
-- data
   |-- data_300W
       |-- afw
       |-- helen
       |-- ibug
       |-- lfpw
   |-- COFW
       |-- COFW_train_color.mat
       |-- COFW_test_color.mat
   |-- WFLW
       |-- WFLW_images
       |-- WFLW_annotations
   |-- AFLW
       |-- flickr
       |-- AFLWinfo_release.mat
   |-- LaPa
       |-- train
       |-- val
       |-- test
```
  2. Go to the folder lib and preprocess a dataset by running `python preprocess.py DATA_NAME`. For example, to process 300W:
     ```
     python preprocess.py data_300W
     ```
  3. Back in the folder PIPNet, edit run_train.sh to choose the config file you want, then train the model by running:
     ```
     sh run_train.sh
     ```
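Preprocessing fails in confusing ways when a dataset folder is misplaced, so it can help to verify the layout above first. A small helper for that (not part of the repo):

```python
from pathlib import Path

# Expected entries under data/ for the supervised setting (from the tree above).
EXPECTED = {
    "data_300W": ["afw", "helen", "ibug", "lfpw"],
    "COFW": ["COFW_train_color.mat", "COFW_test_color.mat"],
    "WFLW": ["WFLW_images", "WFLW_annotations"],
    "AFLW": ["flickr", "AFLWinfo_release.mat"],
    "LaPa": ["train", "val", "test"],
}

def missing_entries(data_root):
    """Return the dataset files/folders that are absent under data_root."""
    root = Path(data_root)
    return [f"{name}/{entry}"
            for name, entries in EXPECTED.items()
            for entry in entries
            if not (root / name / entry).exists()]
```

Running `missing_entries("data")` from the PIPNet root before preprocessing lists anything still to be downloaded; you only need the datasets you actually plan to train on.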

Generalizable Semi-supervised Learning

Datasets: 300W, COFW, WFLW, CelebA (unlabeled)

  1. Download 300W, COFW, and WFLW as in the supervised learning setting. Download the COFW-68 test annotations from here. For 300W+CelebA, you also need to download the in-the-wild CelebA images from here, along with the face bounding boxes detected by us. The folder structure should look like this:
```
PIPNet
-- FaceBoxesV2
-- lib
-- experiments
-- logs
-- snapshots
-- data
   |-- data_300W
       |-- afw
       |-- helen
       |-- ibug
       |-- lfpw
   |-- COFW
       |-- COFW_train_color.mat
       |-- COFW_test_color.mat
   |-- WFLW
       |-- WFLW_images
       |-- WFLW_annotations
   |-- data_300W_COFW_WFLW
       |-- cofw68_test_annotations
       |-- cofw68_test_bboxes.mat
   |-- CELEBA
       |-- img_celeba
       |-- celeba_bboxes.txt
   |-- data_300W_CELEBA
       |-- cofw68_test_annotations
       |-- cofw68_test_bboxes.mat
```
  2. Go to the folder lib and preprocess a dataset by running `python preprocess_gssl.py DATA_NAME`. To process data_300W_COFW_WFLW, run:
     ```
     python preprocess_gssl.py data_300W_COFW_WFLW
     ```
     To process data_300W_CELEBA, run:
     ```
     python preprocess_gssl.py CELEBA
     ```
     and then:
     ```
     python preprocess_gssl.py data_300W_CELEBA
     ```
  3. Back in the folder PIPNet, edit run_train.sh to choose the config file you want, then train the model by running:
     ```
     sh run_train.sh
     ```
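GSSL trains on the labeled source data together with unlabeled cross-domain images. The exact curriculum lives in the training code; as a generic illustration of the underlying idea only (not the repo's implementation), one self-training round pseudo-labels the unlabeled pool and keeps the confident predictions:

```python
def self_train_round(predict, labeled, unlabeled, threshold=0.9):
    """One generic self-training round (illustrative, hypothetical API).

    predict(x) -> (label, confidence); labeled is a list of (x, label).
    Returns the labeled set augmented with confident pseudo-labels.
    """
    pseudo = []
    for x in unlabeled:
        label, conf = predict(x)
        if conf >= threshold:           # keep only confident pseudo-labels
            pseudo.append((x, label))
    return labeled + pseudo
```

The benefit in the cross-domain setting is that the unlabeled pool can come from a different distribution than the labeled set, which is what lets the model generalize beyond its source domain.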

Evaluation

  1. Edit run_test.sh to choose the config file you want, then test the model by running:
     ```
     sh run_test.sh
     ```
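The metric reported on these benchmarks is the normalized mean error (NME). A minimal sketch, assuming inter-ocular normalization (the normalizing distance varies by dataset; e.g. AFLW conventionally uses face size instead):

```python
import numpy as np

def nme(pred, gt, left_eye, right_eye):
    """Normalized mean error: average point-to-point distance between
    predicted and ground-truth landmarks, divided by the distance
    between the two eye landmark indices (inter-ocular normalization)."""
    norm = np.linalg.norm(gt[left_eye] - gt[right_eye])
    per_point = np.linalg.norm(pred - gt, axis=1)
    return float(per_point.mean() / norm)
```

Lower is better; the speed-accuracy comparison in Figure 1 plots this NME on the WFLW full test set.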

Citation

```
@article{JLS21,
  title={Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild},
  author={Haibo Jin and Shengcai Liao and Ling Shao},
  journal={International Journal of Computer Vision},
  publisher={Springer Science and Business Media LLC},
  ISSN={1573-1405},
  url={http://dx.doi.org/10.1007/s11263-021-01521-4},
  DOI={10.1007/s11263-021-01521-4},
  year={2021},
  month={Sep}
}
```

Acknowledgement

We thank the following great works: