SegNeXt-FaceParser

<img src="./demo/00475.png" width="200" height="200"><img src="./demo/00475_seg_vis.png" width="200" height="200"><img src="./demo/00476.png" width="200" height="200"><img src="./demo/00476_seg_vis.png" width="200" height="200">

The repository contains a PyTorch implementation of a pre-trained face parser based on SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation (NeurIPS 2022).

The code is based on SegNext-official-repo. We use CelebAMask-HQ as the training data.

Results

Note: ImageNet pre-trained models can be found on TsingHua Cloud.

Rank 1 on Pascal VOC dataset: Leaderboard

CelebAMaskHQ

Images 0 to 27999 are used for training and images 28000 to 29999 for validation.
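The split above can be expressed as a small lookup. This is only a sketch: the helper name `split_of` and the two range constants are hypothetical and not part of this repository, and it assumes images are identified by their integer index.

```python
# Hypothetical helper, not part of this repo. Assumes each CelebAMask-HQ
# image is identified by an integer index in 0..29999.
TRAIN_RANGE = range(0, 28000)      # images 0 to 27999
VAL_RANGE = range(28000, 30000)    # images 28000 to 29999


def split_of(image_index: int) -> str:
    """Return 'train' or 'val' for a CelebAMask-HQ image index."""
    if image_index in TRAIN_RANGE:
        return "train"
    if image_index in VAL_RANGE:
        return "val"
    raise ValueError(f"index {image_index} is outside 0..29999")


if __name__ == "__main__":
    print(len(TRAIN_RANGE), len(VAL_RANGE))
    print(split_of(27999), split_of(28000))
```

With this split, 28,000 images go to training and 2,000 to validation.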

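| Method | Backbone | Pretrained | Iters | mIoU (ss) | Params | FLOPs | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SegNeXt | MSCAN-T | IN-1K | 160K | 77.86 | 4M | 7G | config | Google Drive |
| SegNeXt | MSCAN-S | IN-1K | 160K | 78.19 | 14M | 16G | config | Google Drive |
| SegNeXt | MSCAN-B | IN-1K | 160K | 78.97 | 28M | 35G | config | Google Drive |
| SegNeXt | MSCAN-L | IN-1K | 160K | 79.34 | 49M | 70G | config | Google Drive |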

ADE20K

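| Method | Backbone | Pretrained | Iters | mIoU (ss/ms) | Params | FLOPs | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SegNeXt | MSCAN-T | IN-1K | 160K | 41.1/42.2 | 4M | 7G | config | TsingHua Cloud |
| SegNeXt | MSCAN-S | IN-1K | 160K | 44.3/45.8 | 14M | 16G | config | TsingHua Cloud |
| SegNeXt | MSCAN-B | IN-1K | 160K | 48.5/49.9 | 28M | 35G | config | TsingHua Cloud |
| SegNeXt | MSCAN-L | IN-1K | 160K | 51.0/52.1 | 49M | 70G | config | TsingHua Cloud |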

Cityscapes

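| Method | Backbone | Pretrained | Iters | mIoU (ss/ms) | Params | FLOPs | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SegNeXt | MSCAN-T | IN-1K | 160K | 79.8/81.4 | 4M | 56G | config | TsingHua Cloud |
| SegNeXt | MSCAN-S | IN-1K | 160K | 81.3/82.7 | 14M | 125G | config | TsingHua Cloud |
| SegNeXt | MSCAN-B | IN-1K | 160K | 82.6/83.8 | 28M | 276G | config | TsingHua Cloud |
| SegNeXt | MSCAN-L | IN-1K | 160K | 83.2/83.9 | 49M | 578G | config | TsingHua Cloud |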

Note: FLOPs (G) are computed with torchprofile (recommended for its accurate, automatic MACs/FLOPs statistics) at an input size of 512 $\times$ 512 for ADE20K and 2048 $\times$ 1024 for Cityscapes.
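The two input sizes explain most of the gap between the ADE20K and Cityscapes FLOPs columns. As a rough sanity check, the sketch below assumes that the cost of a fully convolutional model grows about linearly with the number of input pixels. That linear scaling is an assumption, not something this README states. The function name `pixel_ratio` is hypothetical; the FLOPs figures are copied from the tables above.

```python
def pixel_ratio(size_a, size_b):
    """Ratio of pixel counts between two (width, height) input sizes."""
    return (size_a[0] * size_a[1]) / (size_b[0] * size_b[1])


ADE20K_INPUT = (512, 512)
CITYSCAPES_INPUT = (2048, 1024)

# FLOPs (G) reported in the ADE20K table (512 x 512 input).
ade20k_gflops = {"MSCAN-T": 7, "MSCAN-S": 16, "MSCAN-B": 35, "MSCAN-L": 70}
# FLOPs (G) reported in the Cityscapes table (2048 x 1024 input).
cityscapes_gflops = {"MSCAN-T": 56, "MSCAN-S": 125, "MSCAN-B": 276, "MSCAN-L": 578}

ratio = pixel_ratio(CITYSCAPES_INPUT, ADE20K_INPUT)

if __name__ == "__main__":
    print(f"pixel ratio: {ratio}")
    for name, gflops in ade20k_gflops.items():
        # Assumption: cost scales linearly with pixel count.
        estimate = gflops * ratio
        print(f"{name}: estimated {estimate:.0f}G, reported {cityscapes_gflops[name]}G")
```

The estimates only approximate the reported values, since the tabulated ADE20K figures are rounded and the two models differ in their number of output classes.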

Installation

Install the dependencies and download ADE20K according to the guidelines in MMSegmentation. The code is based on MMSegmentation-v0.24.1.

pip install openmim
mim install mmcv-full==1.5.1 mmcls==0.20.1
cd SegNeXt-FaceParser
python setup.py develop

Training

We use 8 GPUs for training by default. Run:

./tools/dist_train.sh /path/to/config 8

Evaluation

To evaluate the model, run:

./tools/dist_test.sh /path/to/config /path/to/checkpoint_file 8 --eval mIoU
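For readers unfamiliar with the metric requested by `--eval mIoU`, here is a minimal pure-Python sketch of mean intersection-over-union on flat label lists. It illustrates the definition only and is not MMSegmentation's implementation, which accumulates statistics over the whole dataset and handles ignored labels. The function name `mean_iou` is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in either pred or target.

    pred and target are equal-length sequences of integer class labels
    in the range 0..num_classes-1 (one label per pixel).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # The pixel counts once towards both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # The pixel belongs to the union of both classes involved.
            union[p] += 1
            union[t] += 1

    # Skip classes that are absent from both prediction and ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    # Class 0: IoU = 1/2, class 1: IoU = 2/3.
    print(mean_iou(pred, target, num_classes=2))
```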

FLOPs

Install torchprofile with:

pip install torchprofile

To calculate FLOPs for a model, run:

python tools/get_flops.py /path/to/config --shape 512 512

Contact

For technical problems, please open an issue.

Citation

If you find this repo useful for your research, please consider citing:

@misc{SegNeXt-FaceParser, 
  author={Zhian Liu}, 
  title={SegNeXt-FaceParser}, 
  year={2023}, 
  url={https://github.com/e4s2022/SegNeXt-FaceParser} 
}

@article{guo2022segnext,
  title={SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation},
  author={Guo, Meng-Hao and Lu, Cheng-Ze and Hou, Qibin and Liu, Zhengning and Cheng, Ming-Ming and Hu, Shi-Min},
  journal={arXiv preprint arXiv:2209.08575},
  year={2022}
}

@inproceedings{CelebAMask-HQ,
  title={MaskGAN: Towards Diverse and Interactive Facial Image Manipulation},
  author={Lee, Cheng-Han and Liu, Ziwei and Wu, Lingyun and Luo, Ping},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

@article{guo2022visual,
  title={Visual Attention Network},
  author={Guo, Meng-Hao and Lu, Cheng-Ze and Liu, Zheng-Ning and Cheng, Ming-Ming and Hu, Shi-Min},
  journal={arXiv preprint arXiv:2202.09741},
  year={2022}
}

@inproceedings{ham,
  title={Is Attention Better Than Matrix Decomposition?},
  author={Zhengyang Geng and Meng-Hao Guo and Hongxu Chen and Xia Li and Ke Wei and Zhouchen Lin},
  booktitle={International Conference on Learning Representations},
  year={2021}
}

Acknowledgment

Our implementation is mainly based on MMSegmentation, SegFormer and Enjoy-Hamburger. Thanks to their authors.

LICENSE

This repo is under the Apache-2.0 license. For commercial use, please contact the authors.