
Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting (official PyTorch implementation)

This paper, submitted to TIP, is an extension of the previous arXiv paper.

This project has been adopted by JDAI-CV/fast-reid and the PP-Human pipeline of PaddleDetection.

This project aims to

  1. provide a strong baseline for pedestrian attribute recognition and multi-label classification.
  2. provide two new datasets, RAPzs and PETAzs, following the zero-shot pedestrian identity setting.
  3. provide a general training pipeline for pedestrian attribute recognition and multi-label classification tasks.

This project provides

  1. DDP training, which is mainly used for multi-label classification.
  2. Training on all attributes while testing on "selected" attributes only, because the proportion of positive samples for the remaining attributes is below a threshold, such as 0.01.
    1. For PETA and PETAzs, 35 of the 105 attributes are selected for performance evaluation.
    2. For RAPv1, 51 of the 92 attributes are selected for performance evaluation.
    3. For RAPv2 and RAPzs, 54 and 53 of the 152 attributes are selected for performance evaluation.
    4. For PA100k, all attributes are selected for performance evaluation.
    • However, training on all attributes does not bring consistent performance improvements across datasets.
  3. An EMA (exponential moving average) model; a sketch follows this list.
  4. Transformer-based models, such as Swin Transformer (with a large performance improvement) and ViT.
  5. Convenient dataset info files such as dataset_all.pkl.
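
Item 3 above refers to keeping an exponential moving average (EMA) of the model weights during training and evaluating the averaged copy. Below is a minimal PyTorch sketch of the technique; the class name ModelEMA, the decay value, and the update details are illustrative assumptions, not the repository's exact implementation.

    import copy

    import torch


    class ModelEMA:
        """Exponential moving average of a model's weights (illustrative sketch)."""

        def __init__(self, model: torch.nn.Module, decay: float = 0.9998):
            self.ema = copy.deepcopy(model).eval()  # shadow model used for evaluation
            self.decay = decay
            for p in self.ema.parameters():
                p.requires_grad_(False)

        @torch.no_grad()
        def update(self, model: torch.nn.Module):
            # Blend each shadow tensor toward the live model after an optimizer step.
            live_state = model.state_dict()
            for name, ema_tensor in self.ema.state_dict().items():
                if ema_tensor.dtype.is_floating_point:
                    ema_tensor.mul_(self.decay).add_(live_state[name], alpha=1 - self.decay)
                else:
                    ema_tensor.copy_(live_state[name])  # integer buffers, e.g. num_batches_tracked

A typical training loop calls update(model) once after every optimizer step and runs validation on ema.ema; the resnet50 + EMA row in the performance table below presumably corresponds to evaluating such a shadow model.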

Dataset Info

Performance

Pedestrian Attribute Recognition

| Datasets | Models         | mA    | Acc   | Prec  | Rec   | F1    |
|----------|----------------|-------|-------|-------|-------|-------|
| PA100k   | resnet50       | 80.21 | 79.15 | 87.79 | 87.01 | 87.40 |
| --       | resnet50*      | 79.85 | 79.13 | 89.45 | 85.40 | 87.38 |
| --       | resnet50 + EMA | 81.97 | 80.20 | 88.06 | 88.17 | 88.11 |
| --       | bninception    | 79.13 | 78.19 | 87.42 | 86.21 | 86.81 |
| --       | TresnetM       | 74.46 | 68.72 | 79.82 | 80.71 | 80.26 |
| --       | swin_s         | 82.19 | 80.35 | 87.85 | 88.51 | 88.18 |
| --       | vit_s          | 79.40 | 77.61 | 86.41 | 86.22 | 86.32 |
| --       | vit_b          | 81.01 | 79.38 | 87.60 | 87.49 | 87.55 |
| PETA     | resnet50       | 83.96 | 78.65 | 87.08 | 85.62 | 86.35 |
| PETAzs   | resnet50       | 71.43 | 58.69 | 74.41 | 69.82 | 72.04 |
| RAPv1    | resnet50       | 79.27 | 67.98 | 80.19 | 79.71 | 79.95 |
| RAPv2    | resnet50       | 78.52 | 66.09 | 77.20 | 80.23 | 78.68 |
| RAPzs    | resnet50       | 71.76 | 64.83 | 78.75 | 76.60 | 77.66 |
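
For reference, the columns follow the standard evaluation protocol: mA is a label-based metric that averages the true-positive and true-negative rates per attribute, while Acc, Prec, Rec, and F1 are example-based metrics computed per image and then averaged. The NumPy sketch below restates these common definitions; it is not the repository's evaluation code and may differ in edge-case handling.

    import numpy as np


    def par_metrics(gt, pred, eps=1e-20):
        """Compute mA / Acc / Prec / Rec / F1 for binary (num_images, num_attrs) arrays."""
        # Label-based mean accuracy: average of TPR and TNR per attribute.
        tp = ((gt == 1) & (pred == 1)).sum(0)
        tn = ((gt == 0) & (pred == 0)).sum(0)
        pos = (gt == 1).sum(0)
        neg = (gt == 0).sum(0)
        ma = ((tp / (pos + eps) + tn / (neg + eps)) / 2).mean()

        # Example-based metrics: per-image ratios, averaged over images.
        inter = ((gt == 1) & (pred == 1)).sum(1)
        union = ((gt == 1) | (pred == 1)).sum(1)
        acc = (inter / (union + eps)).mean()
        prec = (inter / ((pred == 1).sum(1) + eps)).mean()
        rec = (inter / ((gt == 1).sum(1) + eps)).mean()
        f1 = 2 * prec * rec / (prec + rec + eps)
        return ma, acc, prec, rec, f1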

Multi-label Classification

| Datasets | Models    | mAP   | CP    | CR    | CF1   | OP    | OR    | OF1   |
|----------|-----------|-------|-------|-------|-------|-------|-------|-------|
| COCO     | resnet101 | 82.75 | 84.17 | 72.07 | 77.65 | 85.16 | 75.47 | 80.02 |
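
The COCO columns follow the usual multi-label convention: CP, CR, and CF1 average precision and recall per class before combining, while OP, OR, and OF1 pool true positives across all classes. A sketch of these definitions is below; mAP is omitted because it is computed from continuous scores rather than binarized predictions.

    import numpy as np


    def multilabel_metrics(gt, pred, eps=1e-20):
        """CP/CR/CF1 and OP/OR/OF1 for binary (num_images, num_classes) arrays."""
        tp = ((gt == 1) & (pred == 1)).sum(0).astype(float)
        pred_pos = (pred == 1).sum(0).astype(float)
        gt_pos = (gt == 1).sum(0).astype(float)

        cp = (tp / (pred_pos + eps)).mean()     # per-class precision, averaged
        cr = (tp / (gt_pos + eps)).mean()       # per-class recall, averaged
        cf1 = 2 * cp * cr / (cp + cr + eps)

        op = tp.sum() / (pred_pos.sum() + eps)  # overall precision, pooled over classes
        orr = tp.sum() / (gt_pos.sum() + eps)   # overall recall, pooled over classes
        of1 = 2 * op * orr / (op + orr + eps)
        return cp, cr, cf1, op, orr, of1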

Pretrained Models

Dependencies

Get Started

  1. Run git clone https://github.com/valencebond/Rethinking_of_PAR.git
  2. Create a directory to download the datasets listed above.
    cd Rethinking_of_PAR
    mkdir data
    
  3. Prepare the datasets to have the following structure (a sketch for loading the dataset info files appears after this list):
    ${project_dir}/data
        PETA
            images/
            PETA.mat
            dataset_all.pkl
            dataset_zs_run0.pkl
        PA100k
            data/
            dataset_all.pkl
        RAP
            RAP_dataset/
            RAP_annotation/
            dataset_all.pkl
        RAP2
            RAP_dataset/
            RAP_annotation/
            dataset_zs_run0.pkl
        COCO14
            train2014/
            val2014/
            ml_anno/
                category.json
                coco14_train_anno.pkl
                coco14_val_anno.pkl
    
  4. Train the baseline based on ResNet-50:
    sh train.sh
    
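The dataset_all.pkl and dataset_zs_run0.pkl files in step 3 bundle the annotations that the dataset loaders read. The snippet below is a hypothetical illustration of inspecting such a file and of the positive-proportion threshold mentioned in the feature list; the keys 'label' and 'attr_name' are assumptions, so print the loaded object first to see the real layout.

    import pickle

    import numpy as np

    # Hypothetical keys: inspect the loaded object to confirm the actual layout.
    with open('data/PA100k/dataset_all.pkl', 'rb') as f:
        info = pickle.load(f)

    labels = np.asarray(info['label'])     # assumed (num_images, num_attrs) 0/1 matrix
    attr_names = list(info['attr_name'])   # assumed one name per attribute column

    pos_rate = labels.mean(axis=0)         # proportion of positive samples per attribute
    selected = [n for n, r in zip(attr_names, pos_rate) if r >= 0.01]
    print(f'{len(selected)} of {len(attr_names)} attributes pass the 0.01 threshold')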

Acknowledgements

The code is based on the repositories from Dangwei Li and Houjing Huang. Thanks for their released code.

Citation

If you use this method or this code in your research, please cite as:

@article{jia2021rethinking,
  title={Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting},
  author={Jia, Jian and Huang, Houjing and Chen, Xiaotang and Huang, Kaiqi},
  journal={arXiv preprint arXiv:2107.03576},
  year={2021}
}

@inproceedings{jia2021spatial,
  title={Spatial and Semantic Consistency Regularizations for Pedestrian Attribute Recognition},
  author={Jia, Jian and Chen, Xiaotang and Huang, Kaiqi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={962--971},
  year={2021}
}