Fashionformer ECCV-2022 Video,Poster

A simple, effective and unified baseline for human fashion segmentation and recognition (ECCV 2022)

Shilin Xu*, Xiangtai Li*, Jingbo Wang, Guangliang Cheng, Yunhai Tong, Dacheng Tao.

Figure

Introduction

We present a simple, effective, and unified baseline for fashion segmentation and attribute recognition. As the figure below shows, the architecture follows an encoder-decoder design, similar to DETR.

This codebase also contains the implementation of MaskAttribute-RCNN.

Figure

Fashionformer achieves new state-of-the-art results on three fashion segmentation datasets.

Requirements

We adopt the OpenMMLab codebase and use specific versions of mmdetection and mmcv. To run this code, make sure mmcv and mmdet are installed in your environment.
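As a quick sanity check before training, the snippet below reports whether the two required packages can be imported. It is a minimal sketch: the helper name `check_packages` is hypothetical, and it only tests importability, not whether the installed versions match the ones this codebase expects.

```python
import importlib.util


def check_packages(names):
    """Return {module_name: bool}, True when the module can be located for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # mmcv and mmdet are the two modules named in the requirements above.
    for name, found in check_packages(["mmcv", "mmdet"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either module is reported as missing, install it before running the training or testing scripts.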

Dataset

Fashionpedia Dataset

Images

Annotations

Detection: apparel object instance segmentation with localized attributes prediction:

Global attributes prediction:

path/to/Fashionpedia/
├── annotations/  # annotation json files
│   ├── attributes_train2020.json
│   ├── attributes_val2020.json
│   ├── instances-attributes_train2020.json
│   └── instances-attributes_val2020.json
├── train/
└── test/
│   ├── train2017/    # train images
│   ├── val2017/      # val images
│   └── test2017/     # test images
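To confirm that the annotation files are where the listing above places them, a small check like the following can help. This is a sketch under assumptions: the function name `missing_annotations` is hypothetical, the file names are copied from the listing, and the image folders are not checked.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_ANNOTATIONS = [
    "attributes_train2020.json",
    "attributes_val2020.json",
    "instances-attributes_train2020.json",
    "instances-attributes_val2020.json",
]


def missing_annotations(root):
    """Return the expected annotation files that are absent under <root>/annotations."""
    ann_dir = Path(root) / "annotations"
    return [name for name in EXPECTED_ANNOTATIONS if not (ann_dir / name).is_file()]


if __name__ == "__main__":
    import sys

    # Pass the dataset root as the first argument; the default is only a placeholder.
    root = sys.argv[1] if len(sys.argv) > 1 else "path/to/Fashionpedia"
    missing = missing_annotations(root)
    print("all annotation files present" if not missing else f"missing: {missing}")
```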

ModaNet

Please see this link for details.

DeepFashion

Please use the default settings provided by mmdetection.

Training and Testing

Training

# single machine, multiple GPUs
./tools/dist_train.sh $config $num_gpu
# multiple machines with slurm
./tools/slurm_train.sh $partition $job_name $config $work_dir

Testing

# single machine, multiple GPUs
./tools/dist_test.sh $config $checkpoint $num_gpu --eval segm
# multiple machines with slurm
./tools/slurm_test.sh $partition $job_name $config $checkpoint --eval segm
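When launching runs from a Python script, the single-machine commands above can be assembled as argument lists. The sketch below mirrors the argument order shown above. The helper name `build_command` and the example paths are hypothetical and not part of this repository.

```python
import shlex


def build_command(mode, config, num_gpu, checkpoint=None):
    """Assemble the single-machine launch command as an argument list."""
    if mode == "train":
        return ["./tools/dist_train.sh", config, str(num_gpu)]
    if mode == "test":
        if checkpoint is None:
            raise ValueError("testing requires a checkpoint path")
        return ["./tools/dist_test.sh", config, checkpoint, str(num_gpu), "--eval", "segm"]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Hypothetical config and checkpoint paths, for illustration only.
    cmd = build_command("test", "configs/my_config.py", 8, checkpoint="work_dirs/latest.pth")
    print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```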

Demo Visualization

python demo/image_demo.py $img $config $checkpoint

Trained Model

We provide the configs to reproduce Fashionformer and MaskAttribute-RCNN.

Fashionpedia

Fashionformer checkpoints: OneDrive and Baidu Yun (access code: uvlc).

Acknowledgement

We build our codebase on K-Net and mmdetection; many thanks for their open-source code. In particular, we extend the K-Net kernel prediction head with an extra attribute query prediction, which yields a two-stream query (kernel) prediction framework.
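To give a rough intuition for the two-stream idea, the toy sketch below pairs each instance kernel with an attribute query: the kernel is applied to every pixel feature to produce a mask, while the query goes through a linear head to produce multi-label attribute predictions. This is an illustration only and not the repository's implementation. The function names, the linear attribute head `attr_weights`, and the zero-logit threshold are all assumptions.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict_instance(kernel, attr_query, pixel_features, attr_weights):
    """Toy two-stream prediction for a single instance.

    kernel:         C-dim vector, applied to each pixel like a 1x1 convolution
    attr_query:     C-dim vector paired with the kernel
    pixel_features: list of C-dim vectors, one per pixel
    attr_weights:   list of C-dim vectors, one per attribute (hypothetical linear head)
    """
    mask_logits = [dot(kernel, f) for f in pixel_features]    # mask stream
    attr_logits = [dot(w, attr_query) for w in attr_weights]  # attribute stream
    mask = [logit > 0 for logit in mask_logits]               # sigmoid(x) > 0.5 iff x > 0
    attrs = [i for i, logit in enumerate(attr_logits) if logit > 0]
    return mask, attrs


if __name__ == "__main__":
    mask, attrs = predict_instance(
        kernel=[1.0, 0.0],
        attr_query=[0.0, 1.0],
        pixel_features=[[2.0, 0.0], [-1.0, 5.0], [0.5, 0.5]],
        attr_weights=[[1.0, 0.0], [0.0, 3.0], [0.0, -1.0]],
    )
    print(mask, attrs)
```

In the real model both streams are updated jointly over several stages; the sketch only shows the final read-out for one instance.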

Citation

If you find this repo useful for your research, please consider citing our paper:

@article{xu2022fashionformer,
  title={Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition},
  author={Xu, Shilin and Li, Xiangtai and Wang, Jingbo and Cheng, Guangliang and Tong, Yunhai and Tao, Dacheng},
  journal={ECCV},
  year={2022}
}