High-resolution Networks for FCOS

Introduction

This project contains the code of HRNet-FCOS, i.e., using High-Resolution Networks (HRNets) as the backbones for the Fully Convolutional One-Stage Object Detection (FCOS) algorithm. With a similar computational complexity, HRNet-FCOS achieves much better object detection performance than the ResNet-FCOS counterparts. For more projects using HRNets, please go to our website.

Quick start

Installation

Please check INSTALL.md for installation instructions. You may also want to see the original README.md of FCOS.

Inference

The inference command line for the COCO minival split:

python tools/test_net.py \
    --config-file configs/fcos/fcos_hrnet_w32_5l_2x.yaml \
    MODEL.WEIGHT models/FCOS_hrnet_w32_5l_2x.pth \
    TEST.IMS_PER_BATCH 8

Please note that:

  1. If your model's name is different, please replace models/FCOS_hrnet_w32_5l_2x.pth with your own path.
  2. If you encounter an out-of-memory error, please try to reduce TEST.IMS_PER_BATCH to 1.
  3. If you want to evaluate a different model, please change --config-file to its config file (in configs/fcos) and MODEL.WEIGHT to its weights file, as in the example after this list.

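For example, to evaluate the HRNet_W32_6l_2x model listed below with a reduced test batch size, the command would look like the following sketch. The config and weight file names here are assumed to follow the same naming pattern as above; adjust them to match your own files.

python tools/test_net.py \
    --config-file configs/fcos/fcos_hrnet_w32_6l_2x.yaml \
    MODEL.WEIGHT models/FCOS_hrnet_w32_6l_2x.pth \
    TEST.IMS_PER_BATCH 1
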
For your convenience, we provide the following trained models.

| FCOS Model | Training mem (GB) | Multi-scale training | SyncBN | Testing time / im | # params | GFLOPs | AP (minival) | Link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet_50_5l_2x | 29.3 | No | No | 71ms | 32.0M | 190.0 | 37.1 | - |
| HRNet_W18_5l_2x | 54.4 | No | No | 72ms | 17.5M | 180.3 | 37.7 | model |
| HRNet_W18_5l_2x | 55.0 | Yes | Yes | 72ms | 17.5M | 180.3 | 39.4 | model |
| ResNet_50_6l_2x | 58.2 | No | No | 98ms | 32.7M | 529.0 | 37.1 | - |
| HRNet_W18_6l_2x | 88.1 | No | No | 106ms | 18.1M | 515.1 | 37.8 | model |
| ResNet_101_5l_2x | 44.1 | Yes | No | 74ms | 51.0M | 261.2 | 41.4 | model |
| HRNet_W32_5l_2x | 78.9 | Yes | No | 87ms | 37.3M | 273.3 | 41.9 | model |
| HRNet_W32_5l_2x | 80.1 | Yes | Yes | 87ms | 37.3M | 273.3 | 42.5 | model |
| ResNet_101_6l_2x | 71.0 | Yes | No | 121ms | 51.6M | 601.0 | 41.5 | model |
| HRNet_W32_6l_2x | 108.6 | Yes | No | 125ms | 37.9M | 608.0 | 42.1 | model |
| HRNet_W32_6l_2x | 109.9 | Yes | Yes | 125ms | 37.9M | 608.0 | 42.9 | model |
| HRNet_W40_6l_3x | 128.0 | Yes | No | 142ms | 54.1M | 682.9 | 42.6 | model |

[1] 1x, 2x and 3x mean the model is trained for 90K, 180K and 270K iterations, respectively.
[2] 5l and 6l denote that we use feature pyramids with 5 levels and 6 levels, respectively.
[3] We provide models trained with Synchronous Batch Normalization (SyncBN).
[4] We report the total training memory footprint across all GPUs instead of the per-GPU memory footprint as in maskrcnn-benchmark.
[5] The inference speed of HRNet can be improved if the branches of the HRNet model run in parallel.
[6] All results are obtained with a single model and without any test time data augmentation.

Training

The following command line trains a fcos_hrnet_w32_5l_2x model on 8 GPUs with Synchronous Stochastic Gradient Descent (SGD):

python -m torch.distributed.launch \
    --nproc_per_node=8 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --config-file configs/fcos/fcos_hrnet_w32_5l_2x.yaml \
    MODEL.WEIGHT hrnetv2_w32_imagenet_pretrained.pth \
    MODEL.SYNCBN False \
    DATALOADER.NUM_WORKERS 4 \
    OUTPUT_DIR training_dir/fcos_hrnet_w32_5l_2x

Note that:

  1. If you want to use fewer GPUs, please change --nproc_per_node to the number of GPUs. No other settings need to be changed, and the total batch size does not depend on nproc_per_node. If you want to change the total batch size, please change SOLVER.IMS_PER_BATCH in configs/fcos/fcos_hrnet_w32_5l_2x.yaml (see the sketch after this list).
  2. If you want to use Synchronous Batch Normalization (SyncBN), please change MODEL.SYNCBN to True. Note that this will lead to ~2x slower training speed when training on multiple machines. You also need to fix the image padding size when using SyncBN, see here.
  3. The ImageNet pre-trained model can be found here.
  4. The models will be saved into OUTPUT_DIR.
  5. If you want to train FCOS on your own dataset, please follow the instructions in #54.
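As a sketch of note 1, the command below trains on 4 GPUs with the total batch size reduced to 8, assuming SOLVER.IMS_PER_BATCH can be overridden on the command line in the same way as the other options above. The batch size value and the OUTPUT_DIR name are illustrative only; pick values that fit your hardware.

python -m torch.distributed.launch \
    --nproc_per_node=4 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --config-file configs/fcos/fcos_hrnet_w32_5l_2x.yaml \
    MODEL.WEIGHT hrnetv2_w32_imagenet_pretrained.pth \
    MODEL.SYNCBN False \
    SOLVER.IMS_PER_BATCH 8 \
    DATALOADER.NUM_WORKERS 4 \
    OUTPUT_DIR training_dir/fcos_hrnet_w32_5l_2x_4gpu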

Contributing to the project

Any pull requests or issues are welcome.

Citations

Please consider citing the following papers in your publications if the project helps your research.

@article{sun2019deep,
  title={Deep High-Resolution Representation Learning for Human Pose Estimation},
  author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
  journal={arXiv preprint arXiv:1902.09212},
  year={2019}
}

@article{tian2019fcos,
  title   =  {{FCOS}: Fully Convolutional One-Stage Object Detection},
  author  =  {Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
  journal =  {arXiv preprint arXiv:1904.01355},
  year    =  {2019}
}

License

For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact the authors.