
AlphaRotate: A Rotation Detection Benchmark using TensorFlow

:rocket::rocket::rocket: News: MMRotate has been released at https://github.com/open-mmlab/mmrotate <img src="https://img.shields.io/github/stars/open-mmlab/mmrotate?style=social" /> :rocket::rocket::rocket:

Abstract

AlphaRotate is maintained mainly by Xue Yang of Shanghai Jiao Tong University, supervised by Prof. Junchi Yan.

Papers and code related to remote sensing/aerial image detection: DOTA-DOAI <img src="https://img.shields.io/github/stars/SJTU-Thinklab-Det/DOTA-DOAI?style=social" />.

Techniques:

The above-mentioned rotation detectors are all modified based on the following horizontal detectors:

Faster RCNN: TF code <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/Faster-RCNN_Tensorflow?style=social" />
R-FCN: TF code <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/R-FCN_Tensorflow?style=social" />
FPN: TF code1 <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/FPN_Tensorflow?style=social" />, TF code2 (Deprecated) <img src="https://img.shields.io/github/stars/yangxue0827/FPN_Tensorflow?style=social" />
Cascade RCNN: TF code <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/Cascade-RCNN_Tensorflow?style=social" />
Cascade FPN RCNN: TF code <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/Cascade_FPN_Tensorflow?style=social" />
RetinaNet: TF code <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/RetinaNet_Tensorflow?style=social" />
RefineDet: MxNet code <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/RefineDet_MxNet?style=social" />
FCOS: TF code <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/FCOS_Tensorflow?style=social" />, MxNet code <img src="https://img.shields.io/github/stars/DetectionTeamUCAS/FCOS_GluonCV?style=social" />

Projects

Latest Performance

All trained weights can also be downloaded from HuggingFace.

DOTA (Task1)

Baseline

Backbone	Neck	Training/test dataset	Data Augmentation	Epoch	NMS
ResNet50_v1d 600->800	FPN	trainval/test	×	13 (AP50) or 17 (AP50:95) is enough for baseline (default is 13)	gpu nms (slightly worse <1% than cpu nms but faster)

Method	Baseline	DOTA1.0	DOTA1.5	DOTA2.0	Model	Anchor	Angle Pred.	Reg. Loss	Angle Range	Configs
-	RetinaNet-R	67.25	56.50	42.04	Baidu Drive (bi8b)	R	Reg. (∆⍬)	smooth L1	[-90,0)	dota1.0, dota1.5, dota2.0
-	RetinaNet-H	64.17	56.10	43.06	Baidu Drive (bi8b)	H	Reg. (∆⍬)	smooth L1	[-90,90)	dota1.0, dota1.5, dota2.0
-	RetinaNet-H	65.33	57.21	44.58	Baidu Drive (bi8b)	H	Reg. (sin⍬, cos⍬)	smooth L1	[-90,90)	dota1.0, dota1.5, dota2.0
-	RetinaNet-H	65.73	58.87	44.16	Baidu Drive (bi8b)	H	Reg. (∆⍬)	smooth L1	[-90,0)	dota1.0, dota1.5, dota2.0
IoU-Smooth L1	RetinaNet-H	66.99	59.17	46.31	Baidu Drive (qcvc)	H	Reg. (∆⍬)	iou-smooth L1	[-90,0)	dota1.0, dota1.5, dota2.0
RIDet	RetinaNet-H	66.06	58.91	45.35	Baidu Drive (njjv)	H	Quad.	hungarian loss	-	dota1.0, dota1.5, dota2.0
RSDet	RetinaNet-H	67.27	61.42	46.71	Baidu Drive (2a1f)	H	Quad.	modulated loss	-	dota1.0, dota1.5, dota2.0
CSL	RetinaNet-H	67.38	58.55	43.34	Baidu Drive (sdbb)	H	Cls.: Gaussian (r=1, w=10)	smooth L1	[-90,90)	dota1.0, dota1.5, dota2.0
DCL	RetinaNet-H	67.39	59.38	45.46	Baidu Drive (m7pq)	H	Cls.: BCL (w=180/256)	smooth L1	[-90,90)	dota1.0, dota1.5, dota2.0
-	FCOS	67.69	61.05	48.10	Baidu Drive (pic4)	-	Quad	smooth L1	-	dota1.0, dota1.5, dota2.0
RSDet++	FCOS	67.91	62.18	48.81	Baidu Drive (8ww5)	-	Quad	modulated loss	-	dota1.0, dota1.5, dota2.0
GWD	RetinaNet-H	68.93	60.03	46.65	Baidu Drive (7g5a)	H	Reg. (∆⍬)	gwd	[-90,0)	dota1.0, dota1.5, dota2.0
GWD + SWA	RetinaNet-H	69.92	60.60	47.63	Baidu Drive (qcn0)	H	Reg. (∆⍬)	gwd	[-90,0)	dota1.0, dota1.5, dota2.0
BCD	RetinaNet-H	71.23	60.78	47.48	Baidu Drive (0puk)	H	Reg. (∆⍬)	bcd	[-90,0)	dota1.0, dota1.5, dota2.0
KLD	RetinaNet-H	71.28	62.50	47.69	Baidu Drive (o6rv)	H	Reg. (∆⍬)	kld	[-90,0)	dota1.0, dota1.5, dota2.0
KFIoU	RetinaNet-H	70.64	62.71	48.04	Baidu Drive (o72o)	H	Reg. (∆⍬)	kfiou	[-90,0)	dota1.0, dota1.5, dota2.0
KFIoU<sup>*</sup>	RetinaNet-H	71.60	-	48.94	Baidu Drive (o72o)	H	Reg. (∆⍬)	kfiou	[-90,0)	dota1.0, dota2.0
R<sup>3</sup>Det	RetinaNet-H	70.66	62.91	48.43	Baidu Drive (n9mv)	H->R	Reg. (∆⍬)	smooth L1	[-90,0)	dota1.0, dota1.5, dota2.0
DCL	R<sup>3</sup>Det	71.21	61.98	48.71	Baidu Drive (eg2s)	H->R	Cls.: BCL (w=180/256)	iou-smooth L1	[-90,0)->[-90,90)	dota1.0, dota1.5, dota2.0
GWD	R<sup>3</sup>Det	71.56	63.22	49.25	Baidu Drive (jb6e)	H->R	Reg. (∆⍬)	smooth L1->gwd	[-90,0)	dota1.0, dota1.5, dota2.0
BCD	R<sup>3</sup>Det	72.22	63.53	49.71	Baidu Drive (v60g)	H->R	Reg. (∆⍬)	bcd	[-90,0)	dota1.0, dota1.5, dota2.0
KLD	R<sup>3</sup>Det	71.73	65.18	50.90	Baidu Drive (tq7f)	H->R	Reg. (∆⍬)	kld	[-90,0)	dota1.0, dota1.5, dota2.0
KFIoU	R<sup>3</sup>Det	72.28	64.69	50.41	Baidu Drive (u77v)	H->R	Reg. (∆⍬)	kfiou	[-90,0)	dota1.0, dota1.5, dota2.0
-	R<sup>2</sup>CNN (Faster-RCNN)	72.27	66.45	52.35	Baidu Drive (02s5)	H->R	Reg. (∆⍬)	smooth L1	[-90,0)	dota1.0, dota1.5, dota2.0

SOTA

Method	Backbone	DOTA1.0	Model	MS	Data Augmentation	Epoch	Configs
R<sup>2</sup>CNN-BCD	ResNet152_v1d-FPN	79.54	Baidu Drive (h2u1)	√	√	34	dota1.0
RetinaNet-BCD	ResNet152_v1d-FPN	78.52	Baidu Drive (0puk)	√	√	51	dota1.0
R<sup>3</sup>Det-BCD	ResNet50_v1d-FPN	79.08	Baidu Drive (v60g)	√	√	51	dota1.0
R<sup>3</sup>Det-BCD	ResNet152_v1d-FPN	79.95	Baidu Drive (v60g)	√	√	51	dota1.0

Note:

Single GPU training: SAVE_WEIGHTS_INTE = iter_epoch * 1 (DOTA1.0: iter_epoch=27000, DOTA1.5: iter_epoch=32000, DOTA2.0: iter_epoch=40000)
Multi-GPU training (better): SAVE_WEIGHTS_INTE = iter_epoch * 2
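
The rule in the note above can be written as a small helper. This is only an illustrative sketch: the function name `save_weights_interval` and the `ITER_EPOCH` table are hypothetical and not part of the repo; only the iteration counts and the multipliers come from the note.

```python
# Iterations per epoch, as listed in the note above.
ITER_EPOCH = {"DOTA1.0": 27000, "DOTA1.5": 32000, "DOTA2.0": 40000}


def save_weights_interval(dataset, multi_gpu=False):
    """Return a value for SAVE_WEIGHTS_INTE (hypothetical helper).

    Single GPU: iter_epoch * 1; multi-GPU: iter_epoch * 2.
    """
    factor = 2 if multi_gpu else 1
    return ITER_EPOCH[dataset] * factor


if __name__ == "__main__":
    print(save_weights_interval("DOTA1.0"))
    print(save_weights_interval("DOTA2.0", multi_gpu=True))
```

The returned number would then be set as SAVE_WEIGHTS_INTE in the config file you train with.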

My Development Environment

python3.5 (anaconda recommended)
cuda 10.0
opencv-python 4.1.1.26 (important)
tfplot 0.2.0 (optional)
tensorflow-gpu 1.13
tqdm 4.54.0
Shapely 1.7.1

Installation

Manual configuration (cuda version < 11)

pip install -r requirements.txt
pip install -v -e .  # or "python setup.py develop"

Or, you can simply install AlphaRotate with the following command:

pip install alpharotate  # Not suitable for dev.

Docker (cuda version < 11)

docker images: yangxue2docker/yx-tf-det:tensorflow1.13.1-cuda10-gpu-py3

Note: For 30xx-series graphics cards (cuda version >= 11), I recommend following this blog to install tf1.xx, or downloading an image from tensorflow-release-notes that matches your development environment, e.g. nvcr.io/nvidia/tensorflow:20.11-tf1-py3

cd alpharotate/libs/utils/cython_utils
rm *.so
rm *.c
rm *.cpp
python setup.py build_ext --inplace  # or: make

cd alpharotate/libs/utils/
rm *.so
rm *.c
rm *.cpp
python setup.py build_ext --inplace

Download Model

Pretrain weights

Download the pretrained weights you need from one of the following three options, then put them in $PATH_ROOT/dataloader/pretrained_weights.

MxNet pretrain weights (recommended in this repo, default in NET_NAME): resnet_v1d, resnet_v1b, refer to gluon2TF.

Tensorflow pretrain weights: resnet50_v1, resnet101_v1, resnet152_v1, efficientnet, mobilenet_v2, darknet53 (Baidu Drive (1jg2), Google Drive).
PyTorch pretrain weights, refer to pretrain_zoo.py and Others.

Trained weights

Please download the models trained by this project, then put them in $PATH_ROOT/output/pretained_weights.

Train

If you want to train your own dataset, please note:

(1) Select the detector and dataset you want to use, and mark them as #DETECTOR and #DATASET (such as #DETECTOR=retinanet and #DATASET=DOTA)
(2) Modify parameters (such as CLASS_NUM, DATASET_NAME, VERSION, etc.) in $PATH_ROO./configs/#DATASET/#DETECTOR/cfgs_xxx.py
(3) Copy $PATH_ROO./configs/#DATASET/#DETECTOR/cfgs_xxx.py to $PATH_ROO./configs/cfgs.py
(4) Add category information in $PATH_ROOT/libs/label_name_dict/label_dict.py     
(5) Add data_name to $PATH_ROOT/dataloader/dataset/read_tfrecord.py
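
Step (3) above is a plain file copy. A minimal sketch of it follows, assuming you pass in the configs directory yourself; `activate_config` and its parameters are hypothetical names for illustration, not part of AlphaRotate.

```python
import shutil
from pathlib import Path


def activate_config(configs_dir, dataset, detector, cfg_name):
    """Copy <configs_dir>/<dataset>/<detector>/<cfg_name> to <configs_dir>/cfgs.py.

    Hypothetical helper mirroring step (3); configs_dir is whatever
    configs directory your checkout uses.
    """
    src = Path(configs_dir) / dataset / detector / cfg_name
    if not src.is_file():
        raise FileNotFoundError("config not found: {}".format(src))
    dst = Path(configs_dir) / "cfgs.py"
    # Overwrites any previously selected cfgs.py.
    shutil.copyfile(str(src), str(dst))
    return dst
```

Failing early on a missing source file avoids training with a stale cfgs.py left over from a previous experiment.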

Make tfrecord
If the images are very large (as in the DOTA dataset), they need to be cropped. Take the DOTA dataset as an example:

cd $PATH_ROOT/dataloader/dataset/DOTA
python data_crop.py
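
For intuition, cropping a large image usually means tiling it with overlapping windows. The sketch below only illustrates that idea and is not the logic of data_crop.py; the function name `crop_windows` and the default `crop=600` and `stride=450` values are assumptions chosen for the example.

```python
def crop_windows(img_w, img_h, crop=600, stride=450):
    """Return (x1, y1, x2, y2) windows that cover an img_w x img_h image.

    Windows overlap by crop - stride pixels; the last window on each axis
    is shifted back so it ends exactly at the image border.
    """
    def starts(length):
        if length <= crop:
            return [0]
        positions = list(range(0, length - crop, stride))
        positions.append(length - crop)  # final window flush with the border
        return positions

    return [
        (x, y, min(x + crop, img_w), min(y + crop, img_h))
        for y in starts(img_h)
        for x in starts(img_w)
    ]
```

Object annotations would also have to be shifted and clipped per window, which this sketch leaves out.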

If the images do not need to be cropped, just convert the annotation files into xml format (refer to example.xml).

cd $PATH_ROOT/dataloader/dataset/
python convert_data_to_tfrecord.py --root_dir='/PATH/TO/DOTA/' \
                                   --xml_dir='labeltxt' \
                                   --image_dir='images' \
                                   --save_name='train' \
                                   --img_format='.png' \
                                   --dataset='DOTA'

Start training

cd $PATH_ROOT/tools/#DETECTOR
python train.py

Test

For large-scale images, take the DOTA dataset as an example (the output files and visualizations are in $PATH_ROOT/tools/#DETECTOR/test_dota/VERSION):

cd $PATH_ROOT/tools/#DETECTOR
python test_dota.py --test_dir='/PATH/TO/IMAGES/' \
                    --gpus=0,1,2,3,4,5,6,7 \
                    -ms \
                    -s
# -ms: multi-scale testing (optional); -s: visualization (optional)

or (recommended in this repo, better than multi-scale testing)

python test_dota_sota.py --test_dir='/PATH/TO/IMAGES/' \
                         --gpus=0,1,2,3,4,5,6,7 \
                         -s  # visualization (optional)

Note: So that an interrupted test can resume where it stopped, the result files are opened in 'a+' mode. If a model with the same #VERSION needs to be tested again, delete the original test results first.

For small-scale images, take the HRSC2016 dataset as an example:
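
The effect of 'a+' mode can be illustrated with a small sketch. It assumes a hypothetical result format of one line per image starting with the image name; the actual format written by the test scripts is not shown here, and `run_resumable` is not a function from the repo.

```python
def run_resumable(image_names, result_path, detect):
    """Append results only for images not yet recorded in result_path.

    Returns the list of image names processed in this call.
    """
    processed = []
    with open(result_path, "a+") as f:
        f.seek(0)  # rewind so that earlier results can be read
        done = {line.split()[0] for line in f.read().splitlines() if line.strip()}
        for name in image_names:
            if name in done:
                continue  # already tested in a previous, interrupted run
            f.write("{} {}\n".format(name, detect(name)))  # 'a+' always appends
            f.flush()
            processed.append(name)
    return processed
```

Because old lines are never overwritten, stale results from an earlier run with the same version are silently reused, which is why they must be deleted before re-testing.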

cd $PATH_ROOT/tools/#DETECTOR
python test_hrsc2016.py --test_dir='/PATH/TO/IMAGES/' \
                        --gpu=0 \
                        --image_ext='bmp' \
                        --test_annotation_path='/PATH/TO/ANNOTATIONS' \
                        -s  # visualization (optional)

Tensorboard

cd $PATH_ROOT/output/summary
tensorboard --logdir=.

Citation

If you find our code useful for your research, please consider citing:

@article{yang2021alpharotate,
    author  = {Yang, Xue and Zhou, Yue and Yan, Junchi},
    title   = {AlphaRotate: A Rotation Detection Benchmark using TensorFlow},
    journal = {arXiv preprint arXiv:2111.06677},
    year    = {2021},
}

Reference

1. https://github.com/endernewton/tf-faster-rcnn
2. https://github.com/zengarden/light_head_rcnn
3. https://github.com/tensorflow/models/tree/master/research/object_detection
4. https://github.com/fizyr/keras-retinanet