<p align="center"> <h2 align="center"><strong>Scaling Efficient Masked Autoencoder Learning on Large Remote Sensing Dataset</strong></h2> <p align="center"> Fengxiang Wang<sup>1</sup>&nbsp;&nbsp;&nbsp; Hongzhen Wang<sup>2,‡</sup>&nbsp;&nbsp;&nbsp; Di Wang<sup>3</sup>&nbsp;&nbsp;&nbsp; Zonghao Guo<sup>4</sup><br> Zhenyu Zhong<sup>5</sup>&nbsp;&nbsp;&nbsp; Long Lan<sup>1,‡</sup>&nbsp;&nbsp; Jing Zhang<sup>6</sup>&nbsp; Zhiyuan Liu<sup>2</sup> &nbsp;&nbsp; Maosong Sun<sup>2</sup>&nbsp;&nbsp;&nbsp; <br><br> <sup>1</sup> National University of Defense Technology&nbsp;&nbsp;&nbsp; <sup>2</sup>Tsinghua University &nbsp;&nbsp;&nbsp; <sup>3</sup>Wuhan University&nbsp;&nbsp;<br> <sup>4</sup>University of Chinese Academy of Sciences&nbsp;&nbsp; <sup>5</sup>Nankai University&nbsp;&nbsp; <sup>6</sup>The University of Sydney </p>

Introduction

Todo List

Updates

Outline

RS-4M

The RS-4M dataset contains about 4 million high-quality optical remote sensing images, roughly four times the size of previous representative remote sensing datasets.

Examples of RS-4M

<img src="./Figures/RS-4M.png" width="700">

Experiments on RS-4M

RS-4M offers a significantly larger and more diverse image set than previous datasets. To evaluate its effectiveness, we pre-train a ViT-Base model with the vanilla MAE method. For comparison, we pre-train on the MillionAID dataset while matching the total number of training samples seen: 800 epochs over MillionAID's 1 million images versus 200 epochs over the 4 million images of RS-4M.
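The epoch counts in the table below follow directly from this equal-sample-budget rule. As a minimal sketch (the helper name `equivalent_epochs` is ours, not from the codebase), the matching arithmetic looks like:

```python
# Sketch: pick epoch counts so that every run sees the same total number
# of training samples (images * epochs), matching the MillionAID baseline.

def equivalent_epochs(reference_images: int, reference_epochs: int,
                      dataset_images: int) -> int:
    """Epochs needed so dataset_images * epochs matches the reference budget."""
    budget = reference_images * reference_epochs  # total samples seen
    return round(budget / dataset_images)

# MillionAID baseline: 1M images for 800 epochs -> 800M samples seen.
for n in (2_000_000, 3_000_000, 4_000_000):
    print(n, equivalent_epochs(1_000_000, 800, n))  # -> 400, 267, 200
```

This reproduces the 400-, 267-, and 200-epoch settings used for the 2M, 3M, and 4M subsets of RS-4M.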

| Dataset | Pretrained model | Images | Epochs | Scene Classification: AID OA (TR=20%/50%) | Scene Classification: RESISC-45 OA (TR=20%/50%) | Object Detection: DIOR mAP50 | Object Detection: DIOR-R mAP50 | Semantic Segmentation: LoveDA mIoU | Semantic Segmentation: SpaceNetv1 mF1 |
|---|---|---|---|---|---|---|---|---|---|
| MillionAID | Weights | 1 million | 800 | 94.92/97.38 | 89.20/93.60 | 71.80 | 62.33 | 51.24 | 79.24 |
| RS-4M | Weights | 2 million | 400 | 96.64/98.10 | 91.80/94.31 | 73.90 | 65.95 | 52.86 | 79.37 |
| RS-4M | Weights | 3 million | 267 | 96.67/98.18 | 92.24/94.41 | 75.40 | 67.07 | 52.39 | 79.37 |
| RS-4M | Weights | 4 million | 200 | 96.10/98.03 | 92.38/94.30 | 74.70 | 66.26 | 52.75 | 79.23 |
| RS-4M | Weights | 4 million | 800 | 96.88/98.22 | 92.44/94.43 | 75.40 | 67.35 | 52.80 | 79.41 |

SelectiveMAE

:gear: Installation

For installation details, please refer to INSTALL.md.

:blue_car: Pretraining

For details on how to use the pretraining code, please refer to PRETRAIN.md.

:rocket: Results on downstream tasks

| Model | Publication | Backbone | Scene Classification: AID OA (TR=20%/50%) | Scene Classification: RESISC-45 OA (TR=20%/50%) | Object Detection: DIOR mAP50 | Object Detection: DIOR-R mAP50 | Semantic Segmentation: LoveDA mIoU | Semantic Segmentation: SpaceNetv1 mF1 |
|---|---|---|---|---|---|---|---|---|
| SeCo | ICCV'21 | ResNet-50 | 93.47/95.99 | 89.64/92.91 | - | - | 43.63 | 77.09 |
| GASSL | ICCV'21 | ResNet-50 | 93.55/95.92 | 90.86/93.06 | 67.40 | 65.65 | 48.76 | 78.51 |
| TOV | JSTARS'23 | ResNet-50 | 95.16/97.09 | 90.97/93.79 | 70.16 | 66.33 | 49.70 | - |
| CACo | CVPR'23 | ResNet-50 | 90.88/95.05 | 88.28/91.94 | 66.91 | 64.10 | 48.89 | 77.94 |
| SatMAE | NIPS'22 | ViT-L | 95.02/96.94 | 91.72/94.10 | 70.89 | 65.66 | - | 78.07 |
| ScaleMAE | ICCV'23 | ViT-L | 96.44/97.58 | 92.63/95.04 | 73.81 | 66.47 | - | - |
| SSL4EO | GRSM'23 | ViT-S | 91.06/94.74 | 87.60/91.27 | 64.82 | 61.23 | - | - |
| RingMo | TGRS'22 | Swin-B | 96.90/98.34 | 94.25/95.67 | 75.90 | - | - | - |
| SatLas | ICCV'23 | Swin-B | 94.96/97.38 | 92.16/94.70 | 74.10 | 67.59 | - | - |
| GFM | ICCV'23 | Swin-B | 95.47/97.09 | 92.73/94.64 | 72.84 | 67.67 | - | - |
| RVSA | TGRS'23 | ViT-B+RVSA | 97.03/98.50 | 93.93/95.69 | 75.80 | 68.06 | 51.95 | - |
| SelectiveMAE | - | ViT-B | 96.78/98.12 | 93.35/94.58 | 75.70 | 67.78 | 53.05 | 79.50 |
| SelectiveMAE | - | ViT-L | 97.25/98.48 | 94.57/95.77 | 77.80 | 70.31 | 54.31 | 79.46 |

License

This work is released under the Apache License 2.0, while some components of this codebase may be subject to other licenses. Please check LICENSE.md carefully, especially if you intend to use our code for commercial purposes.

Acknowledgements