# RMSIN

This repository is the official implementation of "Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation."
## Setting Up

### Preliminaries
The code has been verified to work with PyTorch v1.7.1 and Python 3.7.
- Clone this repository.
- Change directory to the root of this repository.
### Package Dependencies
- Create a new Conda environment with Python 3.7, then activate it:

```shell
conda create -n RMSIN python==3.7
conda activate RMSIN
```
- Install PyTorch v1.7.1 with a CUDA version that works on your cluster/machine (CUDA 10.2 is used in this example):

```shell
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch
```
- Install the packages in `requirements.txt` via `pip`:

```shell
pip install -r requirements.txt
```
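After installation, a quick way to confirm that the pinned packages resolved correctly is to query their installed versions. A minimal sketch (not part of the repo) using the standard `importlib.metadata` API; the package name below is only illustrative:

```python
from importlib import metadata  # Python 3.8+; on Python 3.7, `pip install importlib_metadata`

def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

# 'numpy' is only an illustrative name; check any entry from requirements.txt.
print(installed_version("numpy"))
```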
### The Initialization Weights for Training
- Create the `./pretrained_weights` directory where we will be storing the weights:

```shell
mkdir ./pretrained_weights
```
- Download the pre-trained classification weights of the Swin Transformer, and put the `.pth` file in `./pretrained_weights`. These weights are needed to initialize the model for training.
## Datasets

We perform all experiments on our proposed dataset, RRSIS-D. RRSIS-D is a new Referring Remote Sensing Image Segmentation benchmark that contains 17,402 image-caption-mask triplets. It can be downloaded from Google Drive or Baidu Netdisk (access code: sjoe).
### Usage
- Download our dataset.
- Copy all the downloaded files to `./refer/data/`. The dataset folder should look like this:
```
$DATA_PATH
├── rrsisd
│   ├── refs(unc).p
│   └── instances.json
└── images
    └── rrsisd
        ├── JPEGImages
        └── ann_split
```
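Before launching training, it can save time to verify that the tree above is actually in place. A small sanity check (not part of the repo) using only the standard library; the paths mirror the layout shown, so adjust the root if yours differs:

```python
from pathlib import Path

def check_layout(root):
    """Return a list of expected RRSIS-D paths that are missing under `root`."""
    root = Path(root)
    expected = [
        root / "rrsisd" / "refs(unc).p",
        root / "rrsisd" / "instances.json",
        root / "images" / "rrsisd" / "JPEGImages",
        root / "images" / "rrsisd" / "ann_split",
    ]
    return [str(p) for p in expected if not p.exists()]

if __name__ == "__main__":
    for p in check_layout("./refer/data"):
        print("missing:", p)
```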
## Training

We use DistributedDataParallel from PyTorch for training. To run on 4 GPUs (with IDs 0, 1, 2, and 3) on a single node:

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 --master_port 12345 train.py --dataset rrsisd --model_id RMSIN --epochs 40 --img_size 480 2>&1 | tee ./output
```
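If you script multiple runs (for example, sweeping epochs or image size), the launcher line can be assembled programmatically. A hypothetical helper, not part of the repo, that reproduces the flags of the command above:

```python
import os
import shlex

def build_train_cmd(gpus, epochs=40, img_size=480, port=12345):
    """Build the torch.distributed.launch argv and environment for a GPU list.

    Mirrors the README command (minus the `2>&1 | tee` log redirection).
    """
    cmd = (
        f"python -m torch.distributed.launch "
        f"--nproc_per_node {len(gpus)} --master_port {port} train.py "
        f"--dataset rrsisd --model_id RMSIN "
        f"--epochs {epochs} --img_size {img_size}"
    )
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(map(str, gpus)))
    return shlex.split(cmd), env

argv, env = build_train_cmd([0, 1, 2, 3])
# subprocess.run(argv, env=env)  # uncomment on a machine with 4 GPUs
print(" ".join(argv))
```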
## Testing

```shell
python test.py --swin_type base --dataset rrsisd --resume ./your_checkpoints_path --split val --workers 4 --window12 --img_size 480
```
## Acknowledgements

Code in this repository is built on LAVT. We would like to thank the authors for open-sourcing their project.