This is the project page for the paper:
<!-- :star:**Highlights:** - **GPU Friendly**: Four 1080Ti/2080Ti GPUs can handle the training for R50, R101 backbones with ISTR. - **High Performance**: On COCO test-dev, ISTR-R50-3x gets 46.8/38.6 box/mask AP, and ISTR-R101-3x gets 48.1/39.9 box/mask AP. -->
Updates
- (2022.03.09) New code for ISTR-PCA, ISTR-DCT, and ISTR-SMT, with better performance and speed, has been released.
- (2021.05.03) The project page for ISTR is available.
Method | Backbone | FPS | Box AP | Mask AP | Link |
---|---|---|---|---|---|
ISTR-PCA | R50-FPN | 13.0 | 46.7 | 39.8 | 7p58 (google drive) |
ISTR-DCT | R50-FPN | 12.5 | 46.9 | 40.2 | ibi3 |
ISTR-SMT | R50-FPN | 10.4 | 47.4 | 41.7 | 73bs (google drive) |
ISTR-PCA | R101-FPN | 10.7 | 48.0 | 41.1 | 5rcj |
ISTR-DCT | R101-FPN | 10.3 | 48.3 | 41.6 | 0mdl (google drive) |
ISTR-SMT | R101-FPN | 8.9 | 48.8 | 42.9 | qbr8 (google drive) |
ISTR-SMT | Swin-L | 3.5 | 55.8 | 49.2 | nuj8 (google drive) |
ISTR-SMT@1088 | Swin-L | 2.9 | 56.4 | 49.7 | 9uj8 |
- Inference speed (FPS) is measured on a single 2080Ti GPU.
- We use backbones pre-trained on ImageNet with torchvision. The ImageNet pre-trained ResNet-101 backbone is obtained from SparseR-CNN.
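Because the table reports speed as frames per second, it can be convenient to read the same numbers as per-image latency. The following is a minimal sketch of that conversion; the FPS values are copied from the table above, and the `latency_ms` helper is an illustrative name, not part of the ISTR code.

```python
# FPS values copied from the results table (measured on a single 2080Ti GPU).
FPS = {
    ("ISTR-PCA", "R50-FPN"): 13.0,
    ("ISTR-DCT", "R50-FPN"): 12.5,
    ("ISTR-SMT", "R50-FPN"): 10.4,
    ("ISTR-PCA", "R101-FPN"): 10.7,
    ("ISTR-DCT", "R101-FPN"): 10.3,
    ("ISTR-SMT", "R101-FPN"): 8.9,
    ("ISTR-SMT", "Swin-L"): 3.5,
    ("ISTR-SMT@1088", "Swin-L"): 2.9,
}


def latency_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per image."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    for (method, backbone), fps in FPS.items():
        print(f"{method:<14} {backbone:<9} {fps:>5.1f} fps  {latency_ms(fps):6.1f} ms/image")
```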
Installation
The code is built on top of Detectron2, SparseR-CNN, and AdelaiDet.
Requirements
- Python=3.8
- PyTorch=1.6.0, torchvision=0.7.0, cudatoolkit=10.1
- OpenCV for visualization
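To confirm that an environment matches these pins before building, a small check along the following lines may help. It is only a sketch: it assumes PyTorch and OpenCV are installed under the distribution names `torch` and `opencv-python`, and the `check_requirements` helper is hypothetical, not part of the repository.

```python
import sys
from importlib import metadata

# Pins taken from the Requirements list. Keys are distribution names
# (assumption: PyTorch is "torch", OpenCV is "opencv-python").
# None means "any installed version is accepted".
PINNED = {"torch": "1.6.0", "torchvision": "0.7.0", "opencv-python": None}


def check_requirements(pinned):
    """Return {name: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        # Strip local build tags such as "+cu101" before comparing.
        ok = wanted is None or installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    print("python", "%d.%d" % sys.version_info[:2])
    for name, (installed, ok) in check_requirements(PINNED).items():
        print(f"{name:<15} installed={installed} matches_pin={ok}")
```

A mismatch is not necessarily an error, since newer PyTorch builds may also be used; the report only shows how the environment differs from the listed versions.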
Steps
- Install the repository (we recommend using Anaconda for installation).
conda create -n ISTR python=3.8 -y
conda activate ISTR
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
# Alternatively, for CUDA 11.3: conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
pip install opencv-python
pip install scipy
pip install shapely
git clone https://github.com/hujiecpp/ISTR.git
cd ISTR
python setup.py build develop
- Link the COCO dataset path
ln -s /coco_dataset_path/coco ./datasets
- Train ISTR (e.g., with ResNet50 backbone)
python projects/ISTR/train_net.py --num-gpus 4 --config-file projects/ISTR/configs/ISTR-R50-3x.yaml
- Evaluate ISTR (e.g., with ResNet50 backbone)
python projects/ISTR/train_net.py --num-gpus 4 --config-file projects/ISTR/configs/ISTR-R50-3x.yaml --eval-only MODEL.WEIGHTS ./output/model_final.pth
- Visualize the detection and segmentation results (e.g., with ResNet50 backbone)
python demo/demo.py --config-file projects/ISTR/configs/ISTR-R50-3x.yaml --input input1.jpg --output ./output --confidence-threshold 0.4 --opts MODEL.WEIGHTS ./output/model_final.pth
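The steps above can be wrapped in a short pre-flight script. The sketch below checks that the linked dataset has the layout Detectron2's built-in COCO datasets look for, and assembles the training or evaluation command from the flags shown above. It assumes ISTR uses the standard COCO 2017 train/val splits; `missing_coco_paths` and `istr_command` are hypothetical helper names, not functions from the repository.

```python
import shlex
from pathlib import Path

# Layout expected under ./datasets by Detectron2's built-in COCO datasets
# (assumption: ISTR trains and evaluates on the standard 2017 splits).
EXPECTED = [
    "coco/annotations/instances_train2017.json",
    "coco/annotations/instances_val2017.json",
    "coco/train2017",
    "coco/val2017",
]


def missing_coco_paths(datasets_root="datasets"):
    """Return the expected paths that do not exist (symlinks are followed)."""
    root = Path(datasets_root)
    return [p for p in EXPECTED if not (root / p).exists()]


def istr_command(config, num_gpus=4, weights=None):
    """Build the training command, or the evaluation command if weights is set."""
    cmd = [
        "python", "projects/ISTR/train_net.py",
        "--num-gpus", str(num_gpus),
        "--config-file", config,
    ]
    if weights is not None:
        cmd += ["--eval-only", "MODEL.WEIGHTS", weights]
    return cmd


if __name__ == "__main__":
    missing = missing_coco_paths()
    if missing:
        print("Missing dataset paths:", ", ".join(missing))
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(istr_command("projects/ISTR/configs/ISTR-R50-3x.yaml")))
```

Passing `weights="./output/model_final.pth"` yields the evaluation command; the resulting list can be handed to `subprocess.run` from the repository root.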
Citation
If our paper helps your research, please cite it in your publications:
@article{hu2021istr,
title={ISTR: End-to-end instance segmentation with transformers},
author={Hu, Jie and Cao, Liujuan and Lu, Yao and Zhang, ShengChuan and Wang, Yan and Li, Ke and Huang, Feiyue and Shao, Ling and Ji, Rongrong},
journal={arXiv preprint arXiv:2105.00637},
year={2021}
}
@article{hu2024istr,
author={Hu, Jie and Lu, Yao and Zhang, Shengchuan and Cao, Liujuan},
title={ISTR: Mask-Embedding-Based Instance Segmentation Transformer},
journal={IEEE Transactions on Image Processing},
year={2024},
volume={33},
pages={2895-2907},
doi={10.1109/TIP.2024.3385980}
}