Mask Auto-Labeler: The Official Implementation
Vision Transformers are Good Mask Auto-Labelers
Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M. Alvarez, Anima Anandkumar
Accepted by Conference on Computer Vision and Pattern Recognition (CVPR) 2023.
Installation
- Please refer to the dockerfile in the root directory for environment specs. We also provide the docker image here.
Training
Phase 1: Mask Auto-labeling
python main.py
Phase 2: Instance Segmentation Models
We copy the training scripts from mmdet.
To train a model, e.g. ResNet-50/SOLOv2, with 8 GPUs:
cd mmdet;
bash tools/dist_train.sh configs/MALMask/solov2_r50_fpn_3x_coco_mal.py 8
For more details, please refer to the documentation or GitHub repo of MMDetection.
Inference and Evaluation
Phase 1: Generating Mask Pseudo-labels
python main.py --resume PATH/TO/WEIGHTS --label_dump_path PATH/TO/PSUEDO_LABELS_OUTPUT --not_eval_mask
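Before moving on to Phase 2, it can help to sanity-check the dumped file. The sketch below assumes the pseudo-labels are written as COCO-style annotation JSON (an `annotations` list whose entries carry a `segmentation` field). That layout is an assumption rather than something this README states, and `summarize_annotations` and `load_and_summarize` are hypothetical helpers, not part of the repository.

```python
import json


def summarize_annotations(coco):
    """Count annotations with and without a non-empty segmentation.

    Assumes `coco` is a dict in COCO annotation layout (an assumption).
    """
    anns = coco.get("annotations", [])
    with_mask = sum(1 for ann in anns if ann.get("segmentation"))
    return {
        "total": len(anns),
        "with_mask": with_mask,
        "without_mask": len(anns) - with_mask,
    }


def load_and_summarize(path):
    """Load a JSON file from `path` and summarize its annotations."""
    with open(path) as f:
        return summarize_annotations(json.load(f))
```

Calling `load_and_summarize` on the path passed to `--label_dump_path` should report `without_mask` as 0 if every box received a pseudo-mask.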
Phase 2: Evaluation and Inference of Instance Segmentation Models
To evaluate an instance segmentation model, e.g. ResNet-50/SOLOv2, with 8 GPUs:
bash tools/dist_test.sh configs/MALMask/solov2_r50_fpn_3x_coco_mal.py solov2_r50_fpn_3x_coco_essenco/latest.pth 8 --eval segm
To generate results of instance segmentation models, e.g. ResNet-50/SOLOv2, with 8 GPUs:
bash tools/dist_test.sh configs/MALMask/solov2_r50_fpn_3x_coco_mal.py solov2_r50_fpn_3x_coco_essenco/latest.pth 8 --format-only --options "jsonfile_prefix=work_dirs/solov2_r50_fpn_3x_coco_essenco/test-dev.json"
For more details, please refer to the documentation or GitHub repo of MMDetection.
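To take a quick look at the output of the `--format-only` run, the sketch below counts confident detections per image. It assumes the generated file follows the standard COCO results format (a JSON list of dicts with `image_id`, `category_id`, `segmentation`, and `score`). The helper name `detections_per_image` and the 0.5 threshold are illustrative choices, not part of this repository or of MMDetection.

```python
import json
from collections import Counter


def detections_per_image(results, score_thr=0.5):
    """Count detections per image whose score is at least `score_thr`.

    `results` is a list of COCO-results-style dicts (assumed layout).
    """
    return Counter(
        det["image_id"] for det in results if det.get("score", 0.0) >= score_thr
    )


def load_results(path):
    """Read a COCO-results-style JSON list from `path`."""
    with open(path) as f:
        return json.load(f)
```

For example, `detections_per_image(load_results(path))` returns a `Counter` keyed by image id, where `path` is whichever segmentation results file the command above wrote.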
Phase 1: Mask Auto-labeling
Trained Weights
ViT-MAE-base (COCO) | MAL-ViT-base (LVIS v1.0) |
---|---|
download | download |
Mask Pseudo-labels
MAL-ViT-base (COCO train2017) | MAL-ViT-base (LVIS v1.0 train) |
---|---|
download | download |
Phase 2: Instance Segmentation Models
COCO
Encoder | Decoder | weights |
---|---|---|
ResNet-50 | SOLOv2 | download |
ResNet-101-DCN | SOLOv2 | download |
ResNeXt-101-DCN | SOLOv2 | download |
ConvNeXt-s | Cascade MR-CNN | download |
ConvNeXt-b | Cascade MR-CNN | download |
Swin-s | Mask2Former | download |
F.A.Q.
It seems like the MIL loss uses mask labels for training?
No, we do not use mask labels. Check this
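To illustrate how a multiple-instance learning (MIL) loss can be computed from a box alone, here is a generic sketch in plain Python. It is not the loss implemented in this repository; the function name, the bag definition (row and column segments inside the box), and the equal weighting of the two terms are all assumptions made for illustration. The box is assumed to be non-empty and to lie inside the image.

```python
import math


def box_mil_loss(probs, box, eps=1e-6):
    """Generic MIL-style loss that needs only a box, no mask labels.

    probs: H x W nested list of predicted foreground probabilities.
    box:   (x0, y0, x1, y1) in pixel indices, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = box
    height, width = len(probs), len(probs[0])

    # Positive bags: each row and column segment inside the box should
    # contain at least one foreground pixel, so its max should be near 1.
    row_bags = [max(probs[y][x0:x1]) for y in range(y0, y1)]
    col_bags = [max(probs[y][x] for y in range(y0, y1)) for x in range(x0, x1)]
    pos_bags = row_bags + col_bags
    pos_loss = -sum(math.log(max(p, eps)) for p in pos_bags) / len(pos_bags)

    # Negative instances: every pixel outside the box should be background.
    outside = [
        probs[y][x]
        for y in range(height)
        for x in range(width)
        if not (x0 <= x < x1 and y0 <= y < y1)
    ]
    neg_loss = -sum(math.log(max(1.0 - p, eps)) for p in outside) / max(len(outside), 1)
    return pos_loss + neg_loss
```

The only supervision entering the function is `box`: the positive term is driven by the box extent and the negative term by the pixels outside it, so no ground-truth mask is ever read.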
I get errors during training/testing and MMCV appears in the error log. What should I do?
You need to rebuild the Docker image yourself, since your NVIDIA driver version differs from ours and MMCV contains some customized operators.
LICENSE
Copyright © 2022, NVIDIA Corporation. All rights reserved.
This work is made available under the Nvidia Source Code License-NC. Click here to view a copy of this license.
The pre-trained models are shared under CC-BY-NC-SA-4.0. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing
Acknowledgement
This repository is partly based on Pytorch-image-models (timm), MMDetection, and DINO. We leverage PyTorch Lightning.