Small Object Few-shot Segmentation for Vision-based Industrial Inspection
This is the official PyTorch implementation of the paper Small Object Few-shot Segmentation for Vision-based Industrial Inspection.
@article{zhang2024small,
title={Small Object Few-shot Segmentation for Vision-based Industrial Inspection},
author={Zhang, Zilong and Niu, Chang and Zhao, Zhibin and Zhang, Xingwu and Chen, Xuefeng},
journal={arXiv preprint arXiv:2407.21351},
year={2024}
}
<p align="center">
<img src=assets/paradigm.png width="100%">
</p>
<p align="center">
<img src=assets/SOFS.jpg width="100%">
</p>
We present SOFS to address two problems in vision-based industrial inspection: diverse and sufficient defect samples are difficult to obtain, and anomaly detection cannot detect specific defects. SOFS adapts quickly to unseen classes without retraining and supports both few-shot semantic segmentation (FSS) and few-shot anomaly detection (FAD). Conditioned on the support set, SOFS can segment small defects, e.g., defects whose area proportion is less than 0.03%. Some visualizations are shown in the figure below.
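To give a sense of scale for the 0.03% figure, the sketch below computes the area proportion of a defect from a binary mask. It is an illustration only, not code from this repository: `defect_area_ratio` and `make_mask` are hypothetical helpers, the mask is a plain nested list, and the value 255 for defect pixels follows the mask convention described in the data format section.

```python
def defect_area_ratio(mask, target_value=255):
    """Return the fraction of pixels in `mask` equal to `target_value`."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    defect = sum(1 for row in mask for value in row if value == target_value)
    return defect / total


def make_mask(height, width, box):
    """Build a toy mask with one rectangular defect; box = (top, left, h, w)."""
    top, left, h, w = box
    return [
        [255 if top <= r < top + h and left <= c < left + w else 0
         for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    # Hypothetical example: a 10x10 defect in a 1000x1000 image.
    mask = make_mask(1000, 1000, (10, 20, 10, 10))
    ratio = defect_area_ratio(mask)
    print(f"defect area: {ratio * 100:.4f}% of the image")
```

A 10x10 defect in a 1000x1000 image covers 100 of 1,000,000 pixels, i.e. 0.01% of the image, which is below the 0.03% mentioned above.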
<p align="center">
<img src=assets/vis.jpg width="100%">
</p>
Visualizations under Open Domain
- We show the visualizations of SOFS for Severstal: Steel Defect Detection under the open domain, where SOFS is trained on VISION V1.
Installation
- The default Python version is 3.8.
- Follow the DINO v2 installation instructions, including dependencies such as xFormers.
- Then install the requirements with the following command:
pip install -r requirements.txt
Train and test on the VISION V1 or Ds spectrum dataset
- Prepare the pre-trained model: download the DINO v2 ViT-B/14 distilled (without registers) pre-trained model from DINO v2.
- Prepare the dataset: download the VISION V1 dataset; the corresponding reference is here.
- Prepare the dataset: download the Ds spectrum dataset; the corresponding reference is here.
- Replace TRAIN.dataset_path and TEST.dataset_path with the path to your own VISION V1/Ds spectrum dataset.
- For the Ds spectrum dataset, first rename the VISION V1 folders in the Ds spectrum dataset to Capacitor_VISION/Ring_VISION..., then put all folders together, using the names given in the dataset split.
- Replace TRAIN.backbone_checkpoint with the path to the pre-trained DINO v2 ViT-B/14 distilled model.
- Create an empty folder and replace DATASET.vision_data_save_path with its path.
- Then run the following command:
bash train.sh
- The model is trained on each training split and tested on the corresponding test split. The results are saved in ./log.
- After you have run the above command for the first time, set DATASET.vision_data_save to False and DATASET.vision_data_load to True.
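The configuration changes listed above can be summarized in a short sketch. It assumes the configuration can be treated as a nested dictionary addressed by dotted keys; the repository's actual config loader is not shown here, so `set_dotted`, `build_overrides`, and `apply_overrides` are hypothetical helpers, and all paths are placeholders. It also assumes that the first run uses DATASET.vision_data_save = True and DATASET.vision_data_load = False, which the last step above implies but does not state.

```python
from copy import deepcopy


def set_dotted(cfg, dotted_key, value):
    """Set cfg["A"]["b"] = value for a key written as "A.b"."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return cfg


def build_overrides(dataset_path, backbone_checkpoint, save_path, first_run):
    """Collect the keys named in the steps above (values are placeholders)."""
    return {
        "TRAIN.dataset_path": dataset_path,
        "TEST.dataset_path": dataset_path,
        "TRAIN.backbone_checkpoint": backbone_checkpoint,
        "DATASET.vision_data_save_path": save_path,
        # Assumption: save on the first run, load on every later run.
        "DATASET.vision_data_save": first_run,
        "DATASET.vision_data_load": not first_run,
    }


def apply_overrides(cfg, overrides):
    """Return a copy of cfg with every dotted override applied."""
    out = deepcopy(cfg)
    for key, value in overrides.items():
        set_dotted(out, key, value)
    return out


if __name__ == "__main__":
    overrides = build_overrides(
        dataset_path="/path/to/VISION_V1",          # placeholder
        backbone_checkpoint="/path/to/dinov2.pth",  # placeholder
        save_path="/path/to/empty_folder",          # placeholder
        first_run=True,
    )
    for section, values in apply_overrides({}, overrides).items():
        print(section, values)
```

Calling `build_overrides` with `first_run=False` gives the settings for every run after the first one.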
Inference
- We provide the model trained on VISION V1 and the code for SOFS inference (open-domain test). You can also run the open-domain test on your own data.
- Please download the SOFS model here (Google Drive) and place it at "./SOFS_model.pth".
Prepare for Your Own Data
- You can refer to the data format in severstal_steel of Open_Domain_Data. The data in severstal_steel are from Severstal: Steel Defect Detection. Our training data do not contain these data, so this is an open-domain test.
- Your own data should be organized as follows:
|-- Your Own Data (object name)
    |-- support
        |-- image
        |-- mask
    |-- query
        |-- image
- support contains an image folder and a mask folder. Each image in the mask folder contains only the values {0, 255}, where 255 indicates the target semantic. The image folder in query contains the test images.
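Before running the test, it can help to check that a data folder matches the layout above. The following is a minimal sketch, not part of the released code: `check_layout` is a hypothetical helper, the list of image suffixes is an assumption, and so is the rule that each support image has a mask with the same file stem (the naming convention is not stated here, so compare with severstal_steel before relying on it).

```python
import tempfile
from pathlib import Path

REQUIRED_FOLDERS = ("support/image", "support/mask", "query/image")
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}  # assumption, adjust as needed


def list_images(folder):
    """Return the image files in `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )


def check_layout(root):
    """Return a list of problems found under `root` (empty if none)."""
    root = Path(root)
    problems = [
        f"missing folder: {sub}"
        for sub in REQUIRED_FOLDERS
        if not (root / sub).is_dir()
    ]
    if problems:
        return problems
    # Assumption: a mask shares its file stem with its support image.
    mask_stems = {p.stem for p in list_images(root / "support" / "mask")}
    for image in list_images(root / "support" / "image"):
        if image.stem not in mask_stems:
            problems.append(f"support image without mask: {image.name}")
    if not list_images(root / "query" / "image"):
        problems.append("query/image is empty")
    return problems


if __name__ == "__main__":
    # Build a throwaway example folder and check it.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "my_object"  # hypothetical object name
        for sub in REQUIRED_FOLDERS:
            (root / sub).mkdir(parents=True)
        (root / "support" / "image" / "0001.png").touch()
        (root / "support" / "mask" / "0001.png").touch()
        (root / "query" / "image" / "0002.png").touch()
        print(check_layout(root) or "layout looks fine")
```

This sketch only inspects folder and file names. Checking that each mask contains only 0 and 255 would additionally require an image library to read the pixel values.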
Test on your own dataset
- Replace "severstal_steel" with the name of your own object in DATASET.open_domain_test_object of "./method_config/Open_Domain/SOFS.yaml".
- Then run the following command:
sh test.sh
To-Do List
- Task 1: Release inference code and model (open-domain test).
- Task 2: Release training and test code for different datasets (part).
- Task 3: Release inference for a mixture of defective support samples and normal support samples.
- Task 4: Release online tools.
Acknowledgement
We acknowledge the excellent implementations of DINO v2 and HDMNet.
License
The code is released under the CC BY-NC-SA 4.0 license.