
<div align="center"> <img src="resources/logo.png" width="400"/> </div> <div>&nbsp;</div>

Introduction

This repository contains the implementation of the CVPR 2024 paper:

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

Demo

The following GIF animations compare interactive segmentation results from SAM and our FocSAM. Across a range of datasets, FocSAM maintains markedly more stable performance, with far less fluctuation in IoU than SAM.

<img src="resources/result-2008_002715.gif" width="250" height="250"/><img src="resources/result-2009_002177.gif" width="250" height="250"/><img src="resources/result-2009_004203.gif" width="250" height="250"/>

<img src="resources/result-2010_000197.gif" width="250" height="250"/><img src="resources/result-cable_cut_inner_insulation_007.gif" width="250" height="250"/><img src="resources/result-capsule_squeeze_004.gif" width="250" height="250"/>

<img src="resources/result-COD10K-CAM-1-Aquatic-13-Pipefish-836.gif" width="250" height="250"/><img src="resources/result-COD10K-CAM-3-Flying-53-Bird-3089.gif" width="250" height="250"/><img src="resources/result-COD10K-CAM-3-Flying-53-Bird-3141.gif" width="250" height="250"/>

<img src="resources/result-grid_bent_005.gif" width="250" height="250"/><img src="resources/result-transistor_bent_lead_007.gif" width="250" height="250"/><img src="resources/result-zipper_combined_000.gif" width="250" height="250"/>

Installation

For detailed installation instructions, please refer to INSTALL.

Alternatively, make sure Python 3.11.0 is set up in your environment, then install all dependencies by running the following command in your terminal:

bash scripts/install.sh
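
If you are setting up from scratch, a minimal sketch of the environment setup is shown below, assuming conda is available; the environment name focsam is arbitrary:

conda create -n focsam python=3.11.0 -y  # hypothetical environment name
conda activate focsam
bash scripts/install.sh  # installs the remaining dependencies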

Dataset Preparation

For detailed dataset preparation instructions, please refer to DATASETS.

Model Weights Download and Conversion

SAM Pre-trained Weights
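
The conversion script below expects the original SAM ViT-Huge checkpoint at pretrain/sam_pretrain_vit_huge.pth. A minimal download sketch, assuming the checkpoint from the official Segment Anything release is used and saved under that filename:

mkdir -p pretrain
# Official SAM ViT-H checkpoint, saved under the filename the conversion script expects (assumed)
wget -O pretrain/sam_pretrain_vit_huge.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

Then convert it into the weight layout used by this codebase: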

python tools/model_converters/samvit2mmclickseg.py pretrain/sam_pretrain_vit_huge.pth

FocSAM Pre-trained Weights

The evaluation commands below expect the FocSAM checkpoint at work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth; place the downloaded weights there, or adjust the checkpoint path in the commands accordingly.

Evaluating the Model

Run the commands below to evaluate FocSAM on DAVIS; pick the single-GPU, multi-GPU, or CPU-only variant as needed.

export PYTHONPATH=.

# Single-GPU evaluation on DAVIS
python tools/test_no_viz.py configs/_base_/eval_davis.py work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth

# Multi-GPU evaluation on DAVIS (here with 4 GPUs)
bash tools/dist_test.sh configs/_base_/eval_davis.py work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth 4

# CPU-only evaluation on DAVIS
CUDA_VISIBLE_DEVICES= python tools/test_no_viz.py configs/_base_/eval_davis.py work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth
To evaluate on other datasets, replace the DAVIS config with one of the following:

configs/_base_/eval_sbd.py  # for SBD
configs/_base_/eval_grabcut.py  # for GrabCut
configs/_base_/eval_berkeley.py  # for Berkeley
configs/_base_/eval_mvtec.py  # for MVTec
configs/_base_/eval_cod10k.py  # for COD10K
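
For example, a single-GPU run on SBD uses the SBD config with the same checkpoint path as above:

export PYTHONPATH=.
python tools/test_no_viz.py configs/_base_/eval_sbd.py work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth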

Training the Model

Training SAM Decoder

export PYTHONPATH=.

# Single-GPU training
python tools/train.py configs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k.py

# Multi-GPU training (here with 4 GPUs)
bash tools/dist_train.sh configs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k.py 4

# CPU-only training
CUDA_VISIBLE_DEVICES= python tools/train.py configs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k.py

Training FocSAM Refiner

export PYTHONPATH=.

# Single-GPU training
python tools/train.py configs/focsam/coco_lvis/train_colaug_coco_lvis_1024x1024_160k.py

# Multi-GPU training (here with 4 GPUs)
bash tools/dist_train.sh configs/focsam/coco_lvis/train_colaug_coco_lvis_1024x1024_160k.py 4

# CPU-only training
CUDA_VISIBLE_DEVICES= python tools/train.py configs/focsam/coco_lvis/train_colaug_coco_lvis_1024x1024_160k.py
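
After training, the evaluation commands above can point at your own checkpoint instead of the released one. A hedged example, assuming the run saved its weights under work_dirs/ (adjust the path to wherever your training run actually wrote the checkpoint):

export PYTHONPATH=.
# Hypothetical checkpoint path from a FocSAM refiner training run
python tools/test_no_viz.py configs/_base_/eval_davis.py work_dirs/train_colaug_coco_lvis_1024x1024_160k/iter_160000.pth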

License

This project is licensed under the MIT License - see the LICENSE file for details.