<div align="center"> <br> <img src="./imgs/teaser.png" width="600px" height="287px"> <br> </div> <!-- ![img](./imgs/teaser.png) -->

Awesome Segment Anything

Segment Anything marked a breakthrough in computer vision (CV). This repository tracks and summarizes research progress on Segment Anything across various fields, covering both papers and projects.

If you find this repository helpful, please consider starring ⭐ or sharing ⬆️ it. Thanks.

News

- 2024.8.16: Add Segment Anything 2 and SaLIP.
- 2023.8.29: Update some recent works.
- 2023.5.20: Update document structure and add a robotics-related paper. Happy 520 Day!
- 2023.5.4: Add SEEM.
- 2023.4.18: Add Inpaint Anything and SAM-Track.
- 2023.4.12: An initial version of recent papers or projects.

Contents

Papers/Projects

Basemodel Papers

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| CLIP | img | arXiv | Colab | Code | OpenAI | Contrastive Language-Image Pre-Training. |
| OWL-ViT | img | ECCV2022 | - | Code | Google | An open-vocabulary object detector. |
| OvSeg | img | CVPR2023 | Project | Code | Meta | Segment an image into semantic regions according to text descriptions. |
| Painter | img | CVPR2023 | - | Code | BAAI | A Generalist Painter for In-Context Visual Learning. |
| Grounding DINO | img | arXiv | Colab & Huggingface | Code | IDEA | A stronger open-set object detector. |
| Segment Anything | img img | arXiv | Project page | Code | Meta | A strong large model that can generate masks for all objects in an image. |
| SegGPT | img | arXiv | Project page | Code | BAAI | Segmenting Everything In Context based on Painter. |
| Segment Everything Everywhere All at Once (SEEM) | img | arXiv | Project Page | Code | Microsoft | Semantic segmentation with various prompt types. |
| Segment Anything 2 | img | Paper | Project Page | Code | Meta | A foundation model towards solving promptable visual segmentation in images and videos. |

Derivative Papers

Analysis and Expansion of SAM

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| CLIP_Surgery | img | arXiv | Demo | Code | HKUST | Uses CLIP's explainability to guide SAM from text to mask without manual points. |
| GenSAM | img | arXiv | Project Page | Code | QMUL | Relaxes the requirement for instance-specific prompts in SAM. |
| Segment Anything Is Not Always Perfect | img | arXiv | - | - | Samsung | Analyzes and discusses the benefits and limitations of SAM. |
| PerSAM | img | arXiv | Project Page | Code | - | Segment Anything with specific concepts. |
| Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching | img1 | arXiv | - | Code | - | One-shot semantic segmentation by integrating an all-purpose feature extraction model and a class-agnostic segmentation model. |
| Segment Anything in High Quality | img | arXiv | Project Page | - | ETH Zürich & HKUST | HQ-SAM: improves the segmentation quality of SAM using a learnable High-Quality Output Token. |
| Detect Any Shadow: Segment Anything for Video Shadow Detection | img | arXiv | - | Code | University of Science and Technology of China | Uses SAM to detect initial frames, then an LSTM network for subsequent frames. |
| Fast Segment Anything | img | arXiv | Project Page | Code | - | Reformulates the architecture and improves the speed of SAM. |
| MobileSAM (Faster Segment Anything) | img | arXiv | Project Page | Code | Kyung Hee University | Makes SAM mobile-friendly by replacing the heavyweight image encoder with a lightweight one. |
| FoodSAM (Any Food Segmentation) | img | arXiv | Project Page | Code | UCAS | Semantic, instance, panoptic, and interactive segmentation on food images. |
| DefectSAM | img | arXiv | - | Code | ZJU, Westlake, UESTC, etc. | Defect detection in infrared thermal images. |
| SlimSAM | img | arXiv | - | Code | NUS | 0.1% Data Makes Segment Anything Slim. |

Medical Image Segmentation

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Segment Anything Model (SAM) for Digital Pathology | img | arXiv | - | - | - | SAM + Tumor segmentation/Tissue segmentation/Cell nuclei segmentation. |
| Segment Anything in Medical Images | img1 | arXiv | - | Code | - | A step-by-step tutorial with a small dataset to help you quickly utilize SAM. |
| SAM Fails to Segment Anything? | img1 | arXiv | - | Code | - | SAM-adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More. |
| Segment Anything Model for Medical Image Analysis: an Experimental Study | img1 | arXiv | - | - | - | Thorough experiments evaluating how SAM performs on 19 medical image datasets. |
| Medical-SAM-Adapter | img1 | arXiv | - | Code | - | A project to fine-tune SAM using adaptation for medical imaging. |
| SAM-Med2d | img1 | arXiv | - | Code | Sichuan University & Shanghai AI Laboratory | The most comprehensive studies on applying SAM to medical 2D images. |
| ScribblePrompt-SAM | img1 | arXiv | Project Page | Code | MIT & MGH | Fine-tuned SAM on 65 biomedical imaging datasets with scribble, click, and bounding box inputs. |
| SaLIP | - | arXiv | Project Page | Code | - | Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero-shot Medical Image Segmentation. |

Bioimage Analysis

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Segment Anything for Microscopy | img | bioRxiv | Demo | Code | University of Göttingen, Germany | Segment Anything for Microscopy implements automatic and interactive annotation for microscopy data. It is built on top of Segment Anything and specializes it for microscopy and other bio-imaging data. Its core components are: <ul><li>The micro_sam tools for interactive data annotation with napari.</li><li>The micro_sam library to apply Segment Anything to 2d and 3d data or fine-tune it on your data.</li><li>The micro_sam models that are fine-tuned on publicly available microscopy data.</li></ul> Our goal is to build fast and interactive annotation tools for microscopy data. |

Inpainting

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Inpaint Anything | img1 | arXiv | - | Code | USTC & EIT | SAM + Inpainting, which is able to remove the object smoothly. |
| SAM + Stable Diffusion for Text-to-Image Inpainting | img1 | - | Project | Code | comet | Grounding DINO + SAM + Stable Diffusion |

Camouflaged Object Detection

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| SAMCOD | - | arXiv | - | Code | - | SAM + Camouflaged object detection (COD) task. |

Video Frame Interpolation

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation | img | arXiv | Project Page & Interactive Demo | Code | Shanghai AI Laboratory & Snap Inc. | Editable video frame interpolation with SAM. |

Low Level Vision

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Segment Anything in Video Super-resolution | img1 | arXiv | - | - | - | The first step to use SAM for low-level vision. |
| SAM-IQA | img1 | arXiv | - | Code | Megvii | The first to introduce SAM in IQA and demonstrate its strong generalization ability in this domain. |

Image Matting

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Matte Anything | img img | arXiv | - | Code | HUST Vision Lab | An interactive natural image matting system with excellent performance for both opaque and transparent objects. |
| Matting Anything | img1 | arXiv | Project page | Code | SHI Labs | Leverages feature maps from SAM and adopts a Mask-to-Matte module to predict the alpha matte. |

Robotics

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Instruct2Act | img1 | arXiv | - | Code | OpenGVLab | A SAM application in the robotics field. |

Bioinformatics

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| IAMSAM | img1 | bioRxiv | - | Code | Portrai Inc. | A SAM application for the analysis of Spatial Transcriptomics. |

3D

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Point-SAM | img | arXiv | Page | Code | UCSD | An open-world 3D native promptable point-cloud segmentation method. |
| SAMPro3D | img2 | arXiv | Page | Code | CUHKSZ, MSRA | A novel method to segment any 3D indoor scene by applying SAM to 2D frames, without needing any training, tuning, distillation, or 3D pretrained networks. |
| Seal | img1 | arXiv | Page | Code | - | A framework capable of leveraging 2D vision foundation models for self-supervised learning on large-scale 3D point clouds. |
| TomoSAM | img | arXiv | Video Tutorial | Code | - | An extension of 3D Slicer using SAM to aid the segmentation of 3D data from tomography or other imaging techniques. |
| SegmentAnythingin3D | img | arXiv | Project | Code | - | A novel framework to Segment Anything in 3D, named SA3D. |

Remote Sensing

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| RSPrompter | img | arXiv | Project Page | Code | Beihang University | An automated instance segmentation approach for remote sensing images based on SAM. |
| SAM-CD | img | arXiv | - | Code | PLA Information Engineering University | A sample-efficient change detection framework that employs SAM as the visual encoder. |
| SAM-Road: Segment Anything Model for Road Network Graph Extraction | img | arXiv | - | Code | Carnegie Mellon University | A simple and fast method applying SAM for vectorized large-scale road network graph extraction. It reaches state-of-the-art accuracy while being 40 times faster. |

Tracking

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Follow Anything | img1 | arXiv | Page | Code | MIT, Harvard University | An open-vocabulary, multimodal model that detects, tracks, and follows any object in real time. |
| Track-Anything | Video | arXiv | - | Code | MIT, Harvard University | An open-vocabulary, multimodal model that detects, tracks, and follows any object in real time. |
| SAM-Track | Video | arXiv | - | Code | Zhejiang University | A framework called Segment And Track Anything (SAMTrack) that allows users to precisely and effectively segment and track any object in a video. |

Audio-visual Localization and Segmentation

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| AV-SAM | img1 | arXiv | - | Code | CMU | A simple yet effective audio-visual localization and segmentation framework based on SAM. |

Adversarial Attacks

| Title | Presentation | Paper page | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|---|
| Attack-SAM | - | arXiv | - | - | KAIST | The first work to conduct a comprehensive investigation on how to attack SAM with adversarial examples. |

Derivative Projects

Image Segmentation task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| Grounded Segment Anything | img | Colab & Huggingface | Code | - | Combining Grounding DINO and Segment Anything. |
| GroundedSAM Anomaly Detection | img | - | Code | - | Grounding DINO + SAM to segment any anomaly. |
| Semantic Segment Anything | img | - | Code | Fudan | A dense category annotation engine. |
| Magic Copy | img | - | Code | - | Magic Copy is a Chrome extension that uses SAM. |
| YOLO-World + EfficientViT SAM | img | 🤗 HuggingFace Space | Code | - | Efficient open-vocabulary object detection and segmentation with YOLO-World + EfficientViT SAM. |
| Segment Anything with Clip | img | 🤗 HuggingFace Space | Code | - | SAM + CLIP. |
| SAM-Clip | img | - | Code | - | SAM + CLIP. |
| Prompt Segment Anything | img | - | Code | - | SAM + Zero-shot Instance Segmentation. |
| RefSAM | - | - | Code | - | Evaluating the basic performance of SAM on the Referring Image segmentation task. |
| SAM-RBox | img | - | Code | - | An implementation of SAM for generating rotated bounding boxes with MMRotate. |
| Open Vocabulary Segment Anything | img1 | - | Code | - | An interesting demo combining Google's OWL-ViT and SAM. |
| SegDrawer | img1 img | - | Code | - | Simple static web-based mask drawer, supporting semantic drawing with SAM. |
| AnyLabeling | Youtube | Demo | Code | - | SAM + Labelme + LabelImg + Auto-labeling. |
| ISAT with segment anything | Youtube | Demo BiliBili Demo | Code | - | A labeling tool built on SAM (Segment Anything Model); supports SAM, sam-hq, MobileSAM, EdgeSAM, etc. |
| Annotation Anything Pipeline | img | - | Code | - | GPT + SAM. |
| Roboflow Annotate | roboflow-sam-optimized-faster | App | Blog | Roboflow | SAM-assisted labeling for training computer vision models. |
| SALT | img | - | Code | - | A tool that adds a basic interface for image labeling and saves the generated masks in COCO format. |
| SAM U Specify | img | - | Code | - | Use SAM and CLIP models to segment the unique instances you want. |
| SAM web UI | img | App | Code | - | A new web interface for SAM. |
| Finetune Anything | img | - | Code | - | A class-aware one-stage tool for training fine-tuning models based on SAM. |
| NanoSAM | img | - | Code | NVIDIA | A distilled Segment Anything (SAM) model capable of running in real time with NVIDIA TensorRT. |

Video Segmentation task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| MetaSeg | img | HuggingFace | Code | - | SAM + Video. |
| SAM-Track | Video | Youtube Demo | Code | Zhejiang University | This project, which is based on SAM and DeAOT, focuses on segmenting and tracking objects in videos. |

Medical image Segmentation task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| SAM in Napari | Video | - | Code | - | Segment anything with Napari integration of SAM. |
| SAM Medical Imaging | img | - | Code | - | SAM for Medical Imaging. |

Inpainting task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| SegAnythingPro | img | - | Code | - | SAM + Inpainting/Replacing. |

3D task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| 3D-Box | img | - | Code | - | SAM is extended to 3D perception by combining it with VoxelNeXt. |
| Anything 3DNovel View | img | - | Code | - | SAM + Zero 1-to-3. |
| Any 3DFace | img img | - | Code | - | SAM + HRN. |
| Segment Anything 3D | img | - | Code | Pointcept | Extending Segment Anything to 3D perception by transferring the segmentation information of 2D images to 3D space. |

Image Generation task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| Edit Anything | img | - | Code | - | Edit and Generate Anything in an image. |
| Image Edit Anything | img | - | Code | - | Stable Diffusion + SAM. |
| SAM for Stable Diffusion Webui | img | - | Code | - | Stable Diffusion + SAM. |

Remote Sensing task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| Earth Observation Tools | img | Colab | Code | - | SAM + Remote Sensing. |

Moving Object Detection task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| Moving Object Detection | img | - | Code | - | SAM + Moving Object Detection. |

OCR task

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| OCR-SAM | img | Blog | Code | - | Optical Character Recognition with SAM. |

Front-end Framework

SAMJS

| Title | Presentation | Project page | Code base | Affiliation | Description |
|---|---|---|---|---|---|
| SAMJS | samjs | demo | Code | - | JS SDK for SAM; supports remote sensing data segmentation and vectorization. |

Acknowledgement

Some of the presentations in this repository are borrowed from the original authors, and we are very grateful for their contributions.

License

This project is released under the MIT license. Please see the LICENSE file for more information.