Awesome-Anything

A curated list of general AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, etc.

Contributions are welcome!

AnyObject

| Title & Authors | Intro | Useful Links |
| --- | --- | --- |
| Segment Anything <br> Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick <br> > Meta Research <br> > Preprint'23 <br><br> [Segment Anything (Project)] | | [Github] <br> [Page] <br> [Demo] |
| OVSeg: Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP <br> Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu <br> > Meta Research <br> > Preprint'23 <br><br> [OVSeg (Project)] | <img width="855" alt="image" src="https://user-images.githubusercontent.com/18592211/232279307-cf00ebe2-0751-48dc-b4ac-47ff343c28dc.png"> | [Github] <br> [Page] |
| Learning to Segment Every Thing <br> Ronghang Hu, Piotr Dollar, Kaiming He, Trevor Darrell, Ross Girshick <br> > UC Berkeley, FAIR <br> > CVPR'18 <br><br> [seg_every_thing (Project)] | <img width="989" alt="image" src="https://user-images.githubusercontent.com/18592211/232575250-4e6fa0cf-507b-40bb-b71b-c0bcc2a85aaf.png"> | [Github] <br> [Page] |
| Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection <br> Shilong Liu and Zhaoyang Zeng and Tianhe Ren and Feng Li and Hao Zhang and Jie Yang and Chunyuan Li and Jianwei Yang and Hang Su and Jun Zhu and Lei Zhang <br> > IDEA-Research <br> > Preprint'23 <br><br> [Grounded-SAM, GroundingDINO (Project)] | | [Github] <br> [Demo] |
| SegGPT: Segmenting Everything In Context <br> Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang <br> > BAAI-Vision <br> > Preprint'23 <br><br> [SegGPT (Project)] | <img width="903" alt="image" src="https://user-images.githubusercontent.com/18592211/230897227-c797f375-a44d-4536-a06b-41f0d9f4dbc4.png"> | [Github] |
| V3Det: Vast Vocabulary Visual Detection Dataset <br> Jiaqi Wang, Pan Zhang, Tao Chu, Yuhang Cao, Yujie Zhou, Tong Wu, Bin Wang, Conghui He, Dahua Lin <br> > Shanghai AI Laboratory, CUHK <br> > Preprint'23 | | -- |
| segment-anything-video (Project) <br> Kadir Nar | | [Github] |
| Towards Segmenting Anything That Moves <br> Achal Dave, Pavel Tokmakov, Deva Ramanan <br> > ICCV'19 Workshop <br><br> [segment-any-moving (Project)] | <img src="http://www.achaldave.com/projects/anything-that-moves/videos/ZXN6A-tracked-with-objectness-trimmed.gif" width="32%" /><img src="http://www.achaldave.com/projects/anything-that-moves/videos/c95cd17749.gif" width="32%" /><img src="http://www.achaldave.com/projects/anything-that-moves/videos/e0bdb5dfae.gif" width="32%" /> | [Github] |
| Semantic Segment Anything <br> Jiaqi Chen, Zeyu Yang, Li Zhang <br><br> [Semantic-Segment-Anything (Project)] | <img width="903" alt="image" src="https://github.com/fudan-zvg/Semantic-Segment-Anything/blob/main/figures/SSA_motivation.png"> | [Github] |
| Grounded Segment Anything: From Objects to Parts (Project) <br> Peize Sun and Shoufa Chen | | [Github] |
| GroundedSAM-zero-shot-anomaly-detection (Project) <br> Yunkang Cao | <img width="677" alt="image" src="https://user-images.githubusercontent.com/18592211/231068964-ddeae0ea-4e83-40d6-b73e-2811d46f808d.png"> | [Github] |
| Segment Anything Labelling Tool (SALT) (Project) <br> Anurag Ghosh | | [Github] |
| Prompt-Segment-Anything (Project) <br> Rockey | | [Github] |
| SAM-RBox (Project) <br> Qingyun Li | | [Github] |
| VISAM (Project) <br> Feng Yan, Weixin Luo, Yujie Zhong, Yiyang Gan, Lin Ma | | [Github] |
| Segment Anything EO tools: Earth observation tools for Meta AI Segment Anything (Project) <br> Aliaksandr Hancharenka, Alexander Chichigin | | [Github] |
| napari-segment-anything: Segment Anything Model (SAM) native Qt UI (Project) <br> Jordão Bragantini, Kyle I S Harrington, Ajinkya Kulkarni | <img width="658" alt="image" src="https://user-images.githubusercontent.com/18592211/231413725-661fb2a9-1951-40b1-8239-6896eeb7eb4c.png"> | [Github] |
| SAM-Medical-Imaging: Segment Anything Model (SAM) native Qt UI (Project) <br> Jordão Bragantini, Kyle I S Harrington, Ajinkya Kulkarni | | [Github] |
| OCR-SAM: Combining MMOCR with Segment Anything & Stable Diffusion. (Project) <br> Zhenhua Yang, Qing Jiang | | [Github] |
| segment-anything-u-specify: using sam+clip to segment any objs u specify with text prompts. (Project) <br> MaybeShewill-CV | | [Github] |
| Segment Everything Everywhere All at Once <br> Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, Jianfeng Gao, Yong Jae Lee <br><br> [SEEM (Project)] | | [Github] |
| SegDrawer: Simple static web-based mask drawer (Project) <br> Harry | | [Github] |
| Magic Copy: a Chrome extension (Project) <br> Harry | <img width="546" alt="image" src="https://user-images.githubusercontent.com/18592211/232190851-1dc85342-3d50-42a7-a8e2-f45c4c862d70.png"> | [Github] |
| Track Anything: Segment Anything Meets Videos <br> Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng <br><br> [Track-Anything (Project)] | | [Github] <br> [Demo] |
| Count Anything (Project) <br> Liqi Yan | <img width="549" alt="image" src="https://user-images.githubusercontent.com/18592211/232305466-ad68546f-b5b1-4c2a-a543-78dea66c7151.png"> | [Github] |
| Segment-and-Track-Anything (Project) <br> Zongxin Yang | <img width="954" alt="image" src="https://user-images.githubusercontent.com/18592211/232711476-895699e5-fc11-4624-a9fa-e34d84438342.png"> | [Github] |
| Pose for Everything: Towards Category-Agnostic Pose Estimation <br> Lumin Xu*, Sheng Jin*, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang <br> > CUHK, SenseTime <br> > ECCV'22 Oral <br><br> [Pose-for-Everything (Project)] | | [Github] |
| Relate Anything Model (Project) <br> Zujin Guo*, Bo Li*, Jingkang Yang*, Zijian Zhou*, Ziwei Liu <br> > MMLab@NTU <br> > VisCom Lab, KCL/TongJi | | [Github] |
| SegmentAnyRGBD (Project) <br> Jun Cen, Yizheng Wu, Xingyi Li, Jingkang Yang, Yixuan Pei, Lingdong Kong <br> > Visual Intelligence Lab@HKUST, <br> > HUST, <br> > MMLab@NTU, <br> > Smiles Lab@XJTU, <br> > NUS | | [Github] |
| Retrieve Any Object via Prompt-based Tracking <br> Pha Nguyen, Kha Gia Quach, Kris Kitani, Khoa Luu <br> > CVIU@UArk, <br> > pdActive Inc., <br> > RI@CMU | | [ArXiv] <br> [Page] |
| FoodSAM (Project) <br> Xing Lan, Jiayi Lyu, Hanyu Jiang, Kun Dong, Zehai Niu, Yi Zhang, Jian Xue <br> > UCAS | | [Github] <br> [Page] <br> [ArXiv] |

<br><br>

AnyGeneration

| Title & Authors | Intro | Useful Links |
| --- | --- | --- |
| High-Resolution Image Synthesis with Latent Diffusion Models <br> Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer <br> > LMU München, Runway ML <br> > CVPR'22 <br><br> [Stable-Diffusion (Project)] | | [Github] <br> [Page] <br> [Demo] |
| Adding Conditional Control to Text-to-Image Diffusion Models <br> Lvmin Zhang, Maneesh Agrawala <br> > Stanford University <br> > Preprint'23 <br><br> [ControlNet (Project)] | | [Github] <br> [Demo] |
| GigaGAN: Large-scale GAN for Text-to-Image Synthesis <br> Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park <br> > POSTECH, Carnegie Mellon University, Adobe Research <br> > CVPR'23 | <img alt="image" src="https://user-images.githubusercontent.com/18592211/230898538-84da51ee-f686-422d-9892-c1c47ab10b75.png"> | [Page] |
| Inpaint-Anything: Segment Anything Meets Image Inpainting (Project) <br> Tao Yu | | [Github] |
| IEA: Image Editing Anything (Project) <br> Zhengcong Fei | | [Github] |
| EditAnything (Project) <br> Shanghua Gao, Pan Zhou | | [Github] |
| Segment Anything for Stable Diffusion Webui (Project) <br> Chengsong Zhang | <img width="659" alt="image" src="https://user-images.githubusercontent.com/18592211/231410895-eac4c4b6-ee61-487b-9333-8dcd1befc610.png"> | [Github] |
| Segment Anything with Clip (Project) <br> Jinwoo Park | | [Github] |
| ShowAnything: Edit and Generate Anything In Image and Video (Project) <br> Showlab, NUS | | [Github] |
| Transfer-Any-Style: An interactive demo based on Segment-Anything for style transfer (Project) <br> LV-Lab, NUS | | [Github] |
| Anything To Image: Generate image from anything with ImageBind and Stable Diffusion (Project) <br> Zeqiang-Lai | | [Github] |
| Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation <br> Zhihang Zhong, Gurunandan Krishnan, Xiao Sun, Yu Qiao, Sizhuo Ma, Jian Wang <br> > Shanghai AI Laboratory, Snap Inc. <br> > Preprint'23 | | [Github] <br> [Page] <br> [ArXiv] |

<br><br>

Any3D

| Title & Authors | Intro | Useful Links |
| --- | --- | --- |
| OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation <br> Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby <br> > Cambridge, HKU, HKUST <br><br> [OpenIns3D] | | [Github] <br> [Page] |
| Anything-3D: Segment-Anything + 3D, Let's lift the anything to 3D (Project) <br> LV-Lab, NUS | | [Github] |
| SAM 3D Selector: Utilizing segment-anything to help the region selection of 3D point cloud or mesh. (Project) <br> Nexuslrf | | [Github] |
| 3D-Box via Segment Anything. (Project) <br> dvlab-research | | [Github] |
| SAM3D: Segment Anything in 3D Scenes <br> Yunhan Yang, Xiaoyang Wu, Tong He, Hengshuang Zhao, Xihui Liu <br> > Shanghai AI Laboratory, HKU <br><br> [SAM3D: Segment Anything in 3D Scenes (Project)] | | [Github] |

<br><br>

AnyModel

| Title & Authors | Intro | Useful Links |
| --- | --- | --- |
| DepGraph: Towards Any Structural Pruning <br> Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, Xinchao Wang <br> > Learning and Vision Lab @ NUS <br> > CVPR'23 <br><br> [Torch-Pruning (Project)] | | [Github] <br> [Demo] |
| MQBench: Towards Reproducible and Deployable Model Quantization Benchmark <br> Yuhang Li and Mingzhu Shen and Jian Ma and Yan Ren and Mingxin Zhao and Qi Zhang and Ruihao Gong and Fengwei Yu and Junjie Yan <br> > SenseTime Research <br> > NeurIPS'21 <br><br> [MQBench (Project)] | | [Github] <br> [Page] |
| OTOv2: Automatic, Generic, User-Friendly <br> Tianyi Chen, Luming Liang, Tianyu Ding, Ilya Zharkov <br> > Microsoft <br> > ICLR'23 <br><br> [Only Train Once (Project)] | | [Github] |
| Deep Model Reassembly <br> Xingyi Yang, Daquan Zhou, Songhua Liu, Jingwen Ye, Xinchao Wang <br> > LV Lab, NUS <br> > NeurIPS'22 <br><br> [Deep Model Reassembly (Project)] | | [Github] <br> [Page] |

<br><br>

AnyTask

| Title & Authors | Intro | Useful Links |
| --- | --- | --- |
| HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace <br> Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang <br> > Zhejiang University, MSRA <br> > Preprint'23 <br><br> [Jarvis (Project)] | <img src="https://github.com/microsoft/JARVIS/raw/main/assets/overview.jpg"> | [Github] <br> [Demo] |
| TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs <br> Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan <br> > Microsoft <br> > Preprint'23 | | [Github] |
| Generalized Decoding for Pixel, Image and Language <br> Xueyan Zou, Zi-Yi Dou, Jianwei Yang, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee, Jianfeng Gao <br> > Microsoft <br> > CVPR'23 <br><br> [X-Decoder (Project)] | | [Github] <br> [Page] <br> [Demo] |
| Pre-Trained Image Processing Transformer <br> Chen, Hanting and Wang, Yunhe and Guo, Tianyu and Xu, Chang and Deng, Yiping and Liu, Zhenhua and Ma, Siwei and Xu, Chunjing and Xu, Chao and Gao, Wen <br> > Huawei-Noah <br> > CVPR'21 <br><br> [Pretrained-IPT (Project)] | | [Github] |
| OpenAGI: When LLM Meets Domain Experts <br> Yingqiang Ge, Wenyue Hua, Jianchao Ji, Juntao Tan, Shuyuan Xu, Yongfeng Zhang <br> > Rutgers University <br> > Preprint'23 <br><br> [OpenAGI (Project)] | | [Github] |

<br><br>

AnyX

| Title & Authors | Intro | Useful Links |
| --- | --- | --- |
| Caption Anything: Interactive Image Description with Diverse Multimodal Controls <br> Teng Wang, Jinrui Zhang, Junjie Fei, Hao Zheng, Yunlong Tang, Zhe Li, Mingqi Gao, Shanshan Zhao <br> > SUSTech VIP Lab <br> > Preprint'23 <br><br> [Caption Anything (Project)] | | [Github] <br> [Demo] |
| Image2Paragraph: Transform Image into Unique Paragraph (Project) <br> Jinpeng Wang | | [Github] |
...

<br><br>

Paper List for Anything AI

A paper list for Anything AI

AnyObject

| Paper | First Author | Venue | Topic |
| --- | --- | --- | --- |
| Segment Anything | Alexander Kirillov | Preprint'23 | Segmentation |
| Learning to Segment Every Thing | Ronghang Hu | CVPR'18 | |
| Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Shilong Liu | Preprint'23 | Grounding+Detection |
| SegGPT: Segmenting Everything In Context | Xinlong Wang | Preprint'23 | Segmentation |
| V3Det: Vast Vocabulary Visual Detection Dataset | Jiaqi Wang | Preprint'23 | Dataset |
| Pose for Everything: Towards Category-Agnostic Pose Estimation | Lumin Xu | ECCV'22 Oral | Pose |
| Type-to-Track: Retrieve Any Object via Prompt-based Tracking | Pha Nguyen | NeurIPS'23 | Grounding+Tracking |

AnyGeneration

| Paper | First Author | Venue | Topic |
| --- | --- | --- | --- |
| High-Resolution Image Synthesis with Latent Diffusion Models | Robin Rombach | CVPR'22 | Text-to-Image Generation |
| Adding Conditional Control to Text-to-Image Diffusion Models | Lvmin Zhang | Preprint'23 | Controllable Generation |
| GigaGAN: Large-scale GAN for Text-to-Image Synthesis | Minguk Kang | CVPR'23 | Large-scale GAN |
| Inpaint Anything: Segment Anything Meets Image Inpainting | Tao Yu | Preprint'23 | Inpainting |

AnyModel

| Paper | First Author | Venue | Topic |
| --- | --- | --- | --- |
| DepGraph: Towards Any Structural Pruning | Gongfan Fang | CVPR'23 | Network Pruning |
| MQBench: Towards Reproducible and Deployable Model Quantization Benchmark | Yuhang Li | NeurIPS'21 | Network Quantization |
| OTOv2: Automatic, Generic, User-Friendly | Tianyi Chen | ICLR'23 | Network Pruning |
| Deep Model Reassembly | Xingyi Yang | NeurIPS'22 | Model Reuse |

AnyTask

| Paper | First Author | Venue | Topic |
| --- | --- | --- | --- |
| HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace | Yongliang Shen | Preprint'23 | Modelzoo + LLM |
| TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs | Yaobo Liang | Preprint'23 | Modelzoo + LLM |
| Generalized Decoding for Pixel, Image and Language | Xueyan Zou | CVPR'23 | Multi Tasking |
| Pre-Trained Image Processing Transformer | Chen, Hanting | CVPR'21 | Low-level Vision |