Awesome

ShowAnything

Edit and Generate Anything In Any Image and Video.

Motivation of this repo: Segment Anything offers powerful perception capabilities and interfaces with points and boxes. We believe that the interface ability has the potential to greatly drive the development of generation and editing applications. We will be posting some application developments from our lab here that are compatible with both Segment Anything and Generation, thanks.

ToDo

Customization (LoRA or Dreambooth)
Video editing with point interaction
Hugging Face Demo
...

Installation

Please follow the segment anything to creating the environment and download the model checkpoint (vit_l, vit_l or vit_b)

Image

Hugging Face Demo is now available, but please note that it may run slowly due to being currently executed on a low-end CPU.

Edit with point click

The ultimate goal of this task is to achieve control over the generation of complex scenes, such as dense crowds and department stores.

See the here. Using points to edit and control. Enjoy it!

Object Merge

Object Remove

Video

Zero-shot Editing Anything in Any Video

Replace the person in the video:

Customize the clothes of the person in the video:

More results:

<table class="center"> <tr> <td style="text-align:center;">Input Video</td> <td style="text-align:center;" colspan="3">Output Video</td> </tr> <tr> <td><img src="./assets/video/original.gif"></td> <td><img src="./assets/video/A soldier is dancing_crop.gif"></td> <td><img src="./assets/video/Donald Trump is dancing_crop.gif"></td> <td><img src="./assets/video/Iron Man is dancing_crop.gif"></td> </tr> <tr> <td width=25% style="text-align:center;color:gray;">"Input Video"</td> <td width=25% style="text-align:center;">"A soldier is dancing"</td> <td width=25% style="text-align:center;">"Donald Trump is dancing"</td> <td width=25% style="text-align:center;">"Iron Man is dancing"</td> </tr> </table> <table class="center"> <tr> <td style="text-align:center;">Input Video</td> <td style="text-align:center;" colspan="3">Output Video</td> </tr> <tr> <td><img src="./assets/video/original.gif"></td> <td><img src="./assets/video/A man is dancing in skirt and wearing black stockings.gif"></td> <td><img src="./assets/video/A man is dancing in jeans.gif"></td> <td><img src="./assets/video/A man is dancing in shorts.gif"></td> </tr> <tr> <td width=25% style="text-align:center;color:gray;">"Input Video"</td> <td width=25% style="text-align:center;">"A man is dancing in skirt and wearing black stockings"</td> <td width=25% style="text-align:center;">"A man is dancing in jeans"</td> <td width=25% style="text-align:center;">"A man is dancing in shorts"</td> </tr> </table>

Transform Image Into Unique Paragraph

Image to paragraph is available, please refer to Image2Paragraph, and enjoy!

Segment Anything for 3D

Image to 3D using SAM is available, please refer to Anything3D, and enjoy!

https://user-images.githubusercontent.com/37300524/232232693-ad3b0329-ab83-48e1-8f7e-dca9dff7b32e.mp4

Acknowledgements

Citation

If you find this project helpful for your research, please consider citing the following BibTeX entry.

@article{kirillov2023segany,
  title={Segment Anything}, 
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}