Segment Anything🤖 in 3D with NeRFs (SA3D)

Project Page | arXiv Paper

Segment Anything in 3D with NeRFs
Jiazhong Cen<sup>1*</sup>, Zanwei Zhou<sup>1*</sup>, Jiemin Fang<sup>2,3†</sup>, Chen Yang<sup>1</sup>, Wei Shen<sup>1✉</sup>, Lingxi Xie<sup>2</sup>, Dongsheng Jiang<sup>2</sup>, Xiaopeng Zhang<sup>2</sup>, Qi Tian<sup>2</sup>
<sup>1</sup>AI Institute, SJTU   <sup>2</sup>Huawei Inc   <sup>3</sup>School of EIC, HUST.
*denotes equal contribution
†denotes project lead.

Given a NeRF, just input prompts from a single view to get your 3D model.
<img src="imgs/SA3D.gif" width="800">

We propose a novel framework to Segment Anything in 3D, named <b>SA3D</b>. Given a neural radiance field (NeRF) model, SA3D allows users to obtain the 3D segmentation result of any target object via only <b>one-shot</b> manual prompting in a single rendered view. The entire process for obtaining the target 3D model can be completed in approximately 2 minutes, even without any engineering optimization. Our experiments demonstrate the effectiveness of SA3D in different scenes, highlighting the potential of SAM in 3D scene perception.

Update

Overall Pipeline

SA3D_pipeline

Given input prompts, SAM cuts out the target object from the corresponding view. The resulting 2D segmentation mask is projected onto 3D mask grids via density-guided inverse rendering. 2D masks are then rendered from other views; these are mostly incomplete, but they serve as cross-view self-prompts that are fed into SAM again. The complete masks SAM returns are in turn projected onto the mask grids. This procedure is repeated iteratively until accurate 3D masks are learned. SA3D adapts to various radiance fields effectively without any additional redesign.
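The loop described above can be illustrated on toy data. What follows is a minimal pure-Python sketch, not the repository's implementation. The per-sample rendering weights, the voxel indexing, the loss, the value of `lam`, and the names `render_mask`, `project_mask`, and `pick_self_prompt` are all assumptions made for illustration. SAM itself is replaced by a hand-written 0/1 mask.

```python
from typing import List, Optional, Sequence

Rays = Sequence[Sequence[float]]    # weights[r][s]: rendering weight of sample s on ray r
VoxelIds = Sequence[Sequence[int]]  # voxel_ids[r][s]: voxel hit by sample s on ray r


def render_mask(weights: Rays, voxel_ids: VoxelIds, mask_grid: Sequence[float]) -> List[float]:
    """Render one mask score per ray: M[r] = sum_s weights[r][s] * mask_grid[voxel_ids[r][s]]."""
    return [
        sum(w * mask_grid[v] for w, v in zip(ray_w, ray_v))
        for ray_w, ray_v in zip(weights, voxel_ids)
    ]


def project_mask(weights, voxel_ids, mask_grid, sam_mask, lam=0.15, lr=1.0) -> List[float]:
    """One gradient-descent step on the loss assumed for this sketch:
    L = -sum_r sam[r] * M[r] + lam * sum_r (1 - sam[r]) * M[r]."""
    grad = [0.0] * len(mask_grid)
    for ray_w, ray_v, inside in zip(weights, voxel_ids, sam_mask):
        dl_dm = -inside + lam * (1 - inside)  # dL/dM[r]
        for w, v in zip(ray_w, ray_v):
            grad[v] += dl_dm * w              # dM[r]/dV[v] = w
    return [value - lr * g for value, g in zip(mask_grid, grad)]


def pick_self_prompt(rendered: Sequence[float]) -> Optional[int]:
    """Index of the highest-scoring ray, or None if no ray scores above zero."""
    best = max(range(len(rendered)), key=lambda r: rendered[r])
    return best if rendered[best] > 0 else None


# Toy scene: 3 voxels, mask grid initialised to zero.
grid = [0.0, 0.0, 0.0]

# View A: ray 0 mostly sees voxel 0, ray 1 mostly sees voxel 2; the stand-in "SAM" mask keeps ray 0.
view_a_w, view_a_ids, sam_a = [[0.9, 0.1], [0.8, 0.2]], [[0, 1], [2, 1]], [1, 0]
grid = project_mask(view_a_w, view_a_ids, grid, sam_a)

# View B: render the current 3D mask and choose a point prompt for the next SAM call.
view_b_w, view_b_ids = [[0.7, 0.3], [0.2, 0.8]], [[0, 2], [1, 2]]
rendered_b = render_mask(view_b_w, view_b_ids, grid)
print(rendered_b, pick_self_prompt(rendered_b))
```

In this toy setup, the voxel seen by the prompted ray ends up with a positive mask value and the voxel seen only by the unprompted ray ends up negative. The ray in view B that mostly passes through the positive voxel is therefore the one selected as the next prompt.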

Installation

git clone https://github.com/Jumpat/SegmentAnythingin3D.git
cd SegmentAnythingin3D

conda create -n sa3d python=3.10
conda activate sa3d
pip install -r requirements.txt

SAM and Grounding-DINO:

# Installing SAM
mkdir dependencies; cd dependencies 
mkdir sam_ckpt; cd sam_ckpt
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything; pip install -e .

# Installing Grounding-DINO
git clone https://github.com/IDEA-Research/GroundingDINO.git
cd GroundingDINO/; pip install -e .
mkdir weights; cd weights
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
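After running the commands above, it can be handy to confirm that both checkpoint files were actually downloaded. The following is a small sketch using only the Python standard library. The helper name `find_checkpoints` is hypothetical and not part of this repository. It searches recursively because the exact locations the SA3D scripts expect are not stated here.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_CHECKPOINTS = ("sam_vit_h_4b8939.pth", "groundingdino_swint_ogc.pth")


def find_checkpoints(root, names=EXPECTED_CHECKPOINTS):
    """Map each expected checkpoint name to the first match under root, or None."""
    root = Path(root)
    found = {}
    for name in names:
        matches = sorted(root.rglob(name)) if root.is_dir() else []
        found[name] = matches[0] if matches else None
    return found


if __name__ == "__main__":
    # Assumes the script is run from the repository root, where `dependencies` was created.
    for name, path in find_checkpoints("dependencies").items():
        print(f"{name}: {path if path else 'MISSING'}")
```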

Download Data

We currently provide configs for the following datasets:

Data structure:

<details> <summary> (click to expand) </summary>
data
├── 360_v2             # Link: https://jonbarron.info/mipnerf360/
│   └── [bicycle|bonsai|counter|garden|kitchen|room|stump]
│       ├── poses_bounds.npy
│       └── [images|images_2|images_4|images_8]
│
├── nerf_llff_data     # Link: https://drive.google.com/drive/folders/14boI-o5hGO9srnWaaogTU5_ji7wkX2S7
│   └── [fern|flower|fortress|horns|leaves|orchids|room|trex]
│       ├── poses_bounds.npy
│       └── [images|images_2|images_4|images_8]
│
└── lerf_data          # Link: https://drive.google.com/drive/folders/1vh0mSl7v29yaGsxleadcj-LCZOE_WEWB
    └── [book_store|bouquet|donuts|...]
        ├── transforms.json
        └── [images|images_2|images_4|images_8]
</details>
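To check that downloaded scenes match the layout above before training, something like the following standard-library sketch may help. The function names `check_scene` and `check_data_root` are hypothetical. The bracketed image folders are read here as alternatives, so at least one must exist; that reading is an assumption.

```python
from pathlib import Path

# Pose file expected inside each scene folder, per dataset folder (from the tree above).
POSE_FILES = {
    "360_v2": "poses_bounds.npy",
    "nerf_llff_data": "poses_bounds.npy",
    "lerf_data": "transforms.json",
}
IMAGE_DIRS = ("images", "images_2", "images_4", "images_8")


def check_scene(scene_dir, pose_file):
    """Return a list of problems found in one scene folder (empty if none)."""
    scene_dir = Path(scene_dir)
    problems = []
    if not (scene_dir / pose_file).is_file():
        problems.append(f"missing {pose_file}")
    # Assumption: a scene needs at least one of the image folders, not all of them.
    if not any((scene_dir / d).is_dir() for d in IMAGE_DIRS):
        problems.append("no image folder (expected one of: " + ", ".join(IMAGE_DIRS) + ")")
    return problems


def check_data_root(data_root="data"):
    """Check every scene under the known dataset folders; return {scene_path: problems}."""
    report = {}
    for dataset, pose_file in POSE_FILES.items():
        dataset_dir = Path(data_root) / dataset
        if not dataset_dir.is_dir():
            continue  # dataset not downloaded, nothing to check
        for scene_dir in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
            report[str(scene_dir)] = check_scene(scene_dir, pose_file)
    return report


if __name__ == "__main__":
    for scene, problems in check_data_root("data").items():
        print(scene, "OK" if not problems else "; ".join(problems))
```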

Usage

Some tips for running SA3D:

Using our Dash-based GUI:

TODO List

Some Visualization Samples

SA3D can handle various scenes for 3D segmentation. Find more demos on our project page.

| Forward facing | 360° | Multi-objects |
| :---: | :---: | :---: |
| <img src="imgs/horns.gif" width="200"> | <img src="imgs/lego.gif" width="200"> | <img src="imgs/orchid_multi.gif" width="200"> |

Acknowledgements

Thanks to the following projects for their valuable contributions:

Citation

If you find this project helpful for your research, please consider citing the report and giving a ⭐.

@inproceedings{cen2023segment,
      title={Segment Anything in 3D with NeRFs}, 
      author={Jiazhong Cen and Zanwei Zhou and Jiemin Fang and Chen Yang and Wei Shen and Lingxi Xie and Dongsheng Jiang and Xiaopeng Zhang and Qi Tian},
      booktitle    = {NeurIPS},
      year         = {2023},
}