3D-Adapter
Official PyTorch implementation of the papers:
3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation <br> Hansheng Chen<sup>1</sup>, Bokui Shen<sup>2</sup>, Yulin Liu<sup>3,4</sup>, Ruoxi Shi<sup>3</sup>, Linqi Zhou<sup>2</sup>, Connor Z. Lin<sup>2</sup>, Jiayuan Gu<sup>3</sup>, Hao Su<sup>3,4</sup>, Gordon Wetzstein<sup>1</sup>, Leonidas Guibas<sup>1</sup><br> <sup>1</sup>Stanford University, <sup>2</sup>Apparate Labs, <sup>3</sup>UCSD, <sup>4</sup>Hillbot <br> [Project page] [🤗Demo] [Paper]
Generic 3D Diffusion Adapter Using Controlled Multi-View Editing <br> Hansheng Chen<sup>1</sup>, Ruoxi Shi<sup>2</sup>, Yulin Liu<sup>2</sup>, Bokui Shen<sup>3</sup>, Jiayuan Gu<sup>2</sup>, Gordon Wetzstein<sup>1</sup>, Hao Su<sup>2</sup>, Leonidas Guibas<sup>1</sup><br> <sup>1</sup>Stanford University, <sup>2</sup>UCSD, <sup>3</sup>Apparate Labs <br> [Project page] [🤗Demo] [Paper]
https://github.com/user-attachments/assets/6cba3a92-04fe-46ee-88ca-e6dfe5443c36
Todos
- Release GRM-based 3D-Adapters (unfortunately, we cannot release these models before the official release of GRM)
Installation
The code has been tested in the following environment:
- Linux (tested on Ubuntu 20 and above)
- CUDA Toolkit 11.8 and above
- PyTorch 2.1 and above
- FFmpeg, x264 (optional, for exporting videos)
Other dependencies can be installed via pip install -r requirements.txt.
An example of the installation commands is shown below (adjust the CUDA version to match your setup):
# Export the PATH of CUDA toolkit
export PATH=/usr/local/cuda-12.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64:$LD_LIBRARY_PATH
# Create conda environment
conda create -y -n mvedit python=3.10
conda activate mvedit
# Install FFmpeg (optional)
conda install -c conda-forge ffmpeg x264
# Install PyTorch
conda install pytorch==2.1.2 torchvision==0.16.2 pytorch-cuda=12.1 -c pytorch -c nvidia
# Clone this repo and install other dependencies
git clone https://github.com/Lakonik/MVEdit && cd MVEdit
pip install -r requirements.txt
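After installation, a quick way to confirm that the CUDA-enabled PyTorch build works on your machine is a check like the one below (a generic PyTorch sanity check, not part of this repo):
# Generic sanity check (not part of this repo): print the installed PyTorch
# version and whether it can see a CUDA-capable GPU
import torch
print(torch.__version__)           # e.g. 2.1.2
print(torch.cuda.is_available())   # should print True on a correctly configured GPU machine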
This codebase also works on Windows systems if the environment is configured correctly. Please refer to Issue #8 for more information about the environment setup on Windows.
Inference
We recommend using the Gradio Web UI and its APIs for inference. A GPU with at least 24GB of VRAM is required to run the Web UI.
Web UI
Run the following command to start the Web UI:
python app.py --unload-models
The Web UI will be available at http://localhost:7860. If you add the --share flag, a temporary public URL will be generated so that you can share the Web UI with others.
All models will be automatically loaded on demand. The first run will take a very long time to download the models. Check your network connection to GitHub, Google Drive and Hugging Face if the download fails.
To view other options, run:
python app.py -h
API
After starting the Web UI, the API docs will be available at http://localhost:7860/?view=api. The docs are automatically generated by Gradio, and the data types and default values may be incorrect. Please use the default values in the Web UI as a reference.
Please refer to our examples for API usage with Python.
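As a minimal sketch, the API can be called from Python with the official gradio_client package. The endpoint name and arguments below are placeholders; query the actual endpoint names and parameter order from the API docs page (or client.view_api()), and mirror the default values shown in the Web UI:
from gradio_client import Client

# Connect to a running Web UI instance (use the public URL if launched with --share)
client = Client("http://localhost:7860/")

# Print the available endpoints and their expected arguments
client.view_api()

# Hypothetical call: replace api_name and the arguments with a real endpoint
# and values taken from the API docs / Web UI defaults
result = client.predict(
    "a wooden treasure chest",  # example text prompt (placeholder argument)
    api_name="/text_to_3d",     # placeholder endpoint name; check the API docs
)
print(result)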
Training
Optimization-based 3D-Adapters (a.k.a. MVEdit adapters) use only off-the-shelf models and require no further training.
The training code for GRM-based 3D-Adapters will be released after the official release of GRM.
Acknowledgements
This codebase is built upon the following repositories:
- Base library modified from SSDNeRF
- NeRF renderer and DMTet modified from Stable-DreamFusion
- Gaussian Splatting renderer modified from 3DGS and Differential Gaussian Rasterization
- Mesh I/O modified from DreamGaussian
- GRM for Gaussian reconstruction
- Zero123++ for image-to-3D initialization
- IP-Adapter for extra conditioning
- TRACER for background removal
- LoFTR for pose estimation in image-to-3D
- Omnidata for normal prediction in image-to-3D
- Image Packer for mesh preprocessing
Citation
@misc{3dadapter2024,
    title={3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation},
    author={Hansheng Chen and Bokui Shen and Yulin Liu and Ruoxi Shi and Linqi Zhou and Connor Z. Lin and Jiayuan Gu and Hao Su and Gordon Wetzstein and Leonidas Guibas},
    year={2024},
    eprint={2410.18974},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2410.18974},
}

@misc{mvedit2024,
    title={Generic 3D Diffusion Adapter Using Controlled Multi-View Editing},
    author={Hansheng Chen and Ruoxi Shi and Yulin Liu and Bokui Shen and Jiayuan Gu and Gordon Wetzstein and Hao Su and Leonidas Guibas},
    year={2024},
    eprint={2403.12032},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}