OpenLRM: Open-Source Large Reconstruction Models
<img src="assets/rendered_video/teaser.gif" width="75%" height="auto"/>

<div style="text-align: left">
  <img src="assets/mesh_snapshot/crop.owl.ply00.png" width="12%" height="auto"/>
  <img src="assets/mesh_snapshot/crop.owl.ply01.png" width="12%" height="auto"/>
  <img src="assets/mesh_snapshot/crop.building.ply00.png" width="12%" height="auto"/>
  <img src="assets/mesh_snapshot/crop.building.ply01.png" width="12%" height="auto"/>
  <img src="assets/mesh_snapshot/crop.rose.ply00.png" width="12%" height="auto"/>
  <img src="assets/mesh_snapshot/crop.rose.ply01.png" width="12%" height="auto"/>
</div>

News
- [2024.03.13] Update training code and release OpenLRM v1.1.1.
- [2024.03.08] We have released the core blender script used to render Objaverse images.
- [2024.03.05] The Hugging Face demo now uses the `openlrm-mix-base-1.1` model by default. Please refer to the model card for details on the updated model architecture and training settings.
- [2024.03.04] Version update to v1.1. Released model weights trained on both Objaverse and MVImgNet. The codebase has been majorly refactored for better usability and extensibility. Please refer to v1.1.0 for details.
- [2024.01.09] Updated all v1.0 models trained on Objaverse. Please refer to HF Models and overwrite previous model weights.
- [2023.12.21] The Hugging Face demo is online. Give it a try!
- [2023.12.20] Release weights of the base and large models trained on Objaverse.
- [2023.12.20] We release OpenLRM, an open-source implementation of the paper LRM.
Setup
Installation
```bash
git clone https://github.com/3DTopia/OpenLRM.git
cd OpenLRM
```
Environment
- Install the requirements for OpenLRM first:

```bash
pip install -r requirements.txt
```

- Then follow the xFormers installation guide to enable memory-efficient attention inside the DINOv2 encoder; a pip-based sketch is shown below.
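The exact xFormers install depends on your CUDA and PyTorch combination, so defer to the official guide; as a minimal sketch, a PyPI install often looks like this:

```bash
# Sketch: install xFormers from PyPI; pin versions per the official xFormers guide
# to match your installed PyTorch and CUDA.
pip install -U xformers
```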
Quick Start
Pretrained Models
- Model weights are released on Hugging Face.
- Weights will be downloaded automatically when you run the inference script for the first time; an optional manual pre-download is sketched after the model table below.
- Please be aware of the license before using the weights.
| Model | Training Data | Layers | Feature Dim. | Triplane Dim. | Input Res. | Link |
| --- | --- | --- | --- | --- | --- | --- |
| openlrm-obj-small-1.1 | Objaverse | 12 | 512 | 32 | 224 | HF |
| openlrm-obj-base-1.1 | Objaverse | 12 | 768 | 48 | 336 | HF |
| openlrm-obj-large-1.1 | Objaverse | 16 | 1024 | 80 | 448 | HF |
| openlrm-mix-small-1.1 | Objaverse + MVImgNet | 12 | 512 | 32 | 224 | HF |
| openlrm-mix-base-1.1 | Objaverse + MVImgNet | 12 | 768 | 48 | 336 | HF |
| openlrm-mix-large-1.1 | Objaverse + MVImgNet | 16 | 1024 | 80 | 448 | HF |
Model cards with additional details can be found in model_card.md.
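If you want to fetch weights ahead of time rather than relying on the automatic download, a minimal sketch using the `huggingface-cli` tool from `huggingface_hub` is shown below; the repository id comes from the table above, and files land in the standard Hugging Face cache.

```bash
# Optional sketch: pre-download a model into the local Hugging Face cache.
pip install -U huggingface_hub
huggingface-cli download zxhezexin/openlrm-mix-base-1.1
```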
Prepare Images
- We put some sample inputs under `assets/sample_input`, and you can quickly try them.
- Prepare RGBA images or RGB images with a white background (using background removal tools such as Rembg or Clipdrop); a minimal Rembg sketch follows this list.
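As one possible way to produce such inputs locally, the sketch below uses the Rembg command-line tool; this is an illustrative assumption rather than part of OpenLRM, and the file names are hypothetical.

```bash
# Hypothetical sketch: strip the background from a photo with Rembg,
# producing an RGBA PNG that can be used as OpenLRM input.
pip install "rembg[cli]"
rembg i my_photo.jpg my_photo_rgba.png
```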
Inference
- Run the inference script to get 3D assets.
- You may specify which outputs to generate by setting the flags `EXPORT_VIDEO=true` and `EXPORT_MESH=true`.
- Please set `INFER_CONFIG` according to the model you want to use, e.g., `infer-b.yaml` for base models and `infer-s.yaml` for small models.
- An example usage is as follows:

```bash
# Example usage
EXPORT_VIDEO=true
EXPORT_MESH=true
INFER_CONFIG="./configs/infer-b.yaml"
MODEL_NAME="zxhezexin/openlrm-mix-base-1.1"
IMAGE_INPUT="./assets/sample_input/owl.png"

python -m openlrm.launch infer.lrm --infer $INFER_CONFIG model_name=$MODEL_NAME image_input=$IMAGE_INPUT export_video=$EXPORT_VIDEO export_mesh=$EXPORT_MESH
```
Tips
- The recommended PyTorch version is
>=2.1
. Code is developed and tested under PyTorch2.1.2
. - If you encounter CUDA OOM issues, please try to reduce the
frame_size
in the inference configs. - You should be able to see
UserWarning: xFormers is available
ifxFormers
is actually working.
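To quickly confirm the installed PyTorch version and that xFormers imports cleanly, a small check along these lines may help (a sketch, not part of the OpenLRM tooling):

```bash
# Sketch: print the installed PyTorch and xFormers versions.
python -c "import torch, xformers; print(torch.__version__, xformers.__version__)"
```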
Training
Configuration
- We provide a sample accelerate config file under `configs/accelerate-train.yaml`, which defaults to 8 GPUs with `bf16` mixed precision.
- You may modify the configuration file to fit your own environment, or override individual settings from the command line as sketched below.
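If you prefer not to edit the file, `accelerate launch` also accepts command-line overrides; the sketch below assumes a 2-GPU machine and overrides the process count and mixed-precision mode, reusing the training command shown further down.

```bash
# Sketch: override accelerate config values on the command line (2 GPUs is an assumption).
ACC_CONFIG="./configs/accelerate-train.yaml"
TRAIN_CONFIG="./configs/train-sample.yaml"

accelerate launch --config_file $ACC_CONFIG --num_processes 2 --mixed_precision bf16 \
  -m openlrm.launch train.lrm --config $TRAIN_CONFIG
```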
Data Preparation
- We provide the core Blender script used to render Objaverse images (a hypothetical headless invocation is sketched after this list).
- Please refer to Objaverse Rendering for other scripts including distributed rendering.
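Blender scripts of this kind are usually run headless. The sketch below is only a hypothetical invocation pattern: the script name and the arguments after `--` are assumptions, not the actual interface of the provided script, so check the script itself and the Objaverse Rendering repository for the real options.

```bash
# Hypothetical sketch of a headless Blender render call; the script name and
# the arguments after "--" are assumptions, not the provided script's interface.
blender --background --python render_objaverse.py -- \
  --object_path ./data/objaverse/example.glb \
  --output_dir ./renders/example
```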
Run Training
- A sample training config file is provided under `configs/train-sample.yaml`.
- Please replace the data-related paths in the config file with your own paths and customize the training settings.
- An example training usage is as follows:

```bash
# Example usage
ACC_CONFIG="./configs/accelerate-train.yaml"
TRAIN_CONFIG="./configs/train-sample.yaml"

accelerate launch --config_file $ACC_CONFIG -m openlrm.launch train.lrm --config $TRAIN_CONFIG
```
Inference on Trained Models
- The inference pipeline is compatible with Hugging Face utilities for convenience.
- You need to convert the training checkpoint to an inference model by running the following script:

```bash
python scripts/convert_hf.py --config <YOUR_EXACT_TRAINING_CONFIG> convert.global_step=null
```

- The converted model will be saved under `exps/releases` by default and can be used for inference following the inference guide above; a hedged example is sketched below.
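For instance, inference on a converted checkpoint might look like the sketch below, where the directory under `exps/releases` is a hypothetical placeholder that you should replace with your actual converted model path.

```bash
# Sketch: run inference with a locally converted model (the model path is a placeholder).
EXPORT_VIDEO=true
EXPORT_MESH=true
INFER_CONFIG="./configs/infer-b.yaml"
MODEL_NAME="./exps/releases/<CONVERTED_MODEL_DIR>"
IMAGE_INPUT="./assets/sample_input/owl.png"

python -m openlrm.launch infer.lrm --infer $INFER_CONFIG model_name=$MODEL_NAME image_input=$IMAGE_INPUT export_video=$EXPORT_VIDEO export_mesh=$EXPORT_MESH
```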
Acknowledgement
- We thank the authors of the original paper for their great work! Special thanks to Kai Zhang and Yicong Hong for assistance during the reproduction.
- This project is supported by Shanghai AI Lab, which provided the computing resources.
- This project is advised by Ziwei Liu and Jiaya Jia.
Citation
If you find this work useful for your research, please consider citing:
@article{hong2023lrm,
title={Lrm: Large reconstruction model for single image to 3d},
author={Hong, Yicong and Zhang, Kai and Gu, Jiuxiang and Bi, Sai and Zhou, Yang and Liu, Difan and Liu, Feng and Sunkavalli, Kalyan and Bui, Trung and Tan, Hao},
journal={arXiv preprint arXiv:2311.04400},
year={2023}
}
@misc{openlrm,
title = {OpenLRM: Open-Source Large Reconstruction Models},
author = {Zexin He and Tengfei Wang},
year = {2023},
howpublished = {\url{https://github.com/3DTopia/OpenLRM}},
}
License
- OpenLRM as a whole is licensed under the Apache License, Version 2.0, while certain components are covered by NVIDIA's proprietary license. Users are responsible for complying with the respective licensing terms of each component.
- Model weights are licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. They are provided for research purposes only, and CANNOT be used commercially.