SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

Project page | Paper | Live Demo

News

Preparation for inference

  1. Install the packages in requirements.txt. We test our model on a 40G A100 GPU with CUDA 11.1 and PyTorch 1.10.2, but inference on GPUs with smaller memory (around 10G) is possible.
conda create -n syncdreamer
conda activate syncdreamer
pip install -r requirements.txt
  2. Download checkpoints here.
  3. A docker env can be found at https://hub.docker.com/repository/docker/liuyuanpal/syncdreamer-env/general.

Inference

  1. Make sure you have the following models.
SyncDreamer
|-- ckpt
    |-- ViT-L-14.ckpt
    |-- syncdreamer-pretrain.ckpt
  2. (Optional) Predict a foreground mask as the alpha channel. We use Paint3D to segment the foreground object interactively. We also provide a script, foreground_segment.py, which uses carvekit to predict foreground masks; you need to crop the object region first before feeding the image to foreground_segment.py (see the cropping sketch after the command below). You may want to double-check that the predicted masks are correct.
python foreground_segment.py --input <image-file-to-input> --output <image-file-in-png-format-to-output>
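The cropping step itself is not automated here. As a minimal, hypothetical sketch (file names and the bounding box are illustrative, not from the repository), the crop can be done with PIL before running foreground_segment.py:

# crop_object.py -- hypothetical helper, not part of this repository
from PIL import Image
def crop_object(input_path, output_path, box):
    # box = (left, upper, right, lower) in pixels, chosen by eye around the object
    img = Image.open(input_path).convert("RGB")
    img.crop(box).save(output_path)
# the paths and bounding box below are illustrative only
crop_object("testset/aircraft_raw.png", "testset/aircraft_cropped.png", box=(120, 80, 620, 580))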
  3. Run SyncDreamer to produce multiview-consistent images.
python generate.py --ckpt ckpt/syncdreamer-pretrain.ckpt \
                   --input testset/aircraft.png \
                   --output output/aircraft \
                   --sample_num 4 \
                   --cfg_scale 2.0 \
                   --elevation 30 \
                   --crop_size 200

Explanation:

--ckpt is the checkpoint to load. --input is the input image in RGBA form; the alpha channel marks the foreground object. --output is the output directory. --sample_num is the number of instances to generate. --cfg_scale is the classifier-free guidance scale; 2.0 works well for most cases. --elevation is the elevation angle of the input view in degrees. --crop_size controls how large the foreground object appears after the input image is resized and padded for the model.

  4. Run a NeuS or a NeRF for 3D reconstruction.
# train a neus
python train_renderer.py -i output/aircraft/0.png \
                         -n aircraft-neus \
                         -b configs/neus.yaml \
                         -l output/renderer 
# train a nerf
python train_renderer.py -i output/aircraft/0.png \
                         -n aircraft-nerf \
                         -b configs/nerf.yaml \
                         -l output/renderer

Explanation:

-i points to the generated multiview image (here output/aircraft/0.png). -n names the run. -b selects the base config (configs/neus.yaml for NeuS or configs/nerf.yaml for NeRF). -l sets the directory where logs and the reconstructed model are written.

Preparation for training

  1. Generate renderings for training. We provide several Objaverse 3D models as examples here. The whole Objaverse dataset can be downloaded at Objaverse. To unzip the multi-part random dataset, concatenate the .z01 part with the .zip file (cat <name>.z01 <name>.zip > <combined>.zip) and then unzip the combined archive, following the description here.
# generate renderings for fixed target views
blender --background --python blender_script.py -- \
  --object_path objaverse_examples/6f99fb8c2f1a4252b986ed5a765e1db9/6f99fb8c2f1a4252b986ed5a765e1db9.glb \
  --output_dir ./training_examples/target --camera_type fixed
  
# generate renderings for random input views
blender --background --python blender_script.py -- \
  --object_path objaverse_examples/6f99fb8c2f1a4252b986ed5a765e1db9/6f99fb8c2f1a4252b986ed5a765e1db9.glb \
  --output_dir ./training_examples/input --camera_type random
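Rendering the whole dataset means invoking this Blender script once per object; render_batch.py in the repository handles the batching. The following is only a rough sketch of the idea, looping the two commands above over the example .glb files with subprocess, and is not the repository's implementation:

# batch_render_sketch.py -- illustrative only; see render_batch.py for the actual batching
import glob
import subprocess
for glb in sorted(glob.glob("objaverse_examples/*/*.glb")):
    for camera_type, out_dir in (("fixed", "./training_examples/target"),
                                 ("random", "./training_examples/input")):
        subprocess.run(["blender", "--background", "--python", "blender_script.py", "--",
                        "--object_path", glb,
                        "--output_dir", out_dir,
                        "--camera_type", camera_type], check=True)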
  2. Organize the renderings like the following. We provide rendering examples here.
SyncDreamer
|-- training_examples
    |-- target
        |-- <renderings-of-uid-0>
        |-- <renderings-of-uid-1>
        |-- ...
    |-- input
        |-- <renderings-of-uid-0>
        |-- <renderings-of-uid-1>
        |-- ...
    |-- uid_set.pkl # this is a .pkl file containing a list of uids. Refer to `render_batch.py` for how I generate these files.
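render_batch.py is the authoritative source for uid_set.pkl; purely to illustrate the expected format (a pickled Python list of uid strings), a minimal sketch could build it from the sub-directory names under training_examples/target. Treating directory names as uids is an assumption and may not match render_batch.py exactly:

# make_uid_set.py -- illustrative only
import os, pickle
target_dir = "training_examples/target"
# treat each sub-directory of target renderings as one uid
uids = sorted(d for d in os.listdir(target_dir) if os.path.isdir(os.path.join(target_dir, d)))
with open("training_examples/uid_set.pkl", "wb") as f:
    pickle.dump(uids, f)
print(f"wrote {len(uids)} uids")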
  3. Download the pretrained zero123-xl model here.
  4. The whole training set for SyncDreamer is here.

Training

python train_syncdreamer.py -b configs/syncdreamer-train.yaml \
                           --finetune_from <path-to-your-zero123-xl-model> \
                           -l <logging-directory>  \
                           -c <checkpoint-directory> \
                           --gpus 0,1,2,3,4,5,6,7

Note that in configs/syncdreamer-train.yaml, we specify the following directories, which contain the training data and the validation data:

target_dir: training_examples/target
input_dir: training_examples/input
uid_set_pkl: training_examples/uid_set.pkl
validation_dir: validation_set
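If you adapt these paths, an optional sanity check before launching training can catch typos early. This is a convenience sketch, not part of the repository; it assumes the config is readable with PyYAML and that the four keys appear somewhere in the (possibly nested) YAML:

# check_dirs.py -- optional convenience sketch, assumes PyYAML is installed
import os
import yaml
def find(node, key):
    # depth-first search for key in nested dicts
    if isinstance(node, dict):
        if key in node:
            return node[key]
        for v in node.values():
            found = find(v, key)
            if found is not None:
                return found
    return None
cfg = yaml.safe_load(open("configs/syncdreamer-train.yaml"))
for key in ("target_dir", "input_dir", "uid_set_pkl", "validation_dir"):
    path = find(cfg, key)
    print(key, path, "ok" if path and os.path.exists(path) else "MISSING")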

During training, validation runs every 1k steps and writes images to <log_dir>/<images>/val.

Evaluation

GT meshes and renderings for the GSO dataset can be found here.

  1. Evaluate COLMAP reconstruction:
python eval_colmap.py --dir eval_examples/chicken-pr --project eval_examples/chicken-project --name chicken --colmap <path-to-your-colmap>

Note that the 16 views are relatively sparse for COLMAP, so it sometimes fails to reconstruct.

  2. Evaluate novel view synthesis: pip install lpips and run

python eval_nvs.py --gt eval_examples/chicken-gt --pr eval_examples/chicken-pr 
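eval_nvs.py reports image metrics between the ground-truth renderings (--gt) and the predicted views (--pr). As a rough illustration of the kind of computation involved, and not the script's exact protocol, PSNR and LPIPS for a single image pair could be computed as follows (file names are illustrative):

# nvs_metrics_sketch.py -- illustrative only, not the repository's eval_nvs.py
import numpy as np
import torch
import lpips
from PIL import Image
def load(path):
    # RGB image as float32 in [0, 1], shape (H, W, 3)
    return np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
gt = load("eval_examples/chicken-gt/0.png")   # illustrative file name
pr = load("eval_examples/chicken-pr/0.png")   # illustrative file name
psnr = 10.0 * np.log10(1.0 / max(np.mean((gt - pr) ** 2), 1e-10))
to_tensor = lambda x: torch.from_numpy(x).permute(2, 0, 1)[None] * 2.0 - 1.0  # NCHW in [-1, 1]
lpips_val = lpips.LPIPS(net="vgg")(to_tensor(gt), to_tensor(pr)).item()
print(f"PSNR: {psnr:.2f} dB, LPIPS: {lpips_val:.4f}")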
  3. Evaluate the mesh quality: pip install mesh2sdf, install nvdiffrast (here), and then run
python eval_mesh.py --pr_mesh eval_examples/chicken-pr.ply --pr_name syncdreamer --gt_dir eval_examples/chicken-gt --gt_mesh eval_examples/chicken-mesh/meshes/model.obj --gt_name chicken

Note we manually rotate the example when rendering. The rotations are listed in get_gt_rotate_angle in eval_mesh.py.
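eval_mesh.py compares the predicted mesh against the ground-truth mesh after this alignment, using mesh2sdf and nvdiffrast. Purely as an illustration of a typical geometry metric, and not the repository's exact pipeline, a symmetric Chamfer distance between the two (pre-aligned) meshes can be sketched with trimesh and SciPy:

# chamfer_sketch.py -- illustrative only; eval_mesh.py is the authoritative metric
import trimesh
from scipy.spatial import cKDTree
def chamfer(mesh_a, mesh_b, n=30000):
    # sample points on both surfaces, average nearest-neighbor distance both ways
    pa, _ = trimesh.sample.sample_surface(mesh_a, n)
    pb, _ = trimesh.sample.sample_surface(mesh_b, n)
    return cKDTree(pb).query(pa)[0].mean() + cKDTree(pa).query(pb)[0].mean()
pr = trimesh.load("eval_examples/chicken-pr.ply", force="mesh")
gt = trimesh.load("eval_examples/chicken-mesh/meshes/model.obj", force="mesh")
print(f"Chamfer distance: {chamfer(pr, gt):.4f}")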

Acknowledgement

We have intensively borrowed code from the following repositories. Many thanks to the authors for sharing their code.

Citation

If you find this repository useful in your project, please cite the following work. :)

@article{liu2023syncdreamer,
  title={SyncDreamer: Generating Multiview-consistent Images from a Single-view Image},
  author={Liu, Yuan and Lin, Cheng and Zeng, Zijiao and Long, Xiaoxiao and Liu, Lingjie and Komura, Taku and Wang, Wenping},
  journal={arXiv preprint arXiv:2309.03453},
  year={2023}
}