
Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

CVPR 2024 (Oral, Best Paper Award Candidate)

This repository represents the official implementation of the paper titled "Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation".



Bingxin Ke, Anton Obukhov, Shengyu Huang, Nando Metzger, Rodrigo Caye Daudt, Konrad Schindler

We present Marigold, a diffusion model and associated fine-tuning protocol for monocular depth estimation. Its core principle is to leverage the rich visual knowledge stored in modern generative image models. Our model, derived from Stable Diffusion and fine-tuned with synthetic data, can zero-shot transfer to unseen data, offering state-of-the-art monocular depth estimation results.


📢 News

2024-05-28: Training code is released.<br>
2024-03-23: Added LCM v1.0 for faster inference - try it out at <a href="https://huggingface.co/spaces/prs-eth/marigold-lcm"><img src="https://img.shields.io/badge/🤗%20Hugging%20Face%20(LCM)-Space-yellow" height="16"></a><br>
2024-03-04: Accepted to CVPR 2024.<br>
2023-12-22: Contributed to Diffusers community pipeline.<br>
2023-12-19: Updated license to Apache License, Version 2.0.<br>
2023-12-08: Added <a href="https://huggingface.co/spaces/toshas/marigold"><img src="https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow" height="16"></a> - try it out with your images for free!<br>
2023-12-05: Added <a href="https://colab.research.google.com/drive/12G8reD13DdpMie5ZQlaFNo2WCGeNUH-u?usp=sharing"><img src="doc/badges/badge-colab.svg" height="16"></a> - dive deeper into our inference pipeline!<br>
2023-12-04: Added <a href="https://arxiv.org/abs/2312.02145"><img src="https://img.shields.io/badge/arXiv-PDF-b31b1b" height="16"></a> paper and inference code (this repository).

🚀 Usage

We offer several ways to interact with Marigold:

  1. We integrated Marigold Pipelines into diffusers 🧨. Check out many exciting usage scenarios in this diffusers tutorial; a minimal Python sketch is also given after this list.

  2. A free online interactive demo is available here: <a href="https://huggingface.co/spaces/prs-eth/marigold-lcm"><img src="https://img.shields.io/badge/🤗%20Hugging%20Face%20(LCM)-Space-yellow" height="16"></a> (kudos to the HF team for the GPU grant)

  3. Run the demo locally (requires a GPU and nvidia-docker2; see the Installation Guide):

    1. Paper version: docker run -it -p 7860:7860 --platform=linux/amd64 --gpus all registry.hf.space/toshas-marigold:latest python app.py
    2. LCM version: docker run -it -p 7860:7860 --platform=linux/amd64 --gpus all registry.hf.space/prs-eth-marigold-lcm:latest python app.py
  4. Extended demo on a Google Colab: <a href="https://colab.research.google.com/drive/12G8reD13DdpMie5ZQlaFNo2WCGeNUH-u?usp=sharing"><img src="doc/badges/badge-colab.svg" height="16"></a>

  5. If you just want to see the examples, visit our gallery: <a href="https://marigoldmonodepth.github.io"><img src="doc/badges/badge-website.svg" height="16"></a>

  6. Finally, local development instructions with this codebase are given below.
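
For option 1, the diffusers integration can be driven directly from Python. The following is only a minimal sketch: the checkpoint id and image filename are illustrative, and the authoritative pipeline class, model ids, and arguments are documented in the diffusers tutorial linked above.

    # Minimal sketch of depth prediction via the diffusers integration.
    # Checkpoint id and image path are illustrative; consult the diffusers tutorial for exact usage.
    import torch
    import diffusers

    pipe = diffusers.MarigoldDepthPipeline.from_pretrained(
        "prs-eth/marigold-lcm-v1-0", torch_dtype=torch.float16
    ).to("cuda")

    image = diffusers.utils.load_image("input/in-the-wild_example/example_0.jpg")
    output = pipe(image)

    # Colorize the predicted depth map and save it.
    colored = pipe.image_processor.visualize_depth(output.prediction)
    colored[0].save("example_0_depth_colored.png")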

🛠️ Setup

The inference code was tested on:

🪧 A Note for Windows users

We recommend running the code in WSL2:

  1. Install WSL following the installation guide.
  2. Install CUDA support for WSL following the installation guide.
  3. Find your drives in /mnt/<drive letter>/; check the WSL FAQ for more details. Navigate to the working directory of choice.

📦 Repository

Clone the repository (requires git):

git clone https://github.com/prs-eth/Marigold.git
cd Marigold

💻 Dependencies

We provide several ways to install the dependencies.

  1. Using Mamba, which can be installed together with Miniforge3.

    Windows users: Install the Linux version into the WSL.

    After the installation, Miniforge needs to be activated first: source /home/$USER/miniforge3/bin/activate.

    Create the environment and install dependencies into it:

    mamba env create -n marigold --file environment.yaml
    conda activate marigold
    
  2. Using pip: Alternatively, create a Python native virtual environment and install dependencies into it:

    python -m venv venv/marigold
    source venv/marigold/bin/activate
    pip install -r requirements.txt
    

Keep the environment activated before running the inference script. Activate the environment again after restarting the terminal session.

๐Ÿƒ Testing on your images

📷 Prepare images

  1. Use selected images from our paper:

    bash script/download_sample_data.sh
    
  2. Or place your images in a directory, for example, under input/in-the-wild_example, and run the following inference command.

🚀 Run inference with LCM (faster)

The LCM checkpoint is distilled from our original checkpoint for faster inference (by reducing the number of denoising steps). Inference can use as few as 1 step (the default) and up to 4 steps. Run with the default LCM setting:

python run.py \
    --input_rgb_dir input/in-the-wild_example \
    --output_dir output/in-the-wild_example_lcm

🎮 Run inference with DDIM (paper setting)

This setting corresponds to our paper; please use it for academic comparisons.

python run.py \
    --checkpoint prs-eth/marigold-v1-0 \
    --denoise_steps 50 \
    --ensemble_size 10 \
    --input_rgb_dir input/in-the-wild_example \
    --output_dir output/in-the-wild_example

You can find all results in output/in-the-wild_example. Enjoy!

⚙️ Inference settings

The default settings are optimized for the best result. However, the behavior of the code can be customized:

⬇ Checkpoint cache

By default, the checkpoint is stored in the Hugging Face cache. The HF_HOME environment variable defines its location and can be overridden, e.g.:

export HF_HOME=$(pwd)/cache

Alternatively, use the following script to download the checkpoint weights locally:

bash script/download_weights.sh marigold-v1-0
# or LCM checkpoint
bash script/download_weights.sh marigold-lcm-v1-0

At inference, specify the checkpoint path:

python run.py \
    --checkpoint checkpoint/marigold-v1-0 \
    --denoise_steps 50 \
    --ensemble_size 10 \
    --input_rgb_dir input/in-the-wild_example \
    --output_dir output/in-the-wild_example

🦿 Evaluation on test datasets <a name="evaluation"></a>

Install additional dependencies:

pip install -r requirements+.txt -r requirements.txt

Set data directory variable (also needed in evaluation scripts) and download evaluation datasets into corresponding subfolders:

export BASE_DATA_DIR=<YOUR_DATA_DIR>  # Set target data directory

wget -r -np -nH --cut-dirs=4 -R "index.html*" -P ${BASE_DATA_DIR} https://share.phys.ethz.ch/~pf/bingkedata/marigold/evaluation_dataset/

Run inference and evaluation scripts, for example:

# Run inference
bash script/eval/11_infer_nyu.sh

# Evaluate predictions
bash script/eval/12_eval_nyu.sh

Note: although the seed has been set, the results might still be slightly different on different hardware.

๐Ÿ‹๏ธ Training

Based on the previously created environment, install extended requirements:

pip install -r requirements++.txt -r requirements+.txt -r requirements.txt

Set environment parameters for the data directory:

export BASE_DATA_DIR=YOUR_DATA_DIR  # directory of training data
export BASE_CKPT_DIR=YOUR_CHECKPOINT_DIR  # directory of pretrained checkpoint

Download the Stable Diffusion v2 checkpoint into ${BASE_CKPT_DIR}.
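
If you prefer to fetch the checkpoint programmatically, one possible sketch using huggingface_hub is shown below. The target subfolder name "stable-diffusion-2" is an assumption here; match it to whatever your training config expects.

    # Hedged sketch: download Stable Diffusion v2 into ${BASE_CKPT_DIR}.
    # The subfolder name "stable-diffusion-2" is an assumption; align it with the training config.
    import os
    from huggingface_hub import snapshot_download

    base_ckpt_dir = os.environ["BASE_CKPT_DIR"]
    snapshot_download(
        repo_id="stabilityai/stable-diffusion-2",
        local_dir=os.path.join(base_ckpt_dir, "stable-diffusion-2"),
    )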

Prepare the Hypersim and Virtual KITTI 2 datasets and save them into ${BASE_DATA_DIR}. Please refer to this README for Hypersim preprocessing.

Run the training script:

python train.py --config config/train_marigold.yaml

Resume from a checkpoint, e.g.

python train.py --resume_run output/marigold_base/checkpoint/latest

Evaluating results

Only the U-Net is updated and saved during training. To use the inference pipeline with your training result, replace the unet folder in the Marigold checkpoint with the one from your training output folder. Then refer to this section for evaluation.
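
For example, assuming the training run saves a unet subfolder under output/marigold_base/checkpoint/latest (this layout is an assumption), the swap could look like the following sketch:

    # Hedged sketch: overwrite the unet folder of a locally downloaded Marigold checkpoint
    # with the U-Net produced by training. Both paths are assumptions; adjust to your run.
    import shutil
    from pathlib import Path

    trained_unet = Path("output/marigold_base/checkpoint/latest/unet")  # training output (assumed layout)
    marigold_ckpt = Path("checkpoint/marigold-v1-0")  # local copy from script/download_weights.sh

    shutil.rmtree(marigold_ckpt / "unet")
    shutil.copytree(trained_unet, marigold_ckpt / "unet")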

Note: Although random seeds have been set, the training results might still differ slightly on different hardware. We recommend training without interruption.

✏️ Contributing

Please refer to these instructions.

🤔 Troubleshooting

| Problem | Solution |
| --- | --- |
| (Windows) Invalid DOS bash script on WSL | Run dos2unix <script_name> to convert the script format |
| (Windows) error on WSL: Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory | Run export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH |

🎓 Citation

Please cite our paper:

@InProceedings{ke2023repurposing,
      title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},
      author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      year={2024}
}

🎫 License

This work is licensed under the Apache License, Version 2.0 (as defined in the LICENSE).

By downloading and using the code and model you agree to the terms in the LICENSE.
