<div align="center">

<h3>[CVPR2024] 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model</h3>

Qian Wang, Weiqi Li, Chong Mou, Xinhua Cheng, Jian Zhang
School of Electronic and Computer Engineering, Peking University
This repository is the official implementation of 360DVD, a panorama video generation pipeline driven by given prompts and motion conditions. The main idea is to turn a T2V model into a panoramic T2V model through a 360-Adapter and 360 Enhancement Techniques.
</div>

## Gallery
Below we showcase regular videos generated by AnimateDiff alongside panoramic videos generated by 360DVD.
More results can be found on our Project Page.
<table>
  <tr>
    <td><img src="__assets__/videos/1.gif" alt="AnimateDiff"></td>
    <td><img src="__assets__/videos/1_1.gif" alt="Ours"></td>
    <td><img src="__assets__/videos/2.gif" alt="AnimateDiff"></td>
    <td><img src="__assets__/videos/2_1.gif" alt="Ours"></td>
  </tr>
  <tr>
    <td colspan="2"><center>"the top of a snow covered mountain range, with the sun shining over it"</center></td>
    <td colspan="2"><center>"a view of fireworks exploding in the night sky over a city, as seen from a plane"</center></td>
  </tr>
  <tr>
    <td><img src="__assets__/videos/3.gif" alt="AnimateDiff"></td>
    <td><img src="__assets__/videos/3_1.gif" alt="Ours"></td>
    <td><img src="__assets__/videos/4.gif" alt="AnimateDiff"></td>
    <td><img src="__assets__/videos/4_1.gif" alt="Ours"></td>
  </tr>
  <tr>
    <td colspan="2"><center>"a desert with sand dunes, blue cloudy sky"</center></td>
    <td colspan="2"><center>"the city under cloudy sky, a car driving down the street with buildings"</center></td>
  </tr>
  <tr>
    <td><img src="__assets__/videos/5.gif" alt="AnimateDiff"></td>
    <td><img src="__assets__/videos/5_1.gif" alt="Ours"></td>
    <td><img src="__assets__/videos/6.gif" alt="AnimateDiff"></td>
    <td><img src="__assets__/videos/6_1.gif" alt="Ours"></td>
  </tr>
  <tr>
    <td colspan="2"><center>"a large mountain lake, the lake surrounded by hills and mountains"</center></td>
    <td colspan="2"><center>"a volcano with smoke coming out, mountains under clouds, at sunset"</center></td>
  </tr>
</table>

Model: Realistic Vision V5.1
## To Do List
- Release gradio demo
- Release weights
- Release code
- Release paper
- Release dataset
## Steps for Inference

### Prepare Environment
```bash
git clone https://github.com/Akaneqwq/360DVD.git
cd 360DVD

conda env create -f environment.yaml
conda activate 360dvd
```
### Download Pretrained Models

```bash
git lfs install
mkdir -p ckpts/StableDiffusion/
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 ckpts/StableDiffusion/stable-diffusion-v1-5/

bash download_bashscripts/0-MotionModule.sh
bash download_bashscripts/1-360Adapter.sh
bash download_bashscripts/2-RealisticVision.sh
```
### Generate Panorama Videos

```bash
python -m scripts.animate --config configs/prompts/0-realisticVision.yaml
```
You can also write your own config, point the command above at its path, and run it again; a hedged sketch is given below. We strongly recommend using a personalized T2I model, such as Realistic Vision or Lyriel, for better performance.
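The easiest starting point is to copy `configs/prompts/0-realisticVision.yaml` and edit it. The minimal sketch below only illustrates the general shape of an AnimateDiff-style prompt config; the key names and paths are assumptions, so check the shipped config for the exact fields `scripts.animate` expects.

```yaml
# Sketch of a custom prompt config (key names are assumptions based on
# AnimateDiff-style configs; verify against configs/prompts/0-realisticVision.yaml).
MyScene:
  motion_module: "ckpts/Motion_Module/mm_sd_v15.ckpt"                    # hypothetical path
  dreambooth_path: "ckpts/DreamBooth_LoRA/realisticVision.safetensors"   # hypothetical path
  seed: [42]
  steps: 25
  guidance_scale: 7.5
  prompt:
    - "a large mountain lake, the lake surrounded by hills and mountains"
  n_prompt:
    - "blurry, low quality, distorted"
```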
## Steps for Training

### Prepare Dataset
You can directly download the WEB360 dataset:
```bash
bash download_bashscripts/4-WEB360.sh
unzip /datasets/WEB360.zip -d /datasets
```
Alternatively, prepare your own dataset consisting of panoramic video clips.

You can use a single BLIP model to caption your videos. For more fine-grained captions, modify the code provided in `dvd360/utils/erp2pers.py` and `dvd360/utils/360TextFusion.py` to run the 360 Text Fusion process; a minimal single-BLIP sketch is shown below.
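For the single-BLIP option, the sketch below captions the middle frame of each clip using BLIP through Hugging Face `transformers`. This is only a minimal illustration under that assumption: the folder path is hypothetical, and it does not reproduce the repo's 360 Text Fusion process.

```python
# Minimal single-view captioning sketch: grab the middle frame of each clip
# and caption it with BLIP via Hugging Face transformers. This is an assumed
# setup, not the repo's 360 Text Fusion pipeline in dvd360/utils/.
import glob
import cv2
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

for path in glob.glob("/datasets/my_pano_clips/*.mp4"):   # hypothetical folder
    cap = cv2.VideoCapture(path)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.set(cv2.CAP_PROP_POS_FRAMES, n_frames // 2)        # jump to the middle frame
    ok, frame = cap.read()
    cap.release()
    if not ok:
        continue
    image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    inputs = processor(images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=40)
    caption = processor.decode(out[0], skip_special_tokens=True)
    print(path, "->", caption)
```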
### Extract Motion Information
Download the pretrained PanoFlow model `PanoFlow(RAFT)-wo-CFE.pth` from Weiyun, put it in the `PanoFlowAPI/ckpt/` folder, and rename it to `PanoFlow-RAFT-wo-CFE.pth`.
Update `scripts/video2flow.py`:

```python
gpus_list = [Replace with available GPUs]
train_video_dir = [Replace with the folder path of panoramic videos]
flow_train_video_dir = [Replace with the folder path you want to save flow videos]
```
Then run the command below to obtain the corresponding flow videos:
```bash
python -m scripts.video2flow
```
### Configuration
Update the data paths in the config `.yaml` files in the `configs/training/` folder:
```yaml
train_data:
  csv_path:     [Replace with .csv Annotation File Path]
  video_folder: [Replace with Video Folder Path]
  flow_folder:  [Replace with Flow Folder Path]
```
Other training parameters (lr, epochs, validation settings, etc.) are also included in the config files.
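For reference, a filled-in data block might look like the sketch below; the paths are hypothetical examples (e.g. a WEB360 download under `/datasets`), not files guaranteed to exist on your machine.

```yaml
# Hypothetical example values; adjust to wherever your data actually lives.
train_data:
  csv_path:     "/datasets/WEB360/caption.csv"
  video_folder: "/datasets/WEB360/videos"
  flow_folder:  "/datasets/WEB360/flow_videos"
```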
### Training

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nnodes=1 --nproc_per_node=1 train.py --config configs/training/training.yaml
```
## Contact Us
Qian Wang: qianwang@stu.pku.edu.cn
## Acknowledgements
Our codebase is built upon AnimateDiff, T2I-Adapter, and PanoFlow.
## BibTeX

```bibtex
@article{wang2024360dvd,
  title={360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model},
  author={Qian Wang and Weiqi Li and Chong Mou and Xinhua Cheng and Jian Zhang},
  journal={arXiv preprint arXiv:2401.06578},
  year={2024}
}
```