<h1 align="center"> VGPNN: <br> Diverse Generation from a Single Video Made Possible </h1> <div align="center"><font size="+1"><em>Accepted to ECCV 2022</em></font></div> <div align="center"> <a href="https://nivha.github.io/vgpnn">Project</a> | <a href="https://arxiv.org/abs/2109.08591">arXiv</a></div>

PyTorch implementation of the paper "Diverse Generation from a Single Video Made Possible".

## Code

### Data

You can download videos from this Dropbox Videos Folder into the `./data` folder.

Note that a video is represented as a directory of PNG files named `<frame number>.png`.

For example:

    some/path/my_video/
        1.png
        2.png
        3.png
        ...
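
To prepare your own footage in this format, one option is ffmpeg (not part of this repo; `my_video.mp4` and the output path below are placeholders):

```bash
# Split a video file into numbered frames 1.png, 2.png, ... as expected above
mkdir -p some/path/my_video
ffmpeg -i my_video.mp4 -start_number 1 "some/path/my_video/%d.png"
```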

### Video generation

To generate a new sample from a single video:

    python run_generation.py --gpu 0 --frames_dir <path to frames dir> --start_frame <number of first frame> --end_frame <number of last frame>

Examples:

    python run_generation.py --frames_dir=data/airballoons_QGAMTlI6XxY --start_frame=66 --end_frame=80
    python run_generation.py --frames_dir=data/airballoons_QGAMTlI6XxY --start_frame=66 --end_frame=165 --max_size=360 --sthw='(0.5,1,1)'
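
Assuming the generated result is saved as numbered PNG frames like the input (the results path and frame rate below are placeholders, not part of this repo's documented output), you could assemble it into a playable video with ffmpeg:

```bash
# Hypothetical output path: point the pattern at wherever run_generation.py wrote its frames
ffmpeg -framerate 10 -i "results/my_run/%d.png" -pix_fmt yuv420p my_sample.mp4
```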

### Video analogies

Please download the `raft-sintel.pth` model from RAFT (or directly from here) and place it at `./raft/models/raft-sintel.pth`.
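
If the direct link is unavailable, one option is to fetch the checkpoint through the RAFT repository's own download script (a sketch, assuming the upstream repo layout is unchanged):

```bash
# Download RAFT's pretrained checkpoints, then copy the Sintel model into place
git clone https://github.com/princeton-vl/RAFT.git /tmp/RAFT
(cd /tmp/RAFT && ./download_models.sh)  # fetches and unzips models/, including raft-sintel.pth
mkdir -p raft/models
cp /tmp/RAFT/models/raft-sintel.pth raft/models/raft-sintel.pth
```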

To compute a new video with the spatio-temporal layout of video A and the appearance of video B:

    python run_analogies.py --a_frames_dir <A frames dir> --b_frames_dir <B frames dir> --a_n_bins <A: number of dynamic bins> --b_n_bins <B: number of dynamic bins> --results_dir <results dir>

For example:

    python run_analogies.py --a_frames_dir data/waterfall_Qo3OM5sPUPM --b_frames_dir data/lava_m_e7jUfvt-I --a_n_bins 4 --b_n_bins 8 --results_dir results/wfll2lava

### Video Retargeting

Retargeting is similar to generation but with a different aspect ratio for the output, and without adding any noise.

    python run_generation.py --gpu 0 --frames_dir <path to frames dir> --start_frame <number of first frame> --end_frame <number of last frame> --use_noise False --sthw '(ST,SH,SW)'

Here `(ST,SH,SW)` are the desired scales for the temporal, height, and width dimensions, respectively. E.g., `(1,1,1)` leaves the result unchanged, whereas `(1,1,0.5)` generates a retargeted result with the same height and number of frames, but with half the width of the input (for instance, a 15-frame clip at 360×640 would come out as 15 frames at 360×320).

For example:

    python run_generation.py --gpu 0 --frames_dir data/airballoons_QGAMTlI6XxY --start_frame 66 --end_frame 80 --max_size 360 --use_noise False --min_size '(3,40)' --kernel_size '(3,7,7)' --downfactor '(0.87,0.82)' --sthw '(1,1,0.6)'

## Citation

If you find our project useful for your work, please cite:

    @inproceedings{haim2022diverse,
      title={Diverse generation from a single video made possible},
      author={Haim, Niv and Feinstein, Ben and Granot, Niv and Shocher, Assaf and Bagon, Shai and Dekel, Tali and Irani, Michal},
      booktitle={European Conference on Computer Vision},
      pages={491--509},
      year={2022},
      organization={Springer}
    }