Stochastic Latent Residual Video Prediction (SRVP)
Official implementation of the paper Stochastic Latent Residual Video Prediction (Jean-Yves Franceschi,* Edouard Delasalles,* Mickael Chen, Sylvain Lamprier, Patrick Gallinari), accepted and presented at ICML 2020.
Article
Presentation
Preprint
Project Website
Pretrained Models
Requirements
All models were trained with Python 3.7.6 and PyTorch 1.4.0 using CUDA 10.1.
A list of required Python packages is available in the requirements.txt file.
To speed up training, we recommend activating mixed-precision training in the options. Its performance gains were tested on the most recent Nvidia GPU architectures (starting from Volta).
We used Nvidia's Apex (v0.1) in mixed-precision mode (O1) to produce the results reported in the paper.
We also integrated PyTorch's more recent mixed-precision training package (made available in PyTorch 1.6.0), which should give similar results.
This is, however, an experimental feature and we cannot guarantee that it achieves the same results as Apex.
Datasets
Stochastic Moving MNIST
During training, this dataset is generated on the fly.
In order to generate a consistent testing set in an .npz file, the following commands should be executed:
python -m preprocessing.mmnist.make_test_set --data_dir $DIR --seq_len 25
for the stochastic version of the dataset, or
python -m preprocessing.mmnist.make_test_set --data_dir $DIR --deterministic --seq_len 100
for the deterministic version, where $DIR is the directory where the testing set should be saved.
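An .npz file is a standard zip archive holding one .npy entry per stored array, so the generated testing set can be inspected without knowing the array names in advance. The following is a minimal sketch using only the Python standard library; the helper name list_npz_arrays is hypothetical and not part of this repository, and the path of the generated file is left to the caller.

```python
import zipfile


def list_npz_arrays(path):
    """Return the names of the arrays stored in an .npz archive.

    Hypothetical helper: it only reads the zip directory and does not
    load any array data.
    """
    suffix = ".npy"
    with zipfile.ZipFile(path) as archive:
        return [
            name[: -len(suffix)]
            for name in archive.namelist()
            if name.endswith(suffix)
        ]
```

Calling list_npz_arrays on the generated file returns the keys that numpy.load would expose for the same archive.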
KTH
To download the dataset at a given path $DIR, execute the following command:
bash preprocessing/kth/download.sh $DIR
(see also https://github.com/edenton/svg/blob/master/data/download_kth.sh from the official implementation of SVG).
In order to respectively train and test a model on this dataset, the following commands should be run:
python preprocessing/kth/convert.py --data_dir $DIR
and
python preprocessing/kth/make_test_set.py --data_dir $DIR
Human3.6M
This dataset can be downloaded at http://vision.imar.ro/human3.6m/description.php, after obtaining access from its owners.
Videos for every subject are included in .tgz archives. Each of these archives should be extracted in the same folder.
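As an illustration of the extraction step, the sketch below unpacks every .tgz archive found in a directory into that same directory. It assumes that the downloaded archives sit directly in the dataset directory; the function name extract_archives is hypothetical and not part of this repository.

```python
import tarfile
from pathlib import Path


def extract_archives(data_dir):
    """Extract every .tgz archive of data_dir into data_dir itself.

    Assumption: the archives are stored directly in data_dir.
    Only use this on archives obtained from a trusted source.
    Returns the names of the archives that were extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    for archive_path in sorted(data_dir.glob("*.tgz")):
        with tarfile.open(archive_path, "r:gz") as archive:
            archive.extractall(data_dir)
        extracted.append(archive_path.name)
    return extracted
```

Since all archives are unpacked in the same folder, that folder can then be passed as $DIR to the preprocessing scripts.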
To preprocess the dataset in order to use it for training and testing, these videos should be processed using the following command:
python preprocessing/human/convert.py --data_dir $DIR
where $DIR is the directory where Human3.6M videos are saved.
Finally, the testing set is created by choosing extracts from testing videos, with the following command:
python preprocessing/human/make_test_set.py --data_dir $DIR
All processed videos are saved in the same folder as the original dataset.
BAIR
To download the dataset at a given path $DIR, execute the following command:
bash preprocessing/bair/download.sh $DIR
(see also https://github.com/edenton/svg/blob/master/data/download_bair.sh from the official implementation of SVG).
In order to train and test a model on this dataset, the following command should be run:
python preprocessing/bair/convert.py --data_dir $DIR
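The preprocessing commands of this section can be gathered in a single helper. The sketch below only assembles the command lines listed above and does not run them; the function name preprocessing_commands and the key mmnist_deterministic are hypothetical and not part of this repository.

```python
def preprocessing_commands(dataset, data_dir):
    """Return, in order, the preprocessing commands listed above for a dataset.

    Each command is an argument list suitable for subprocess.run.
    The dictionary keys are illustrative names chosen for this sketch.
    """
    mmnist = ["python", "-m", "preprocessing.mmnist.make_test_set",
              "--data_dir", data_dir]
    steps = {
        "smmnist": [mmnist + ["--seq_len", "25"]],
        "mmnist_deterministic": [mmnist + ["--deterministic", "--seq_len", "100"]],
        "kth": [
            ["bash", "preprocessing/kth/download.sh", data_dir],
            ["python", "preprocessing/kth/convert.py", "--data_dir", data_dir],
            ["python", "preprocessing/kth/make_test_set.py", "--data_dir", data_dir],
        ],
        "human": [
            ["python", "preprocessing/human/convert.py", "--data_dir", data_dir],
            ["python", "preprocessing/human/make_test_set.py", "--data_dir", data_dir],
        ],
        "bair": [
            ["bash", "preprocessing/bair/download.sh", data_dir],
            ["python", "preprocessing/bair/convert.py", "--data_dir", data_dir],
        ],
    }
    return steps[dataset]
```

Each returned argument list can be passed to subprocess.run with check=True from the root of the repository, so that a failing step stops the sequence.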
Training
In order to launch training on multiple GPUs, run the following command:
OMP_NUM_THREADS=$NUMWORKERS python -m torch.distributed.launch --nproc_per_node=$NBDEVICES train.py --device $DEVICE1 $DEVICE2 --seed $SEED ...
followed by the training options, where $NBDEVICES is the number of GPUs to be used, $NUMWORKERS is the number of processes per GPU to use for data loading (should be equal to the value given to the option n_workers), $DEVICE1 $DEVICE2 ... is a list of GPU indices whose length is equal to $NBDEVICES, and $SEED is the chosen random seed.
Training can be accelerated using options --apex_amp or --torch_amp (see requirements).
The data directory ($DATA_DIR) and saving path ($SAVE_DIR) must be given using options --data_dir $DATA_DIR --save_path $SAVE_DIR.
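To make the relationship between these variables explicit, the sketch below assembles the launch command from a list of GPU indices, so that the number of processes always matches the number of devices. The function build_launch_command and its parameters are hypothetical and not part of this repository; only the flags quoted above are used, and any further training options (including the data loading workers option) are passed through unchanged.

```python
import shlex


def build_launch_command(devices, seed, n_workers, data_dir, save_dir,
                         amp=None, extra_options=""):
    """Assemble the environment and argument list of the multi-GPU launch.

    Hypothetical helper: amp is None, "apex" or "torch"; extra_options is a
    string of additional training options appended as given.
    """
    if amp not in (None, "apex", "torch"):
        raise ValueError("amp must be None, 'apex' or 'torch'")
    # $NUMWORKERS is exported through OMP_NUM_THREADS.
    env = {"OMP_NUM_THREADS": str(n_workers)}
    argv = [
        "python", "-m", "torch.distributed.launch",
        # $NBDEVICES is the number of listed GPU indices.
        f"--nproc_per_node={len(devices)}",
        "train.py",
        "--device", *[str(d) for d in devices],
        "--seed", str(seed),
        "--data_dir", data_dir,
        "--save_path", save_dir,
    ]
    if amp is not None:
        argv.append(f"--{amp}_amp")  # --apex_amp or --torch_amp
    argv += shlex.split(extra_options)
    return env, argv
```

The returned pair can be given to subprocess.run as the argument list and, merged with os.environ, as the env argument.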
Training parameters are given by the following options:
- for Stochastic Moving MNIST:
--ny 20 --nz 20 --beta_z 2 --nt_cond 5 --nt_inf 5 --dataset smmnist --nc 1 --seq_len 15
- for Deterministic Moving MNIST:
--ny 20 --nz 20 --beta_z 2 --nt_cond 5 --nt_inf 5 --dataset smmnist --deterministic --nc 1 --seq_len 15 --lr_scheduling_burnin 800000 --lr_scheduling_n_iter 100000
- for KTH:
--ny 50 --nz 50 --n_euler_steps 2 --res_gain 1.2 --archi vgg --skipco --nt_cond 10 --nt_inf 3 --obs_scale 0.2 --batch_size 100 --dataset kth --nc 1 --seq_len 20 --lr_scheduling_burnin 150000 --lr_scheduling_n_iter 50000 --val_interval 5000 --seq_len_test 30
- for Human3.6M:
--ny 50 --nz 50 --n_euler_steps 2 --res_gain 1.2 --archi vgg --skipco --nt_cond 8 --nt_inf 3 --obs_scale 0.2 --batch_size 100 --dataset human --nc 3 --seq_len 16 --lr_scheduling_burnin 325000 --lr_scheduling_n_iter 25000 --val_interval 20000 --batch_size_test 8 --seq_len_test 53
- for BAIR:
--ny 50 --nz 50 --n_euler_steps 2 --archi vgg --skipco --nt_cond 2 --nt_inf 2 --obs_scale 0.71 --batch_size 192 --dataset bair --nc 3 --seq_len 12 --lr_scheduling_burnin 1000000 --lr_scheduling_n_iter 500000
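Because these option lists are long, it can help to compare them programmatically. The sketch below parses an option string into a dictionary and reports what one configuration changes with respect to another, using the two Moving MNIST configurations above as an example. The helpers parse_options and changed_options are hypothetical and not part of this repository.

```python
import shlex


def parse_options(option_string):
    """Parse "--name value" options into a dict; flags without a value map to True."""
    tokens = shlex.split(option_string)
    options = {}
    i = 0
    while i < len(tokens):
        name = tokens[i].lstrip("-")
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
        options[name] = tokens[i + 1] if has_value else True
        i += 2 if has_value else 1
    return options


def changed_options(base, other):
    """Return the options of other that are absent from base or differ in value."""
    base_opts, other_opts = parse_options(base), parse_options(other)
    return {k: v for k, v in other_opts.items() if base_opts.get(k) != v}


# Option strings copied from the list above.
STOCHASTIC_MMNIST = (
    "--ny 20 --nz 20 --beta_z 2 --nt_cond 5 --nt_inf 5 "
    "--dataset smmnist --nc 1 --seq_len 15"
)
DETERMINISTIC_MMNIST = (
    "--ny 20 --nz 20 --beta_z 2 --nt_cond 5 --nt_inf 5 "
    "--dataset smmnist --deterministic --nc 1 --seq_len 15 "
    "--lr_scheduling_burnin 800000 --lr_scheduling_n_iter 100000"
)

print(changed_options(STOCHASTIC_MMNIST, DETERMINISTIC_MMNIST))
```

For these two strings, the deterministic configuration only adds the --deterministic flag and the two learning rate scheduling options.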
Please also refer to the help message of train.py:
python train.py --help
which lists all options and hyperparameters to train SRVP models.
Testing
To evaluate a trained model, the script test.py should be used as follows:
python test.py --data_dir $DATADIR --xp_dir $XPDIR --lpips_dir $LPIPSDIR
where $XPDIR is a directory containing a checkpoint and the corresponding json configuration file (see the pretrained models for an example), $DATADIR is the directory containing the test set, and $LPIPSDIR is a directory where v0.1 LPIPS weights (from the official repository of The Unreasonable Effectiveness of Deep Features as a Perceptual Metric) are downloaded.
To run the evaluation on GPU, use the option --device $DEVICE.
The model file name can be specified using the option --model_name $MODEL_NAME (for instance, to load the best models selected on the evaluation sets of KTH and Human3.6M: --model_name model_best.pt).
PSNR, SSIM and LPIPS results reported in the paper were obtained with the following options:
- for stochastic Moving MNIST:
python test.py --data_dir $DATADIR --xp_dir $XPDIR --lpips_dir $LPIPSDIR --nt_gen 25
- for deterministic Moving MNIST:
python test.py --data_dir $DATADIR --xp_dir $XPDIR --lpips_dir $LPIPSDIR --n_samples 1 --nt_gen 100
- for KTH:
python test.py --data_dir $DATADIR --xp_dir $XPDIR --lpips_dir $LPIPSDIR --nt_gen 40
- for Human3.6M:
python test.py --data_dir $DATADIR --xp_dir $XPDIR --lpips_dir $LPIPSDIR --nt_gen 53
- for BAIR:
python test.py --data_dir $DATADIR --xp_dir $XPDIR --lpips_dir $LPIPSDIR --nt_gen 30
Adding the option --fvd additionally computes FVD.
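The per-dataset evaluation commands differ only in a few options, which the sketch below captures in a lookup table. The function build_test_command, the table EVAL_OPTIONS and the key mmnist_deterministic are hypothetical and not part of this repository; the flags and values are the ones listed above.

```python
# Evaluation options listed above; the keys are illustrative names.
EVAL_OPTIONS = {
    "smmnist": ["--nt_gen", "25"],
    "mmnist_deterministic": ["--n_samples", "1", "--nt_gen", "100"],
    "kth": ["--nt_gen", "40"],
    "human": ["--nt_gen", "53"],
    "bair": ["--nt_gen", "30"],
}


def build_test_command(dataset, data_dir, xp_dir, lpips_dir,
                       fvd=False, device=None, model_name=None):
    """Assemble the test.py argument list for one dataset (hypothetical helper)."""
    argv = [
        "python", "test.py",
        "--data_dir", data_dir,
        "--xp_dir", xp_dir,
        "--lpips_dir", lpips_dir,
    ]
    argv += EVAL_OPTIONS[dataset]
    if fvd:
        argv.append("--fvd")  # additionally compute FVD
    if device is not None:
        argv += ["--device", str(device)]  # run the evaluation on GPU
    if model_name is not None:
        argv += ["--model_name", model_name]
    return argv
```

For example, build_test_command("kth", data_dir, xp_dir, lpips_dir, model_name="model_best.pt") yields the KTH command with the best model selected on the evaluation set.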
Please also refer to the help message of test.py:
python test.py --help
Troubleshooting
It has been reported that using Apex mixed-precision training in specific configurations may lead to excessive RAM usage due to this memory leak issue in Apex. We refer to the link above for solutions to this problem.
Please feel free to create an issue for any other problem that you might encounter using our code.