Deep Learning in Latent Space for Video Prediction and Compression
Code for "Deep Learning in Latent Space for Video Prediction and Compression" (CVPR 2021), a latent-prediction-based video compression algorithm.
Introduction and Framework
The proposed latent domain compression of individual frames is obtained by an auto-encoder DNN trained with a generative adversarial network (GAN) framework. To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame.
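The snippet below is a minimal sketch, not the released code, of a ConvLSTM-based latent predictor in PyTorch; the class names, channel counts, and kernel sizes are illustrative assumptions.

```python
# Minimal sketch of a ConvLSTM latent predictor (illustrative, not the repo's model).
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Single ConvLSTM cell operating on latent feature maps."""
    def __init__(self, in_ch, hid_ch, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # One convolution produces all four gates (input, forget, output, candidate).
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel_size, padding=pad)
        self.hid_ch = hid_ch

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)
        h = o * torch.tanh(c)
        return h, c

class LatentPredictor(nn.Module):
    """Predicts the latent of the next frame from a sequence of past latents."""
    def __init__(self, latent_ch=64, hid_ch=128):
        super().__init__()
        self.cell = ConvLSTMCell(latent_ch, hid_ch)
        self.head = nn.Conv2d(hid_ch, latent_ch, kernel_size=3, padding=1)

    def forward(self, latents):  # latents: (batch, time, channels, H, W)
        b, t, ch, hgt, wdt = latents.shape
        h = latents.new_zeros(b, self.cell.hid_ch, hgt, wdt)
        c = latents.new_zeros(b, self.cell.hid_ch, hgt, wdt)
        for step in range(t):
            h, c = self.cell(latents[:, step], (h, c))
        return self.head(h)  # predicted latent of the next frame

if __name__ == "__main__":
    seq = torch.randn(2, 5, 64, 16, 16)   # 5 past latent frames
    pred = LatentPredictor()(seq)         # predicted 6th latent
    print(pred.shape)                     # torch.Size([2, 64, 16, 16])
```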
Architecture
The detailed neural network structure of our predictor model (figure).
The detailed neural network structure of our decoder model (figure).
Datasets
To use the datasets from the paper, please download the UVG dataset, the Kinetics dataset, and the VTL dataset.
- The UVG and Kinetics datasets are used for training the prediction network.
- The VTL and UVG datasets are used for testing the performance.
- Note that we use a learning-based image compression algorithm (Liu et al.) as the intra-frame compression for individual frames.
- The compressed latents are the input to the prediction network (see the sketch after this list).
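A minimal sketch, with hypothetical names, of how compressed latents could be assembled into training sequences for the prediction network. `IntraCodec` stands in for the learned image compression model and is not part of this repository's actual API.

```python
# Illustrative data pipeline: frames -> intra-coded latents -> predictor input/target.
import torch

class IntraCodec:
    """Placeholder for the learned intra-frame codec: frame -> latent (hypothetical)."""
    def encode(self, frame):  # frame: (3, H, W)
        # A real codec would return the quantized latent of the frame.
        return torch.randn(64, frame.shape[1] // 16, frame.shape[2] // 16)

def build_latent_sequence(frames, codec, seq_len=5):
    """Encode frames and stack consecutive latents as predictor inputs and target."""
    latents = torch.stack([codec.encode(f) for f in frames])  # (T, C, h, w)
    inputs = latents[:seq_len].unsqueeze(0)                   # (1, seq_len, C, h, w)
    target = latents[seq_len]                                 # latent to predict
    return inputs, target

if __name__ == "__main__":
    video = [torch.rand(3, 256, 256) for _ in range(6)]  # 6 RGB frames
    x, y = build_latent_sequence(video, IntraCodec())
    print(x.shape, y.shape)  # (1, 5, 64, 16, 16) and (64, 16, 16)
```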
ADMM quantization
To further reduce the bitrate of the compressed video, we apply ADMM quantization to the residual from the latent prediction within the proposed video compression framework.
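Below is a minimal sketch, not the paper's exact formulation, of ADMM-based quantization of the prediction residual onto a uniform grid; the grid spacing `step` and penalty weight `rho` are illustrative hyper-parameters.

```python
# Illustrative ADMM quantization of the latent prediction residual.
import numpy as np

def admm_quantize(residual, step=0.1, rho=1.0, iters=50):
    """Quantize `residual` to multiples of `step` via ADMM splitting.

    Solves  min_x 0.5 * ||x - residual||^2  s.t.  x lies on the grid,
    by alternating a least-squares update, a projection (rounding),
    and a dual update.
    """
    x = residual.copy()
    z = np.round(x / step) * step        # projection onto the grid
    u = np.zeros_like(x)                 # scaled dual variable
    for _ in range(iters):
        x = (residual + rho * (z - u)) / (1.0 + rho)  # least-squares update
        z = np.round((x + u) / step) * step           # project onto the grid
        u = u + x - z                                  # dual update
    return z                                           # quantized residual

if __name__ == "__main__":
    r = np.random.randn(4, 4).astype(np.float32)
    q = admm_quantize(r)
    print(np.abs(q - r).mean())  # average quantization error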
Arithmetic Coding
To use the entropy coding method in this paper, download a general-purpose Python arithmetic coding library.
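For reference, here is a minimal sketch, not the external library, that estimates the bitrate an arithmetic coder would approach by computing the ideal code length of the quantized residual symbols under their empirical distribution.

```python
# Illustrative bitrate estimate for the quantized residual symbols.
import numpy as np
from collections import Counter

def estimate_bits(symbols):
    """Ideal code length (in bits) for `symbols` under their empirical distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    probs = {s: c / total for s, c in counts.items()}
    return sum(-np.log2(probs[s]) for s in symbols)

if __name__ == "__main__":
    # Quantized residual indices (e.g. grid indices produced by ADMM quantization).
    q = np.random.randint(-2, 3, size=1000)
    bits = estimate_bits(q.tolist())
    print(f"~{bits / len(q):.2f} bits per symbol")
```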
Test pretrained model
To test the result without ADMM quantization, run:
$ python test.py
To test the result with ADMM quantization, run:
$ python Compression_ADMM.py
Citation
Please cite our paper if you find it useful for your research.