Home

Awesome

Deep Learning in Latent Space for Video Prediction and Compression

Codes for Deep Learning in Latent Space for Video Prediction and Compression(CVPR 2021), a latent prediction based video compression algorithm.

Introduction and Framework

The proposed latent domain compression of individual frames is obtained by an auto-encoder DNN trained with a generative adversarial network (GAN) framework. To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame.

Flow chart

Architecture

The detailed neural network structure of our predictor model. Predictor architecture The detailed neural network structure of our decoder model. Decoder architecture

Datasets

In order to use the datasets used in the paper, please download the UVG dataset, the Kinetics dataset, the VTL dataset, and the UVG dataset.

ADMM quantization

To further reduce the bitrate of the compressed video, we applied ADMM quantization for the residual from latent prediction incorporated in the proposed video compression framework.

Arithmetic Coding

To use the entropy coding method in this paper, download the general code library in python with arithmetic coding.

Test pretrained model

To tested the result without ADMM quantization,

$ python test.py

To test the result with ADMM quantization

$ python Compression_ADMM.py

Citation

Please cite our paper if you find our paper useful for your research.