GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

[paper][project page][code]

Xinjie Zhang*, Xingtong Ge*, Tongda Xu, Dailan He, Yan Wang, Hongwei Qin, Guo Lu, Jing Geng📧, Jun Zhang📧

(* denotes equal contribution, 📧 denotes corresponding author.)

This is the official implementation of our paper GaussianImage, a groundbreaking paradigm of image representation and compression by 2D Gaussian Splatting. With a compact 2D Gaussian representation and a novel rasterization method, our approach achieves high representation performance with short training time, minimal GPU memory overhead and ultra-fast rendering speed. Furthermore, we integrate an existing vector quantization technique to build a low-complexity neural image codec. Remarkably, the decoding speed of our codec reaches around 2000 FPS, outpacing traditional codecs like JPEG, while also providing enhanced compression performance at lower bitrates. This establishes a significant advancement in the field of neural image codecs. More qualitative results can be found in our paper.

<div align="center"> <img src="./img/kodak_representation.png" alt="kodak_fitting" width="320" /> <img src="./img/div2k_representation.png" alt="div2k_fitting" width="320" /> </div> <div align="center"> <img src="./img/kodak_codec.png" alt="kodak_codec" width="320" /> <img src="./img/div2k_codec.png" alt="div2k_codec" width="320" /> </div> <div align=center> <img src="./img/visual.png" alt="visual" width="640" /> </div>

Overview

Implicit neural representations (INRs) have recently achieved great success in image representation and compression, offering high visual quality and fast rendering speeds of 10-1000 FPS, assuming sufficient GPU resources are available. However, this requirement often hinders their use on low-end devices with limited memory. In response, we propose a groundbreaking paradigm of image representation and compression by 2D Gaussian Splatting, named GaussianImage. We first represent the image with 2D Gaussians, where each Gaussian has 8 parameters covering position, covariance and color. We then introduce a novel rendering algorithm based on accumulated summation. Remarkably, with at least $3\times$ lower GPU memory usage and $5\times$ faster fitting time, our method not only rivals INRs (e.g., WIRE, I-NGP) in representation performance, but also delivers a faster rendering speed of 1500-2000 FPS regardless of parameter size. Furthermore, we integrate an existing vector quantization technique to build an image codec. Experimental results demonstrate that our codec attains rate-distortion performance comparable to compression-based INRs such as COIN and COIN++, while reaching decoding speeds of approximately 2000 FPS. Additionally, a preliminary proof of concept shows that our codec surpasses COIN and COIN++ in performance when using partial bits-back coding.
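To make the accumulated-summation idea concrete, here is a minimal pure-Python sketch. It is not the CUDA rasterizer shipped in this repository. The parameter layout (2 position values, 3 Cholesky-style covariance values, 3 color values), the pixel-center convention, and the function name `render` are all assumptions chosen for illustration.

```python
import math

# Assumed layout of one Gaussian (8 floats, illustrative only):
# (mx, my, l1, l2, l3, r, g, b)
#   mx, my     : center position in pixel coordinates
#   l1, l2, l3 : entries of a lower-triangular factor L = [[l1, 0], [l2, l3]],
#                so the covariance is Sigma = L @ L.T
#   r, g, b    : color coefficients


def render(gaussians, height, width):
    """Render an RGB image as the plain sum of every Gaussian's contribution.

    No depth sorting or alpha blending is involved, so the result does not
    depend on the order of the Gaussians.
    """
    image = [[[0.0, 0.0, 0.0] for _ in range(width)] for _ in range(height)]
    for mx, my, l1, l2, l3, r, g, b in gaussians:
        # Sigma = [[s_xx, s_xy], [s_xy, s_yy]] built from the factor L.
        s_xx = l1 * l1
        s_xy = l1 * l2
        s_yy = l2 * l2 + l3 * l3
        det = s_xx * s_yy - s_xy * s_xy  # equals (l1 * l3) ** 2
        for y in range(height):
            for x in range(width):
                # Offset from the Gaussian center to the pixel center.
                dx = x + 0.5 - mx
                dy = y + 0.5 - my
                # Quadratic form d^T Sigma^{-1} d for a 2x2 covariance.
                q = (s_yy * dx * dx - 2.0 * s_xy * dx * dy + s_xx * dy * dy) / det
                weight = math.exp(-0.5 * q)
                pixel = image[y][x]
                pixel[0] += r * weight
                pixel[1] += g * weight
                pixel[2] += b * weight
    return image


if __name__ == "__main__":
    demo = [
        (1.5, 1.5, 1.0, 0.0, 1.0, 1.0, 0.5, 0.25),
        (0.5, 2.5, 2.0, 0.3, 1.5, 0.2, 0.2, 0.2),
    ]
    for row in render(demo, 3, 3):
        print([[round(channel, 3) for channel in pixel] for pixel in row])
```

Because each pixel is a plain sum, reversing the list of Gaussians gives the same image up to floating-point rounding. The sketch does not clamp the output to a valid color range.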

Quick Start

Cloning the Repository

The repository contains submodules, so clone it recursively with

# SSH
git clone git@github.com:Xinjie-Q/GaussianImage.git --recursive

or

# HTTPS
git clone https://github.com/Xinjie-Q/GaussianImage.git --recursive

After cloning the repository, you can follow these steps to train GaussianImage models under different tasks.

Requirements

cd gsplat
pip install .[dev]
cd ../
pip install -r requirements.txt

If you encounter errors while installing the packages listed in requirements.txt, you can try installing each Python package individually using the pip command.

Before training, you need to download the Kodak and DIV2K-validation datasets. The dataset folder is organized as follows.

├── dataset
│   | kodak
│     ├── kodim01.png
│     ├── kodim02.png
│     ├── ...
│   | DIV2K_valid_LR_bicubic
│     ├── X2
│        ├── 0801x2.png
│        ├── 0802x2.png
│        ├── ...
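A quick way to confirm the layout above before launching a run is to count the PNG files in each expected folder. The helper below is a hypothetical convenience, not part of the repository. It assumes you pass the folder that directly contains `kodak` and `DIV2K_valid_LR_bicubic`; how the training scripts interpret their own path argument is not covered here.

```python
from pathlib import Path

# Sub-folders taken from the layout shown above.
EXPECTED_SUBDIRS = {
    "kodak": ("kodak",),
    "DIV2K_valid_LR_bicubic/X2": ("DIV2K_valid_LR_bicubic", "X2"),
}


def count_pngs(dataset_root):
    """Return {sub-folder: number of .png files}, or None if the folder is missing."""
    counts = {}
    for label, parts in EXPECTED_SUBDIRS.items():
        folder = Path(dataset_root, *parts)
        if folder.is_dir():
            counts[label] = len(list(folder.glob("*.png")))
        else:
            counts[label] = None
    return counts


if __name__ == "__main__":
    # Placeholder path; replace it with your own dataset folder.
    for label, count in count_pngs("/path/to/your/dataset").items():
        status = "missing" if count is None else f"{count} PNG files"
        print(f"{label}: {status}")
```

A `None` entry means the folder was not found, which usually points to a wrong root path or an archive that was extracted one level too deep.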

Representation

sh ./scripts/gaussianimage_cholesky/kodak.sh /path/to/your/dataset
sh ./scripts/gaussianimage_rs/kodak.sh /path/to/your/dataset
sh ./scripts/3dgs/kodak.sh /path/to/your/dataset

sh ./scripts/gaussianimage_cholesky/div2k.sh /path/to/your/dataset
sh ./scripts/gaussianimage_rs/div2k.sh /path/to/your/dataset
sh ./scripts/3dgs/div2k.sh /path/to/your/dataset

Compression

After overfitting the image, we load the checkpoints from the image representation stage and apply quantization-aware training to obtain the image compression results of GaussianImage models.

sh ./scripts/gaussianimage_cholesky/kodak_comp.sh /path/to/your/dataset
sh ./scripts/gaussianimage_rs/kodak_comp.sh /path/to/your/dataset

sh ./scripts/gaussianimage_cholesky/div2k_comp.sh /path/to/your/dataset
sh ./scripts/gaussianimage_rs/div2k_comp.sh /path/to/your/dataset
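For readers new to quantization-aware training: the general idea is to pass parameters through a quantize-then-dequantize round trip during fine-tuning, so the model adapts to the rounding error it will see after compression. The sketch below shows only that round trip for a uniform quantizer. The bit width, the min/max range, and the name `fake_quantize` are assumptions for illustration. It does not reproduce the repository's actual quantization scheme or its gradient handling.

```python
def fake_quantize(values, num_bits):
    """Snap each value to the nearest of 2**num_bits evenly spaced levels.

    The levels span [min(values), max(values)]. The output stays in floating
    point, which is what a training forward pass would consume.
    """
    low, high = min(values), max(values)
    if high == low:
        # A constant input is already exactly representable.
        return list(values)
    steps = (1 << num_bits) - 1      # number of intervals between levels
    scale = (high - low) / steps     # spacing between adjacent levels
    return [low + round((v - low) / scale) * scale for v in values]


if __name__ == "__main__":
    params = [0.0, 0.1, 0.37, 0.9, 1.0]
    quantized = fake_quantize(params, num_bits=6)
    worst = max(abs(p - q) for p, q in zip(params, quantized))
    print(quantized)
    print("max rounding error:", worst)
```

With a uniform quantizer the rounding error is bounded by half the level spacing, so each extra bit roughly halves the worst-case error.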

Acknowledgments

Our code is built on gsplat, a concise and easily extensible Gaussian Splatting library.

We thank vector-quantize-pytorch for providing the framework to implement residual vector quantization.

Citation

If you find our GaussianImage paradigm useful or relevant to your research, please kindly cite our paper:

@inproceedings{zhang2024gaussianimage,
  title={GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting},
  author={Zhang, Xinjie and Ge, Xingtong and Xu, Tongda and He, Dailan and Wang, Yan and Qin, Hongwei and Lu, Guo and Geng, Jing and Zhang, Jun},
  booktitle={European Conference on Computer Vision},
  year={2024}
}