

Learned Focused Plenoptic Image Compression with Microimage Preprocessing and Global Attention

PyTorch implementation of the paper "Learned Focused Plenoptic Image Compression with Microimage Preprocessing and Global Attention", IEEE Transactions on Multimedia, 2023. This repository is based on CompressAI. We kept the scripts for training and evaluation and removed the other components. For the official CompressAI code release, see the CompressAI repository.

Related links



This repo provides the focused plenoptic image dataset "FPI2k" and the global-attention-based models for learned focused plenoptic image compression described in "Learned Focused Plenoptic Image Compression with Microimage Preprocessing and Global Attention".


FPI2k original images
Packaged FPI2k original images
FPI2k preprocessed images
Packaged FPI2k preprocessed images
TSPC white image
Training patches
Packaged training patches
Full-resolution test images
Model checkpoints


"FPI2k" is a focused plenoptic image dataset of 1910 images captured from real indoor and outdoor scenes with varying object depths. From a single plenoptic image, 5×5 sub-aperture images can be generated, with much larger disparities between one another. The images are manually annotated into 32 categories based on their content.

The original focused plenoptic images are available for download (https://pan.baidu.com/s/1CQ9hKhxY1z-sIHsqr00XXg?pwd=xya8, access code: xya8).

Data preprocessing

Inter-microimage pixels, incomplete microimages at the boundary, and vignetting pixels within the microimages are ineffective in light field applications such as refocusing and multi-view rendering. Based on this observation, a preprocessing scheme that is lossless for the sub-aperture images is proposed. It reshapes the sub-aperture effective pixels in each microimage and aligns the cropped microimages to a rectangular grid, which makes the data compatible with patch-based training and reduces pixel redundancy.
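The idea of cropping microimages and aligning them to a rectangular grid can be illustrated with a toy sketch. This is not the scheme implemented in Original2Preprocessed.m: the function name `crop_and_tile`, the square crop, the `half_size` parameter, and the assumption that the microimage centers are already known and arranged in rows and columns are all hypothetical choices made for illustration.

```python
def crop_and_tile(image, centers, half_size):
    """Crop a square around each microimage center and tile the crops on a grid.

    image:     2D list of pixel values (list of rows).
    centers:   2D list of (row, col) centers; centers[i][j] belongs to grid cell (i, j).
               The centers are assumed to be known in advance (hypothetical input).
    half_size: each crop is (2 * half_size + 1) pixels on a side.
    """
    side = 2 * half_size + 1
    grid_rows, grid_cols = len(centers), len(centers[0])
    out = [[0] * (grid_cols * side) for _ in range(grid_rows * side)]
    for i in range(grid_rows):
        for j in range(grid_cols):
            cy, cx = centers[i][j]
            for dy in range(side):
                for dx in range(side):
                    # Copy the pixel at offset (dy, dx) inside the crop window.
                    src_row = cy - half_size + dy
                    src_col = cx - half_size + dx
                    out[i * side + dy][j * side + dx] = image[src_row][src_col]
    return out
```

In this toy version, pixels that fall between the crop windows are simply dropped, which is where the reduction in pixel count comes from. The actual scheme additionally has to decide which pixels are effective for sub-aperture rendering and how to treat boundary and vignetted pixels.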

The preprocessed focused plenoptic images are available for download (https://pan.baidu.com/s/1omfnFkK_XQpBrJyE6epkXQ?pwd=2hw0).


The Rendering directory provides the code for rendering sub-aperture images from original or preprocessed focused plenoptic images.

Download the TSPC white image coefficients into ./Rendering/FocusedPlenopticRendering for devignetting.

Run ./Rendering/Original2SAI.m to render the sub-aperture images from the original focused plenoptic images.

Run ./Rendering/Original2Preprocessed.m to convert the original focused plenoptic images into preprocessed focused plenoptic images.

Run ./Rendering/Preprocessed2SAI.m to render the sub-aperture images from the preprocessed focused plenoptic images.

Global Attention Compression Network (GACN)


Install CompressAI and the packages required for development.

conda create -n FPIcompress python=3.9
conda activate FPIcompress
pip install compressai==1.1.5
pip install ptflops
pip install einops
pip install tensorboardX

Note: wheels are available for Linux and macOS.


Training dataset

The 75080 patches of preprocessed plenoptic images are available. The full-resolution test images are available.

Available model


An exemplary training script with a rate-distortion loss is provided in train.py.

Training a model:

python train.py -d ./dataset --model Proposed -e 50  -lr 1e-4 -n 8  --lambda 1e-1 --batch-size 4  --test-batch-size 4 --aux-learning-rate 1e-4 --patch-size 384 384 --cuda --save --seed 1926 --clip_max_norm 1.0 --gpu-id 1 --savepath  ./checkpoint/PLConvTrans01
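For orientation, the rate-distortion loss weighted by `--lambda` can be sketched as follows. The form below mirrors the loss used in the CompressAI example training script (distortion as MSE on pixel values in [0, 1], rescaled by 255², plus the estimated rate in bpp). Whether train.py in this repository uses exactly this weighting is an assumption, and the function name `rate_distortion_loss` is hypothetical.

```python
def rate_distortion_loss(bpp: float, mse: float, lmbda: float = 1e-1) -> float:
    """Combine rate and distortion into a single training objective.

    bpp:   estimated bitrate of the latents in bits per pixel.
    mse:   mean squared error between input and reconstruction, pixels in [0, 1].
    lmbda: trade-off weight; larger values favor lower distortion at higher rate.

    Assumption: same form as the CompressAI example trainer, not verified
    against this repository's train.py.
    """
    return lmbda * 255 ** 2 * mse + bpp
```

Under this form, training with a larger `--lambda` moves the model toward the high-rate, high-quality end of the RD curve, so one checkpoint is typically trained per lambda value.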


To evaluate a trained model, run the evaluation script:

python Inference.py --dataset /path/to/image/folder/ --output_path /path/to/reconstruction/folder/ -m Proposed -p ./updatedCheckpoint/PLConvTrans01.pth.tar --patch 384

For more details, refer to Inference.sh.

Run ./Rendering/SubapertureImageDistortion.m to render the sub-aperture images from the reconstructed focused plenoptic images and calculate the distortion.

Pretrained Models

Pretrained models (optimized for MSE) trained on focused plenoptic image patches are available.


All the checkpoints are available at: https://pan.baidu.com/s/1hsFpQic6bMRZFvcmbAN7-g?pwd=54rv (access code: 54rv)



The distortion and bpp are defined so that the different compression input formats can be compared fairly: original-plenoptic-image compression, preprocessed-plenoptic-image compression, and sub-aperture-image compression.

The distortion between the reference sub-aperture images and the reconstructed sub-aperture images is measured as the average PSNR or MS-SSIM over all sub-aperture images.
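A minimal sketch of the averaged PSNR metric is given below. It is not a port of SubapertureImageDistortion.m: the function names, the flat-sequence image representation, and the 8-bit peak value of 255 are assumptions made for illustration, and how the MATLAB script handles color channels is not covered here.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """PSNR in dB between two equally sized images given as flat pixel sequences."""
    if len(reference) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def average_psnr(reference_views, reconstructed_views, max_value=255.0):
    """Mean of the per-view PSNR values over all sub-aperture images."""
    if len(reference_views) != len(reconstructed_views):
        raise ValueError("both inputs must contain the same number of views")
    scores = [
        psnr(ref, rec, max_value)
        for ref, rec in zip(reference_views, reconstructed_views)
    ]
    return sum(scores) / len(scores)
```

Note that the PSNR is computed per view and then averaged, which is not the same as computing a single PSNR from the MSE pooled over all views.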

The bitrate is measured in bits per pixel (bpp), calculated by dividing the number of consumed bits by the total number of pixels of the original plenoptic image, which is 4080×3068.
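The bpp definition above amounts to the following small calculation. The function name `bits_per_pixel` and the choice of the bitstream size in bytes as input are assumptions for illustration; only the 4080×3068 denominator comes from the text.

```python
# Resolution of the original plenoptic image, as stated in the text.
ORIGINAL_WIDTH = 4080
ORIGINAL_HEIGHT = 3068


def bits_per_pixel(num_bytes: int) -> float:
    """Return the bpp of a bitstream, normalized by the original image size.

    num_bytes is the size of the compressed bitstream in bytes (hypothetical input).
    """
    total_bits = 8 * num_bytes
    return total_bits / (ORIGINAL_WIDTH * ORIGINAL_HEIGHT)
```

The denominator stays fixed at the original resolution even when the codec input is a preprocessed plenoptic image or a set of sub-aperture images, so the bitrates of the three input formats remain directly comparable.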



Visualization of the reconstructed central sub-aperture image of "Car".

RD curves

RD curves on I01 "Cars" (panels: original, preprocessing, reordering)

RD curves on I15 "Matryoshka_170" (panels: original, preprocessing, reordering)

Average RD curves on the 20 test images (panels: original, preprocessing, reordering)


If you have any problems, please contact me at tkd20@mails.tsinghua.edu.cn.

If you find this work useful for your research, please cite our TMM 2023 paper.

  author={Tong, Kedeng and Jin, Xin and Yang, Yuqing and Wang, Chen and Kang, Jinshi and Jiang, Fan},
  journal={IEEE Transactions on Multimedia}, 
  title={Learned Focused Plenoptic Image Compression with Microimage Preprocessing and Global Attention}, 