SwinT-ChARM (TensorFlow 2)


This repository provides a TensorFlow implementation of SwinT-ChARM based on Transformer-Based Transform Coding (Zhu, Yang, and Cohen; ICLR 2022):

[Figure: SwinT-ChARM network architecture (source: Zhu et al.)]

Updates

10/06/2023

  1. LIC-TCM (TensorFlow 2) is now available: https://github.com/Nikolai10/LIC-TCM (Liu et al. CVPR 2023 Highlight).

09/06/2023

  1. The high quality of this reimplementation has been confirmed in EGIC, Section A.8.

10/09/2022

  1. The number of model parameters now corresponds exactly to the reported number (32.6 million). We thank the authors for providing us with the official DeepSpeed log files.
  2. SwinT-ChARM now supports compression at different input resolutions (multiples of 256); see the padding sketch after this list.
  3. We release a pre-trained model as proof of functional correctness.
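Regarding (2): inputs whose height or width is not a multiple of 256 need to be padded before compression and cropped back afterwards. Below is a minimal sketch of such a helper (hypothetical; the function name and the reflect-padding choice are ours, not part of this repo):

```python
import tensorflow as tf

def pad_to_multiple(image, multiple=256):
    """Pads an HxWxC image so that height and width become multiples of `multiple`."""
    height, width = tf.shape(image)[0], tf.shape(image)[1]
    pad_h = (-height) % multiple  # 0 if the height is already aligned
    pad_w = (-width) % multiple
    padded = tf.pad(image, [[0, pad_h], [0, pad_w], [0, 0]], mode="REFLECT")
    return padded, (height, width)  # original size, for cropping after decompression

# Kodak images (768x512) are already multiples of 256 and pass through unchanged;
# after decompression, crop the reconstruction back via reconstruction[:h, :w, :].
padded, (h, w) = pad_to_multiple(tf.zeros((512, 768, 3)))
```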

08/17/2022

  1. Initial release of this project (see branch release_08/17/2022).

Acknowledgment

This project is based on:

  1. the official TensorFlow implementation of Minnen et al. (tensorflow/compression)
  2. swin-transformers-tf

Note that this repository builds upon the official TF implementation of Minnen et al., while Zhu et al. base their work on an unknown (possibly not publicly available) PyTorch reimplementation.

Examples

The samples below are taken from the Kodak dataset, external to the training set:

All reconstructions were produced with SwinT-ChARM (β = 0.0003):

| Original → Reconstruction | Mean squared error | PSNR (dB) | Multiscale SSIM | Multiscale SSIM (dB) | Bits per pixel |
| --- | --- | --- | --- | --- | --- |
| kodim22.png → kodim22_hat.png | 13.7772 | 36.74 | 0.9871 | 18.88 | 0.9890 |
| kodim23.png → kodim23_hat.png | 7.1963 | 39.56 | 0.9903 | 20.13 | 0.3953 |
| kodim15.png → kodim15_hat.png | 10.1494 | 38.07 | 0.9888 | 19.49 | 0.6525 |

More examples can be found here.
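For reference, the metrics above are linked by standard formulas: PSNR follows from the MSE (for 8-bit images), MS-SSIM (dB) is a log-scaled view of MS-SSIM, and bits per pixel is the bitstream length in bits divided by the number of pixels (768 × 512 for Kodak). A quick sanity check against the kodim22 numbers:

```python
import math

mse = 13.7772    # mean squared error (kodim22, 8-bit pixel values)
msssim = 0.9871  # multiscale SSIM (kodim22)

psnr_db = 10 * math.log10(255 ** 2 / mse)  # -> 36.74, as reported
msssim_db = -10 * math.log10(1 - msssim)   # -> 18.89 (reported: 18.88; rounding)

print(f"PSNR: {psnr_db:.2f} dB | MS-SSIM: {msssim_db:.2f} dB")
```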

Pretrained Models / Performance (TFC 2.8)

Our pre-trained model (β = 0.0003) achieves a PSNR of 37.59 dB at an average of 0.93 bpp on the Kodak dataset, which is very close to the reported numbers (see paper, Figure 3). Worth mentioning: we achieve this result despite training our model from scratch and using less than one-third of the computational resources (1M optimization steps).

| Lagrangian multiplier (β) | SavedModel | Training Instructions |
| --- | --- | --- |
| 0.0003 | download | <pre lang=bash>!python SwinT-ChARM/zyc2022.py -V --model_path <...> train --max_support_slices 10 --lambda 0.0003 --epochs 1000 --batchsize 16 --train_path <...></pre> |
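Presumably the SavedModel can be used like the models exported by the tensorflow_compression examples that zyc2022.py builds on (see Acknowledgment); the compress/decompress signatures and paths below are assumptions following that convention, not a documented interface of this repo:

```python
import tensorflow as tf

# ASSUMPTION: the SavedModel exposes compress/decompress signatures, as in the
# tensorflow_compression example models; paths are illustrative (cf. file structure).
model = tf.saved_model.load("res/zyc2022")

image = tf.image.decode_png(tf.io.read_file("kodim22.png"))  # HxWx3, uint8
tensors = model.compress(image)              # image -> entropy-coded tensors
reconstruction = model.decompress(*tensors)  # tensors -> reconstructed image
```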

File Structure

 res
     ├── doc/                                       # additional resources
     ├── eval/                                      # sample images + reconstructions
     ├── train_zyc2022/                             # model checkpoints + tf.summaries
     └── zyc2022/                                   # saved model
 swin-transformers-tf/                              # extended swin-transformers-tf implementation
     ├── changelog.txt                              # summary of changes made to the original work
     └── ...
 config.py                                          # model-dependent configurations
 zyc2022.py                                         # core of this repo

License

Apache License 2.0