ColDBin: Cold Diffusion for Document Image Binarization

This repository contains the datasets and code for the paper ColDBin: Cold Diffusion for Document Image Binarization by Saifullah Saifullah, Stefan Agne, Andreas Dengel, and Sheraz Ahmed.

Requires Python 3+. For evaluation, please download the data from the links below.

Approach:

<img align="center" src="assets/approach.png">

Qualitative Results:

<img align="center" src="assets/qualitative.png">

Quantitative Results

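| Dataset | FM (%) | p-FM (%) | PSNR (dB) | DRD |
|---|---|---|---|---|
| DIBCO 2009 | 94.19 | 96.52 | 20.65 | 2.58 |
| DIBCO 2010 | 95.29 | 96.67 | 22.06 | 1.36 |
| DIBCO 2011 | 95.23 | 96.93 | 21.53 | 1.44 |
| DIBCO 2012 | 96.37 | 97.41 | 23.40 | 1.28 |
| DIBCO 2013 | 96.62 | 97.15 | 23.98 | 1.20 |
| DIBCO 2014 | 97.89 | 98.10 | 24.38 | 0.66 |
| DIBCO 2016 | 89.50 | 93.73 | 18.71 | 3.84 |
| DIBCO 2017 | 93.04 | 95.12 | 19.32 | 2.29 |
| DIBCO 2018 | 89.71 | 93.00 | 19.53 | 3.82 |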
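
The results above use the standard DIBCO metrics: F-measure (FM), pseudo F-measure (p-FM), peak signal-to-noise ratio (PSNR), and distance reciprocal distortion (DRD). As a rough illustration of how the two simplest ones are commonly defined, here is a minimal pure-Python sketch. It is not the evaluation code of this repository; the function names `f_measure` and `psnr`, and the convention that 1 marks a text pixel, are assumptions made for this example.

```python
import math


def f_measure(pred, gt):
    """F-measure in percent over text pixels (1 = text, 0 = background)."""
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p == 1 and g == 1:
                tp += 1  # text pixel correctly predicted
            elif p == 1 and g == 0:
                fp += 1  # background predicted as text
            elif p == 0 and g == 1:
                fn += 1  # text pixel missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


def psnr(pred, gt):
    """PSNR in dB for images with values in {0, 1}, so the peak difference is 1."""
    squared_error = 0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            squared_error += (p - g) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(1.0 / mse)


# Toy 2x4 example: one false positive and one missed text pixel.
gt = [[1, 1, 0, 0],
      [1, 1, 0, 0]]
pred = [[1, 1, 1, 0],
        [1, 0, 0, 0]]
print(f_measure(pred, gt), psnr(pred, gt))
```

Higher FM, p-FM, and PSNR are better, while lower DRD is better. p-FM and DRD additionally weight pixels by their position relative to the text strokes and are not covered by this sketch.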

Prepare DIBCO datasets

Download the datasets from the link: Then run the example dataset preparation script provided for the DIBCO 2013 dataset:

./scripts/prepare_dataset.sh

Train

Train a cold diffusion model using the example training script for the DIBCO 2013 dataset:

./scripts/train.sh
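
To give an intuition for what training "in a cold manner" could look like, the sketch below builds one training example with a deterministic degradation instead of Gaussian noise. It is only an illustration under stated assumptions, not the code run by `train.sh`: the linear blend between the clean binarized target and the degraded document image, the step count `T`, and the names `degrade` and `training_example` are all hypothetical choices for this example.

```python
import random

T = 10  # hypothetical number of diffusion steps


def degrade(clean, degraded, t, num_steps=T):
    """Deterministic forward step: blend the clean target toward the degraded input.

    t = 0 returns the clean image, t = num_steps returns the degraded image.
    """
    alpha = t / num_steps
    return [(1 - alpha) * c + alpha * d for c, d in zip(clean, degraded)]


def training_example(clean, degraded, rng, num_steps=T):
    """Sample a timestep and return (network input, timestep, regression target)."""
    t = rng.randint(1, num_steps)
    x_t = degrade(clean, degraded, t, num_steps)
    return x_t, t, clean


def l1_loss(prediction, target):
    """Mean absolute error between a predicted and a target image."""
    return sum(abs(p - q) for p, q in zip(prediction, target)) / len(target)


# Flattened toy "images": a clean binary target and its degraded scan.
clean = [0.0, 1.0, 1.0, 0.0]
degraded = [0.3, 0.6, 0.9, 0.2]

rng = random.Random(0)
x_t, t, target = training_example(clean, degraded, rng)
# A restoration network would be trained to map (x_t, t) back to target;
# here the loss of the trivial prediction "x_t itself" is shown instead.
print(t, l1_loss(x_t, target))
```

In an actual training loop, a neural network would take `x_t` and `t` as input, and its output would be compared with `target` by a loss such as the one above.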

Test:

Test the trained model using the example testing script for the DIBCO 2013 dataset:

./scripts/test.sh
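
At test time, a cold diffusion model is typically applied iteratively: starting from the degraded image, each step estimates the clean image and then re-degrades that estimate to the next lower level. The sketch below follows the improved sampling rule from the original Cold Diffusion work (x_{t-1} = x_t - D(x0_hat, t) + D(x0_hat, t-1)). It is not the code run by `test.sh`; the linear-blend degradation, the `restore` callback standing in for a trained network, and all function names are assumptions made for this example.

```python
def degrade(x0, degraded, t, num_steps):
    """Hypothetical degradation: linear blend from x0 (t = 0) to the degraded input (t = num_steps)."""
    alpha = t / num_steps
    return [(1 - alpha) * c + alpha * d for c, d in zip(x0, degraded)]


def sample(degraded, restore, num_steps):
    """Iteratively refine the degraded image toward a clean estimate.

    restore(x, t) stands in for a trained network that predicts the clean image.
    """
    x = list(degraded)
    for t in range(num_steps, 0, -1):
        x0_hat = restore(x, t)
        d_t = degrade(x0_hat, degraded, t, num_steps)
        d_prev = degrade(x0_hat, degraded, t - 1, num_steps)
        # Replace the level-t degradation of the estimate with its level-(t-1) version.
        x = [xi - a + b for xi, a, b in zip(x, d_t, d_prev)]
    return x


def binarize(x, threshold=0.5):
    """Threshold the continuous output into a binary image."""
    return [1 if v >= threshold else 0 for v in x]


# Toy check with an oracle "network" that always returns the true clean image.
clean = [0.0, 1.0, 1.0, 0.0]
degraded = [0.3, 0.6, 0.9, 0.2]


def oracle(x, t):
    return clean


result = sample(degraded, oracle, num_steps=10)
print(binarize(result))
```

With a perfect restoration function the loop recovers the clean image up to floating-point error; with a real network, the repeated re-degradation is what lets later steps correct errors made by earlier ones.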

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.