🔥 2024.4.28: Good news! The code and pre-trained model of DocScanner are now released!

🚀 Good news! The online demo for DocScanner is now live, allowing for easy image upload and correction.

🔥 Good news! Our new work, DocTr++: Deep Unrestricted Document Image Rectification, is out. It can rectify various distorted document images in the wild.

🔥 Good news! A comprehensive list of Awesome Document Image Rectification methods is available.

DocScanner

<p> <a href='https://drive.google.com/file/d/1mmCUj90rHyuO1SmpLt361youh-07Y0sD/view?usp=share_link' target="_blank"><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a> <a href='https://docai.doctrp.top:20443/' target="_blank"><img src='https://img.shields.io/badge/Online-Demo-green'></a> </p>

This is a PyTorch/GPU re-implementation of the paper DocScanner: Robust Document Image Rectification with Progressive Learning.

🚀 Demo (Link)

Note: The model version used in the demo corresponds to "DocScanner-L" as described in the paper.

  1. Upload the distorted document image to be rectified in the left box.
  2. Click the "Submit" button.
  3. The rectified image will be displayed in the right box.
<img width="1534" alt="image" src="https://github.com/fh2019ustc/DocScanner/assets/50725551/9eca3f7d-1570-4246-a3db-0a1cf1eece2d">

Examples

[Two example images]

Training

Inference

  1. Put the pre-trained DocScanner-L model in $ROOT/model_pretrained/.
  2. Put the distorted images in $ROOT/distorted/.
  3. Run the script below. The rectified images are saved in $ROOT/rectified/ by default.
    python inference.py
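
Before running the script, it can help to confirm that the directory layout from the steps above is in place. The following is a minimal sketch, not part of the repository: the helper name `check_layout` and the list of image extensions are assumptions made for illustration, and the actual `inference.py` may accept other formats or create the output directory itself.

```python
from pathlib import Path

# Assumed set of image extensions; inference.py may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def check_layout(root):
    """Return a list of problems with the expected $ROOT layout (hypothetical helper)."""
    root = Path(root)
    problems = []

    # Step 1: the pre-trained weights are expected under model_pretrained/.
    model_dir = root / "model_pretrained"
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        problems.append(f"no pre-trained model found in {model_dir}")

    # Step 2: the distorted input images are expected under distorted/.
    distorted_dir = root / "distorted"
    images = []
    if distorted_dir.is_dir():
        images = [p for p in distorted_dir.iterdir()
                  if p.suffix.lower() in IMAGE_SUFFIXES]
    if not images:
        problems.append(f"no input images found in {distorted_dir}")

    # Step 3: make sure the default output directory exists.
    (root / "rectified").mkdir(parents=True, exist_ok=True)
    return problems


if __name__ == "__main__":
    for problem in check_layout("."):
        print("WARNING:", problem)
```

Running this from `$ROOT` prints one warning per missing piece and prints nothing when both the model and the input images are present.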
    

Evaluation

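| Method | MS-SSIM | LD | Li-D | ED (Setting 1) | CER (Setting 1) | ED (Setting 2) | CER (Setting 2) | Params (M) |
|---|---|---|---|---|---|---|---|---|
| DocScanner-T | 0.5123 | 7.92 | 2.04 | 501.82 | 0.1823 | 809.46 | 0.2068 | 2.6 |
| DocScanner-B | 0.5134 | 7.62 | 1.88 | 434.11 | 0.1652 | 671.48 | 0.1789 | 5.2 |
| DocScanner-L | 0.5178 | 7.45 | 1.86 | 390.43 | 0.1486 | 632.34 | 0.1648 | 8.5 |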
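
The ED and CER columns are OCR-based metrics. Assuming they follow the usual definitions (ED is the Levenshtein edit distance between the text recognized from the rectified image and the reference text, and CER is that distance divided by the number of reference characters), they can be sketched as below. This is an illustration of the standard definitions, not the evaluation script used to produce the table. The function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between reference[:i-1] and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        cur = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cur.append(min(
                prev[j] + 1,                           # delete ref_char
                cur[j - 1] + 1,                        # insert hyp_char
                prev[j - 1] + (ref_char != hyp_char),  # substitute or match
            ))
        prev = cur
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(character_error_rate("abcd", "abxd"))
```

Lower values are better for both metrics. In a full evaluation, the two strings would come from running an OCR engine on the rectified image and on the ground-truth scan.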

Citation

If this work helps your research, please cite the related works in your publications:

@inproceedings{feng2021doctr,
  title={DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction},
  author={Feng, Hao and Wang, Yuechen and Zhou, Wengang and Deng, Jiajun and Li, Houqiang},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={273--281},
  year={2021}
}
@inproceedings{feng2022docgeonet,
  title={Geometric Representation Learning for Document Image Rectification},
  author={Feng, Hao and Zhou, Wengang and Deng, Jiajun and Wang, Yuechen and Li, Houqiang},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2022}
}
@article{feng2021docscanner,
  title={DocScanner: robust document image rectification with progressive learning},
  author={Feng, Hao and Zhou, Wengang and Deng, Jiajun and Tian, Qi and Li, Houqiang},
  journal={arXiv preprint arXiv:2110.14968},
  year={2021}
}

Acknowledgement

The code is largely based on DocUNet and DewarpNet. We thank the authors for their wonderful work.

Contact

For commercial use, please contact Professor Wengang Zhou (zhwg@ustc.edu.cn) and Hao Feng (haof@mail.ustc.edu.cn).