This is a simple text image super-resolution package.

This is a simple ESRGAN baseline trained on synthetic data from our CVPR paper MARCONet. The model is trained on Chinese and English characters. When the degradation is not severe, it may also perform well on other languages, such as Japanese.

<img src="GitImgs/Compare/r8.png" width="790px"/> <img src="GitImgs/Compare/r6.png" width="790px"/>

This package can post-process the text regions in the output of any blind image super-resolution (BSR) method with a single command:

textbsr -i [LR_TEXT_PATH] -b [BACKGROUND_SR_PATH]

Dependencies and Installation

# Install with pip
pip install textbsr

Basic Usage

# From the terminal
textbsr -i [LR_TEXT_PATH]

or

# From Python
from textbsr import textbsr
textbsr.bsr(input_path='./testsets/LQs')

Parameter details:

| parameter name | default | description |
| --- | --- | --- |
| -i, --input_path | | The LR text image path. It can contain full images or text layouts only. |
| -b, --bg_path | None | The background SR path from any BSR method (e.g., BSRGAN, Real-ESRGAN, StableSR). If None, we only restore the text regions detected by cnstd. |
| -o, --output_path | None | The save path for the text SR results. If None, we save the results next to the input, using the name [input_path]_TIMESTAMP. |
| -a, --aligned | False | action='store_true'. If True, each input text image contains only a single-line text region. If False, we use cnstd to detect text regions and then restore them. |
| -s, --save_text | False | action='store_true'. If True, save the LR and SR text layouts. |
| -d, --device | None | Device to use: 'gpu' or 'cpu'. If None, we use torch.cuda.is_available to select the device. |
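
To make the option semantics concrete, here is a minimal sketch of an argparse parser that mirrors the options listed above, together with the device fallback described for -d. It is an illustration, not the package's actual source: the function names build_parser and resolve_device are hypothetical, and treating -i as required is an assumption based on the table listing no default for it.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="textbsr")
    # Assumption: the input path is mandatory, since no default is documented.
    parser.add_argument("-i", "--input_path", required=True,
                        help="LR text image path (full images or text layouts)")
    parser.add_argument("-b", "--bg_path", default=None,
                        help="background SR path from any BSR method")
    parser.add_argument("-o", "--output_path", default=None,
                        help="save path for the text SR results")
    # store_true flags default to False and become True when present.
    parser.add_argument("-a", "--aligned", action="store_true",
                        help="inputs contain a single-line text region")
    parser.add_argument("-s", "--save_text", action="store_true",
                        help="also save the LR and SR text layouts")
    parser.add_argument("-d", "--device", default=None, choices=["gpu", "cpu"],
                        help="device to use; auto-selected when omitted")
    return parser


def resolve_device(device=None):
    """Return the requested device, or pick one when none was given."""
    if device is not None:
        return device
    try:
        import torch  # optional here so the sketch runs without PyTorch
    except ImportError:
        return "cpu"
    return "gpu" if torch.cuda.is_available() else "cpu"


args = build_parser().parse_args(["-i", "./testsets/LQs", "-s"])
print(args.input_path, args.bg_path, args.aligned, args.save_text)
# ./testsets/LQs None False True
print(resolve_device("cpu"))
# cpu
```

Because -a and -s are store_true flags, they take no value on the command line: passing the flag enables the behavior, and omitting it leaves it off.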

(1) Text Region Restoration

# From the terminal
textbsr -i [LR_TEXT_PATH]

or

# From Python
from textbsr import textbsr
textbsr.bsr(input_path='./testsets/LQs', save_text=True)

<img src="GitImgs/Compare/r9.png" height="395px"/> <img src="GitImgs/Compare/r2.png" height="395px"/> <img src="GitImgs/Compare/r3_3.png" height="318px"/> <img src="GitImgs/Compare/r4_2.png" height="318px"/> <img src="GitImgs/Compare/r11.png" width="790px"/>

(2) Post-process the Text Region from Any Blind Image Super-resolution (BSR) Methods

<img src="./GitImgs/Compare/postprocess_2.png" width="785px">

# From the terminal
textbsr -i [LR_TEXT_PATH] -b [AnyBSR_Results_PATH] -s

or

# From Python
from textbsr import textbsr
textbsr.bsr(input_path='./testsets/LQs', bg_path='./testsets/AnyBSRResults', save_text=True)

When [AnyBSR_Results_PATH] is None, we only restore the text region and paste it back to the LR input, with the background region unchanged.
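
The paste-back step can be pictured with a small, dependency-free sketch. This is only an illustration of the idea, not the package's implementation: the helper paste_region is hypothetical, images are modeled as nested lists of pixel values, and the restored patch is assumed to already be at the same resolution as the background. The real pipeline works on image arrays and may also rescale or blend the regions.

```python
def paste_region(background, patch, top, left):
    """Return a copy of `background` with `patch` pasted at (top, left).

    Pixels of the patch that fall outside the background are clipped,
    and the original background is left untouched.
    """
    out = [row[:] for row in background]  # copy each row so the input is unchanged
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            y, x = top + dy, left + dx
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = value
    return out


# A 4x6 "background" of zeros and a 2x3 "restored text" patch of nines.
background = [[0] * 6 for _ in range(4)]
restored_text = [[9, 9, 9], [9, 9, 9]]

result = paste_region(background, restored_text, top=1, left=2)
for row in result:
    print(row)
# [0, 0, 0, 0, 0, 0]
# [0, 0, 9, 9, 9, 0]
# [0, 0, 9, 9, 9, 0]
# [0, 0, 0, 0, 0, 0]
```

Only the pixels covered by the text region change. Everything else comes from whichever background is supplied, either the LR input or the output of a BSR method.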

| Real-world LR Text Image | AnyBSR Method | Post-process using our textbsr |
| --- | --- | --- |
| <img src="./GitImgs/LR/test1.jpg" width="250px"> | <img src="./GitImgs/RealESRGAN/test1.jpg" width="250px"> | <img src="./GitImgs/Compare/new.png" width="250px"> |
| <img src="./GitImgs/LR/test42.png" width="250px"> | <img src="./GitImgs/RealESRGAN/test42.png" width="250px"> | <img src="./GitImgs/Ours/test4_BSRGANText2.png" width="250px"> |
| <img src="./GitImgs/LR/00426.png" width="250px"> | <img src="./GitImgs/RealESRGAN/00426_out.png" width="250px"> | <img src="./GitImgs/Ours/00426_BSRGANText.png" width="250px"> |

<img src="GitImgs/Compare/r7.png" width="790px"/> <img src="GitImgs/Compare/r10.png" width="790px"/>


(3) Restore the Aligned Text Region

# From the terminal
textbsr -i [LR_TEXT_PATH] -a

or

# From Python
from textbsr import textbsr
textbsr.bsr(input_path='./testsets/LQs', aligned=True)

| Aligned LR Text Image | TextBSR |
| --- | --- |
| <img src="./GitImgs/Ours/test5_patch_5i.png" width="395px"> | <img src="./GitImgs/Ours/test5_patch_5o.png" width="395px"> |
| <img src="./GitImgs/Compare/en12_patch_4_LR.png" width="395px"> | <img src="./GitImgs/Compare/en12_patch_4_SR.png" width="395px"> |
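
The three usage modes differ only in which flags are passed. As a rough sketch, the hypothetical helper below (build_command is not part of the package) assembles the terminal command for each mode from the documented flags. It only builds the argument list and does not run anything, so it works without textbsr installed. The example paths are the ones used in this README.

```python
import shlex


def build_command(input_path, bg_path=None, output_path=None,
                  aligned=False, save_text=False, device=None):
    """Assemble a `textbsr` command line from the documented flags."""
    cmd = ["textbsr", "-i", input_path]
    if bg_path is not None:
        cmd += ["-b", bg_path]      # background from any BSR method
    if output_path is not None:
        cmd += ["-o", output_path]  # explicit save path
    if aligned:
        cmd.append("-a")            # input is a single-line text region
    if save_text:
        cmd.append("-s")            # also save LR and SR text layouts
    if device is not None:
        cmd += ["-d", device]       # 'gpu' or 'cpu'
    return cmd


modes = {
    "text region restoration": build_command("./testsets/LQs"),
    "post-process BSR results": build_command(
        "./testsets/LQs", bg_path="./testsets/AnyBSRResults", save_text=True),
    "aligned text region": build_command("./testsets/LQs", aligned=True),
}

for name, cmd in modes.items():
    # shlex.join renders the list as a shell-safe string for display.
    print(f"{name}: {shlex.join(cmd)}")
```

With the package installed, any of these lists could be passed to subprocess.run(cmd, check=True) to execute the corresponding command.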

Acknowledgement

This project is built on BSRGAN. We use cnstd for Chinese and English text detection.

:bookmark_tabs: Citation

If you find this package helpful, please consider citing our CVPR 2023 paper MARCONet:

@InProceedings{li2023marconet,
  author    = {Li, Xiaoming and Zuo, Wangmeng and Loy, Chen Change},
  title     = {Learning Generative Structure Prior for Blind Text Image Super-resolution},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2023}
}

:scroll: License

This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a>