Leveraging Text Localization for Scene Text Removal via Text-aware Masked Image Modeling

This is a PyTorch implementation of the TMIM paper.

Installation

1. Requirements

conda create -n tmim python==3.8.12
conda activate tmim
pip install --upgrade pip
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html 
pip install -r requirements.txt
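After installation, a quick check can confirm that the pinned versions are the ones actually installed. The sketch below is not part of the repository: the helper name `check_pins` is hypothetical, and it relies only on the standard library's `importlib.metadata`, so it also reports packages that are missing entirely.

```python
from importlib import metadata

# Versions pinned by the install commands above.
PINS = {"torch": "1.11.0+cu113", "torchvision": "0.12.0+cu113"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINS)
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed}")
    if not problems:
        print("Pinned versions match.")
```

Run it inside the activated `tmim` environment; an empty result means both pins match.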

2. Datasets

Models

Model      Method      PSNR   MSSIM  MSE     AGE   Download
Uformer-B  Pretrained  36.66  97.66  0.0637  1.70  uformer_b_tmim.pth
Uformer-B  Finetuned   37.42  97.70  0.0459  1.52  uformer_b_tmim_str.pth
PERT       Pretrained  34.51  96.63  0.1231  2.11  pert_tmim.pth
PERT       Finetuned   35.66  97.18  0.0729  1.76  pert_tmim_str.pth
EraseNet   Pretrained  34.25  97.03  0.1141  2.23  erasenet_tmim.pth
EraseNet   Finetuned   35.47  97.30  0.0765  1.95  erasenet_tmim_str.pth
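For readers unfamiliar with the column names, the sketch below shows the textbook definitions of MSE, PSNR, and AGE (average gray-level absolute error). It is an illustration only and not the repository's evaluation code: it assumes flat sequences of gray values in the 0-255 range, and the scaling used for the numbers in the table is not stated here, so the outputs are not directly comparable. MSSIM is omitted because it needs a windowed computation.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(pred, target)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)


def age(pred_gray, target_gray):
    """Average absolute difference between gray-level values."""
    total = sum(abs(p - t) for p, t in zip(pred_gray, target_gray))
    return total / len(pred_gray)
```

Higher PSNR and MSSIM are better, while lower MSE and AGE are better, which matches the trend between the pretrained and finetuned rows.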

Inference

python -m torch.distributed.launch --master_port 29501 --nproc_per_node=1 demo.py --cfg configs/uformer_b_str.py --resume path/to/uformer_b_tmim_str.pth --test-dir path/to/image/folder --visualize-dir path/to/result/folder

Training and Testing

1. Pretraining

python -m torch.distributed.launch --master_port 29501 --nproc_per_node=8 train.py --cfg configs/uformer_b_tmim.py --ckpt-name uformer_b_tmim --save-log 
python test.py --cfg configs/uformer_b_tmim.py --ckpt-name uformer_b_tmim/latest.pth --save-log --visualize

2. Finetuning

python -m torch.distributed.launch --master_port 29501 --nproc_per_node=8 train.py --cfg configs/uformer_b_str.py --ckpt-name uformer_b_tmim_str --save-log --resume 'ckpt/uformer_b_tmim/latest.pth'
python test.py --cfg configs/uformer_b_str.py --ckpt-name uformer_b_tmim_str/latest.pth --save-log --visualize
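The training commands differ only in the config, checkpoint name, and optional `--resume` path, so they can be assembled programmatically. The helper below is a hypothetical convenience, not part of the repository; it uses only the flags shown above. Whether the configs need adjusting for a different GPU count is not covered here.

```python
import shlex


def build_train_command(cfg, ckpt_name, nproc_per_node=8,
                        master_port=29501, resume=None):
    """Assemble the train.py launch command as an argument list."""
    cmd = [
        "python", "-m", "torch.distributed.launch",
        "--master_port", str(master_port),
        f"--nproc_per_node={nproc_per_node}",
        "train.py", "--cfg", cfg,
        "--ckpt-name", ckpt_name, "--save-log",
    ]
    if resume is not None:
        # Finetuning starts from a pretrained checkpoint.
        cmd += ["--resume", resume]
    return cmd


if __name__ == "__main__":
    finetune = build_train_command(
        "configs/uformer_b_str.py",
        "uformer_b_tmim_str",
        resume="ckpt/uformer_b_tmim/latest.pth",
    )
    print(shlex.join(finetune))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting problems with paths that contain spaces.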