E2E_TIT_With_MT

E2E_TIT_With_MT: End-to-end Text Image Translation with Machine Translation.

The official repository for ICPR 2022 main conference paper:

1. Introduction

End-to-end text image translation (TIT), which aims to translate source-language text embedded in images into the target language, has attracted intensive attention in recent research. However, data sparsity limits the performance of end-to-end text image translation. Multi-task learning is a nontrivial way to alleviate this problem by exploiting knowledge from complementary related tasks. In this paper, we propose a novel text translation-enhanced text image translation method, which trains the end-to-end model with text translation as an auxiliary task. By sharing model parameters and training on multiple tasks, our model can take full advantage of easily available large-scale parallel text corpora. Extensive experimental results show that our proposed method outperforms existing end-to-end methods, and that joint multi-task learning with both text translation and recognition tasks achieves better results, showing that the MT and OCR auxiliary tasks are complementary.

<img src="./Figures/model.png" style="zoom:180%;" />
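The multi-task idea described above can be illustrated with a minimal sketch of how a main TIT loss might be combined with auxiliary MT and OCR losses. This is not the repository's training code: the function name `joint_loss`, the weights `mt_weight` and `ocr_weight`, and the weighted-sum formulation are all assumptions chosen for illustration.

```python
def joint_loss(tit_loss, mt_loss=None, ocr_loss=None, mt_weight=1.0, ocr_weight=1.0):
    """Combine the main TIT loss with optional auxiliary losses.

    Hypothetical helper: the weighted sum and the default weights are
    assumptions, not values taken from the paper or the repository.
    Works with plain floats or with framework tensors that support + and *.
    """
    total = tit_loss
    if mt_loss is not None:
        # Auxiliary text translation (MT) task.
        total = total + mt_weight * mt_loss
    if ocr_loss is not None:
        # Auxiliary text recognition (OCR) task.
        total = total + ocr_weight * ocr_loss
    return total


if __name__ == "__main__":
    # Illustrative numbers only, not experimental results.
    print(joint_loss(2.0, mt_loss=1.0, ocr_loss=0.5, mt_weight=0.5, ocr_weight=0.2))
```

In a setup like this, gradients from the auxiliary losses would reach whichever parameters the tasks share, which is how a large text-only parallel corpus could help the image translation task.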

2. Usage

2.1 Requirements

2.2 Train the Model

bash ./train_model_guide.sh

2.3 Evaluate the Model

bash ./test_model_guide.sh

2.4 Datasets

We have constructed synthetic, subtitle, and street-view datasets. If you want to use these datasets for research, please contact cong.ma@nlpr.ia.ac.cn.

3. Acknowledgement

The reference code for the provided methods is:

We thank all the researchers who have made their code publicly available.

4. Citation

If you want to cite our paper, please use this BibTeX entry:

If you have any issues, please contact us by email.