
<h2 align="center"><a href="">One-DM: One-Shot Diffusion Mimicker for Handwritten Text Generation</a></h2>
<div align="center">
<a href="https://arxiv.org/abs/2409.04004"><img src="https://img.shields.io/badge/Arxiv-2409.04004-red"></a>
<a href=""><img src="https://img.shields.io/badge/Pytorch-1.13-green"></a>
<a href="https://github.com/dailenson/One-DM/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue"></a>
</div>
<br>
<p align="center">
<img src="assets/js79ccvr33.png" style="width: 200px; height: 200px; margin: 0 auto;">
</p>

## 🌟 Introduction

<div style="display: flex; flex-direction: column; align-items: center; "> <img src="assets/overview_v2.png" style="width: 100%;"> </div> <p align="center" style="margin-bottom: 10px;"> Overview of the proposed One-DM </p>

## 🌠 Release

## 🔨 Requirements

```bash
conda create -n One-DM python=3.8 -y
conda activate One-DM
# install all dependencies
conda env create -f environment.yml
```
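After setting up the environment, an optional sanity check (not part of the original instructions, just a suggested verification) confirms that PyTorch is installed and can see your GPUs:

```bash
# optional: print the PyTorch version, CUDA availability, and visible GPU count
python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())"
```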

## ☀️ Datasets

We provide the English datasets on Google Drive | Baidu Netdisk | ShiZhi AI. Please download these datasets, unzip them, and move the extracted files to /data.
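As a concrete sketch of the extraction step (the archive name `IAM64.zip` below is only a placeholder for whichever file you downloaded, and /data is assumed to mean the `data/` folder at the repository root):

```bash
# placeholder archive name; substitute the file you actually downloaded
mkdir -p data
unzip IAM64.zip -d data/
```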

## 🐳 Model Zoo

| Model | Google Drive | Baidu Netdisk | ShiZhi AI |
| --- | --- | --- | --- |
| Pretrained One-DM | Google Drive | Baidu Netdisk | ShiZhi AI |
| Pretrained OCR model | Google Drive | Baidu Netdisk | ShiZhi AI |
| Pretrained Resnet18 | Google Drive | Baidu Netdisk | ShiZhi AI |

Note: Please download these weights, and move them to /model_zoo. (If you cannot access the pre-trained VAE model available on Hugging Face, please refer to the pinned issue for guidance.)
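After downloading, `model_zoo/` should contain at least the two checkpoints referenced by the commands below; the exact filename of the pretrained One-DM checkpoint depends on the release you download:

```bash
ls model_zoo/
# expected contents (names other than the two below may differ):
#   RN18_class_10400.pth   # pretrained ResNet-18 feature extractor (--feat_model)
#   vae_HTR138.pth         # pretrained OCR model (--ocr_model)
#   <pretrained One-DM checkpoint>
```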

## 🏋️ Training & Test

Train One-DM from scratch on the English dataset:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 train.py \
    --feat_model model_zoo/RN18_class_10400.pth \
    --log English
```

Fine-tune One-DM with the pretrained OCR model:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 train_finetune.py \
    --one_dm ./Saved/IAM64_scratch/English-timestamp/model/epoch-ckpt.pt \
    --ocr_model ./model_zoo/vae_HTR138.pth --log English
```

Note: Please replace `timestamp` and `epoch` in the checkpoint path above with the values from your own training run.
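The commands above assume four visible GPUs. Scaling the launch down should only require adjusting `CUDA_VISIBLE_DEVICES` and `--nproc_per_node` together; the following is an untested sketch rather than a configuration documented here:

```bash
# hypothetical 2-GPU launch of the from-scratch training
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 train.py \
    --feat_model model_zoo/RN18_class_10400.pth \
    --log English
```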

Generate samples with a fine-tuned checkpoint:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 test.py \
   --one_dm ./Saved/IAM64_finetune/English-timestamp/model/epoch-ckpt.pt \
   --generate_type oov_u --dir ./Generated/English
```

Note: As above, replace `timestamp` and `epoch` with the values from your own fine-tuning run.
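Once `test.py` finishes, the synthesized images should be written under the directory passed via `--dir` (here `./Generated/English`); a quick way to check:

```bash
# list a few of the generated files (the exact sub-directory layout may vary)
ls -R ./Generated/English | head
```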

## 📺 Exhibition

<p align="center"> <img src="assets/indus-English_v2.png" style="width: 90%" align=center> </p>
<p align="center"> <img src="assets/indus-Chinese.png" style="width: 90%" align=center> </p>
<p align="center"> <img src="assets/One-DM_result.png" style="width: 100%" align=center> </p>
<p align="center"> <img src="assets/casia_v4.png" style="width: 90%" align=center> </p>

## ❤️ Citation

If you find our work inspiring or use our codebase in your research, please cite our work:

```bibtex
@inproceedings{one-dm2024,
  title={One-Shot Diffusion Mimicker for Handwritten Text Generation},
  author={Dai, Gang and Zhang, Yifan and Ke, Quhui and Guo, Qiangya and Huang, Shuangping},
  booktitle={European Conference on Computer Vision},
  year={2024}
}
```

## ⭐ StarGraph

Star History Chart