<div align=center>

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

</div>


<div align=center>

arXiv preprint Gradio demo Homepage Code

</div> <p align="center"> <strong><a href="#πŸ”₯-model-zoo">πŸ”₯ Model Zoo </a></strong> β€’ <strong><a href="#πŸ› οΈ-installation">πŸ› οΈ Installation </a></strong> β€’ <strong><a href="#πŸ‹οΈ-training">πŸ‹οΈ Training</a></strong> β€’ <strong><a href="#πŸ“Ί-sampling">πŸ“Ί Sampling</a></strong> β€’ <strong><a href="#πŸ“±-run-webui">πŸ“± Run WebUI</a></strong> </p>

🌟 Highlights

Vis_1 Vis_2

πŸ“… News

πŸ”₯ Model Zoo

| Model | Checkpoint | Status |
|:---|:---|:---|
| FontDiffuser | GoogleDrive / BaiduYun:gexg | Released |
| SCR | GoogleDrive / BaiduYun:gexg | Released |

🚧 TODO List

πŸ› οΈ Installation

Prerequisites (Recommended)

Environment Setup

Clone this repo:

git clone https://github.com/yeungchenwa/FontDiffuser.git

Step 0: Download and install Miniconda from the official website.

Step 1: Create a conda environment and activate it.

conda create -n fontdiffuser python=3.9 -y
conda activate fontdiffuser

Step 2: Install a compatible version of PyTorch by following the instructions here.

# Suggested
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

Step 3: Install the required packages.

pip install -r requirements.txt

πŸ‹οΈ Training

Data Construction

The training data should be organized as the tree below (examples are provided in the directory data_examples/train/):

β”œβ”€β”€data_examples
β”‚   └── train
β”‚       β”œβ”€β”€ ContentImage
β”‚       β”‚   β”œβ”€β”€ char0.png
β”‚       β”‚   β”œβ”€β”€ char1.png
β”‚       β”‚   β”œβ”€β”€ char2.png
β”‚       β”‚   └── ...
β”‚       └── TargetImage
β”‚           β”œβ”€β”€ style0
β”‚           β”‚     β”œβ”€β”€style0+char0.png
β”‚           β”‚     β”œβ”€β”€style0+char1.png
β”‚           β”‚     └── ...
β”‚           β”œβ”€β”€ style1
β”‚           β”‚     β”œβ”€β”€style1+char0.png
β”‚           β”‚     β”œβ”€β”€style1+char1.png
β”‚           β”‚     └── ...
β”‚           β”œβ”€β”€ style2
β”‚           β”‚     β”œβ”€β”€style2+char0.png
β”‚           β”‚     β”œβ”€β”€style2+char1.png
β”‚           β”‚     └── ...
β”‚           └── ...
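A quick way to sanity-check a dataset root against this layout is a small stdlib script. The sketch below is not part of the repo; the `ContentImage`/`TargetImage` directory names and the `<style>+<char>.png` naming convention are taken from the tree above.

```python
from pathlib import Path

def check_train_tree(root: str) -> list[str]:
    """Return a list of layout problems; an empty list means the tree looks valid."""
    problems = []
    base = Path(root)
    content_dir = base / "train" / "ContentImage"
    target_dir = base / "train" / "TargetImage"
    if not content_dir.is_dir():
        problems.append(f"missing {content_dir}")
    if not target_dir.is_dir():
        problems.append(f"missing {target_dir}")
        return problems
    # Character names come from the content images, e.g. "char0" from char0.png.
    chars = {p.stem for p in content_dir.glob("*.png")} if content_dir.is_dir() else set()
    for style_dir in sorted(p for p in target_dir.iterdir() if p.is_dir()):
        for img in style_dir.glob("*.png"):
            # Expect "<style>+<char>.png", e.g. "style0+char0.png".
            if "+" not in img.stem:
                problems.append(f"bad name (no '+'): {img}")
                continue
            style, char = img.stem.split("+", 1)
            if style != style_dir.name:
                problems.append(f"style prefix mismatch: {img}")
            if chars and char not in chars:
                problems.append(f"no content image for: {img}")
    return problems
```

Running `check_train_tree("data_examples")` should return an empty list if your data matches the layout above.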

Training Configuration

Before running any of the training scripts (covering the three modes below), set the training configuration, such as distributed training, through:

accelerate config

Training - Pretraining of SCR

Coming Soon ...

Training - Phase 1

sh train_phase_1.sh

Training - Phase 2

After phase 1 training, put the trained checkpoint files (unet.pth, content_encoder.pth, and style_encoder.pth) into the directory phase_1_ckpt. During phase 2 training, these parameters will be resumed.

sh train_phase_2.sh

πŸ“Ί Sampling

Step 1 => Prepare the checkpoint

Option (1): Download the checkpoint from GoogleDrive / BaiduYun:gexg, then place the ckpt folder in the root directory; it should contain the files unet.pth, content_encoder.pth, and style_encoder.pth.
Option (2): Place your re-trained checkpoint folder ckpt in the root directory, containing the same three files: unet.pth, content_encoder.pth, and style_encoder.pth.
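Before launching a sampling script, it can help to confirm that all three checkpoint files are in place. This is a minimal stdlib sketch, not part of the repo; the file names come from the options above.

```python
from pathlib import Path

# Checkpoint files the sampling scripts expect, per the README.
REQUIRED_CKPT_FILES = ("unet.pth", "content_encoder.pth", "style_encoder.pth")

def missing_ckpt_files(ckpt_dir: str) -> list[str]:
    """Return the required checkpoint file names absent from ckpt_dir."""
    d = Path(ckpt_dir)
    return [name for name in REQUIRED_CKPT_FILES if not (d / name).is_file()]
```

For example, `missing_ckpt_files("ckpt")` returns an empty list when the folder is complete, and otherwise lists the names still missing.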

Step 2 => Run the script

(1) Sample an image from a content image and a reference image.

sh script/sample_content_image.sh

(2) Sample an image from a content character.
Note: You may need a .ttf file that contains a large set of Chinese characters; one can be downloaded from BaiduYun:wrth.

sh script/sample_content_character.sh

πŸ“± Run WebUI

(1) Sampling by FontDiffuser

gradio gradio_app.py

Example:

<p align="center"> <img src="figures/gradio_fontdiffuer_new.png" width="80%" height="auto"> </p>

(2) Sampling by FontDiffuser and Rendering by InstructPix2Pix

Coming Soon ...

πŸŒ„ Gallery

Characters of hard level of complexity

vis_hard

Characters of medium level of complexity

vis_medium

Characters of easy level of complexity

vis_easy

Cross-Lingual Generation (Chinese to Korean)

vis_korean

πŸ’™ Acknowledgement

Copyright

Citation

@inproceedings{yang2024fontdiffuser,
  title={FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning},
  author={Yang, Zhenhua and Peng, Dezhi and Kong, Yuxin and Zhang, Yuyi and Yao, Cong and Jin, Lianwen},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024}
}

⭐ Star Rising

Star Rising