Awesome
Recommendations of Diffusion for Text-Image
This repository collects recent papers on diffusion models for text-image generation tasks.
📖 Table of Contents 👀
- Document Restoration
- Font Generation
- Text-to-Image (Visual Text Generation)
- Artistic Font Generation
- Text-Image Removal
- Text-Image Super Resolution
- Text-Image Editing
- Inpainting
- Handwritten Generation
- Scene Text Recognition
- Scene Text Detection
Document Restoration
- 🔥🔥🔥 Predicting the Original Appearance of Damaged Historical Documents (AAAI 2025)
Font Generation
- DiffCJK: Conditional Diffusion Model for High-Quality and Wide-coverage CJK Character Generation (Apr. 2024)
- Font Style Interpolation with Diffusion Models (Feb., 2024)
- 🔥🔥🔥 FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning (AAAI2024)
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation (Dec., 2022)
Text-to-Image (Visual Text Generation)
- GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models (Jul, 2024)
- Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering (Jul, 2024)
- High Fidelity Scene Text Synthesis (Dec, 2023)
- Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering (Mar, 2024)
- Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model (Dec, 2023)
- UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models (Dec, 2023)
- TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering (Nov, 2023)
- AnyText: Multilingual Visual Text Generation and Editing (Nov, 2023)
- TextDiffuser: Diffusion Models as Text Painters (May, 2023)
- GlyphControl: Glyph Conditional Control for Visual Text Generation (May, 2023)
- IF (Apr., 2023)
- GlyphDraw: Seamlessly Rendering Text with Intricate Spatial Structures in Text-to-Image Generation (Mar., 2023)
- Character-aware models improve visual text rendering (Dec., 2022)
- eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers (Nov., 2022)
- Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (May, 2022)
Artistic Font Generation
- Word-As-Image for Semantic Typography (SIGGRAPH2023)
- ControlNet on Text Effect (Jul., 2023)
- DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion (ICCV2023)
Text-Image Removal
Text-Image Super Resolution
- Diffusion-based Blind Text Image Super-Resolution (CVPR2024)
- PEAN: A Diffusion-based Prior-Enhanced Attention Network for Scene Text Image Super-Resolution (Nov. 2023)
- Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution (Nov. 2023)
- Scene Text Image Super-resolution based on Text-conditional Diffusion Models (Nov. 2023)
- DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior (Aug. 2023)
- TextDiff: Mask-Guided Residual Diffusion Models for Scene Text Image Super-Resolution (Aug., 2023)
- DocDiff: Document Enhancement via Residual Diffusion Models (ACMMM2023)
- STIRER: A Unified Model for Low-Resolution Scene Text Image Recovery and Recognition (ACMMM2023)
Text-Image Editing
- On Manipulating Scene Text in the Wild with Diffusion Models (WACV2024)
- DiffUTE: Universal Text Editing Diffusion Model (May, 2023)
- Improving Diffusion Models for Scene Text Editing with Dual Encoders (Apr., 2023)
Inpainting
Handwritten Generation
- Conditional Text Image Generation with Diffusion Models (CVPR2023)
- ChiroDiff: Modelling chirographic data with Diffusion Models (ICLR2023)
- Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition (May, 2023)
- WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models (ICDAR2023)
- Diffusion models for Handwriting Generation (Nov., 2020)
Scene Text Recognition
- DiffusionSTR: Diffusion Model for Scene Text Recognition (ICIP2023)
- IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition (TPAMI2023)