pipeline_tag: text-to-image
library_name: diffusers

GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering

Paper | Project Page | GitHub

GlyphPrinter is a preference-based text rendering framework designed to eliminate the reliance on explicit reward models for visual text generation. It addresses common failure cases in existing text-to-image models, such as stroke distortions and incorrect glyphs, especially when rendering complex Chinese characters, multilingual text, or out-of-domain symbols.

Key Features

R-GDPO (Region-Grouped Direct Preference Optimization): A region-based objective that optimizes inter- and intra-sample preferences over annotated regions, substantially enhancing glyph accuracy.
GlyphCorrector Dataset: A specialized dataset with region-level glyph preference annotations.
Regional Reward Guidance (RRG): An inference strategy that samples from an optimal distribution with controllable glyph accuracy.
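
For orientation, the standard (non-regional) DPO objective that the name R-GDPO builds on is sketched below. This is background only, not the R-GDPO loss itself: the region-grouped, inter- and intra-sample formulation is defined in the paper and is not reproduced here. The symbols are the usual ones and are assumptions with respect to this README: $x$ is the prompt, $y_w$ and $y_l$ are the preferred and dispreferred samples, $\pi_\theta$ is the model being trained, $\pi_{\mathrm{ref}}$ is a frozen reference model, $\beta$ is a temperature, and $\sigma$ is the logistic function.

```latex
\mathcal{L}_{\mathrm{DPO}}(\theta)
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}
    \left[
      \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
        - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \right)
    \right]
```

The loss falls as the model assigns relatively more probability to the preferred sample than the reference model does, which is why no separate reward model is needed during training.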

Usage

To use this model, please follow the installation instructions in the official GitHub repository.

CLI Inference

You can run inference using the provided inference.py script:

# list available saved conditions
python3 inference.py --list-conditions

# run inference using a prompt
python3 inference.py \
  --prompt "The colorful graffiti font <sks1> printed on the street wall" \
  --save-mask

# run inference using a specific condition file
python3 inference.py \
  --condition condition_1.npz \
  --output-dir outputs_inference
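
To call the script from Python (for example, to loop over several prompts), the commands above could be wrapped roughly as follows. This is a minimal sketch, not part of the repository: `build_inference_command` and `run_inference` are hypothetical helper names, only the flags shown above are used, and the script is assumed to be run from the repository root after installation.

```python
import subprocess
import sys
from typing import List, Optional


def build_inference_command(
    prompt: Optional[str] = None,
    condition: Optional[str] = None,
    output_dir: Optional[str] = None,
    save_mask: bool = False,
    script: str = "inference.py",
) -> List[str]:
    """Assemble an argument list from the flags shown in this README.

    Hypothetical helper: it only mirrors --prompt, --condition,
    --output-dir and --save-mask, and does not validate combinations.
    """
    cmd = [sys.executable, script]
    if prompt is not None:
        cmd += ["--prompt", prompt]
    if condition is not None:
        cmd += ["--condition", condition]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    if save_mask:
        cmd.append("--save-mask")
    return cmd


def run_inference(**kwargs) -> None:
    """Run inference.py as a subprocess and raise if it exits with an error."""
    subprocess.run(build_inference_command(**kwargs), check=True)
```

For instance, `run_inference(condition="condition_1.npz", output_dir="outputs_inference")` corresponds to the last command above. Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or characters such as `<` and `>`.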

Gradio Demo

Alternatively, you can run the interactive Gradio app:

python3 app.py

Citation

@inproceedings{GlyphPrinter,
  title={{GlyphPrinter}: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering},
  author={Shuai, Xincheng and Li, Ziye and Ding, Henghui and Tao, Dacheng},
  booktitle={CVPR},
  year={2026}
}