TextPecker-8B-Qwen3VL
TextPecker is a structural anomaly perceptive model designed to enhance Visual Text Rendering (VTR). It addresses a critical bottleneck where standard MLLMs and OCR models fail to perceive structural anomalies such as distortion, blurriness, and misalignment in generated text. This model acts as a plug-and-play evaluator and reward signal for RL-based optimization (e.g., using Flow-GRPO), enabling the generation of structurally faithful visual text.
This checkpoint is built upon the Qwen3-VL-8B-Instruct architecture and was trained using ms-swift.
Model Details
- Developed by: Hanshen Zhu, Yuliang Liu, Xuecheng Wu, An-Lan Wang, Hao Feng, Dingkang Yang, Chao Feng, Can Huang, Jingqun Tang, and Xiang Bai.
- Model Type: Multimodal Large Language Model (MLLM) / Visual Text Rendering Evaluator
- Backbone Model: Qwen/Qwen3-VL-8B-Instruct
- Paper: TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering
- Repository: CIawevy/TextPecker
- License: Apache 2.0
Uses
TextPecker can be used to evaluate text structural quality and semantic consistency for text-to-image generation or editing tasks. It is particularly useful for:
- Structural Anomaly Quantification: Identifying distortion, blurriness, and misalignment in rendered text.
- Reward Modeling: Providing reward signals for Reinforcement Learning (RL) to improve text rendering in generators like Flux or SD3.5.
To use this model, please follow the official deployment and testing instructions:
Citation
If you find TextPecker useful in your research or work, please cite the paper:
@article{zhu2026TextPecker,
title = {TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering},
author = {Zhu, Hanshen and Liu, Yuliang and Wu, Xuecheng and Wang, An-Lan and Feng, Hao and Dingkang Yang and Chao Feng and Can Huang and Jingqun Tang and Xiang Bai},
journal = {arXiv preprint arXiv:2602.20903},
year = {2026}
}
- Downloads last month
- 63
Model tree for CIawevy/TextPecker-8B-Qwen3VL
Base model
Qwen/Qwen3-VL-8B-Instruct