Add paper metadata and improve model card
#1
by nielsr HF Staff - opened

Files changed (1): README.md (+45 -8)
@@ -1,17 +1,54 @@
 ---
+license: apache-2.0
 base_model: OpenGVLab/InternVL3-8B-Instruct
+library_name: transformers
+pipeline_tag: image-text-to-text
+tags:
+- multimodal
+- ocr
+- vtr
+- text-rendering
+- ms-swift
 ---
-# Model Card for Model ID
 
+# TextPecker-8B-InternVL3
 
-<!-- Provide a quick summary of what the model is/does. -->
-This model is trained using ms-swift.
+TextPecker-8B-InternVL3 is an evaluator model presented in the paper [TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering](https://huggingface.co/papers/2602.20903).
+
+While standard Multimodal LLMs often fail to notice fine-grained text errors like distortion or misalignment in generated images, TextPecker is specifically designed to perceive and quantify these structural anomalies to provide reliable reward signals for RL-based optimization of text-to-image models.
+
+This checkpoint is based on the **InternVL3-8B-Instruct** architecture and was trained using the [ms-swift](https://github.com/modelscope/ms-swift) framework on the [TextPecker-1.5M](https://huggingface.co/datasets/CIawevy/TextPecker-1.5M) dataset.
 
 ## Model Details
-### Model Sources
-<!-- Provide the basic links for the model. -->
-- **Repository:** https://github.com/CIawevy/TextPecker/tree/main
-- **Paper:** https://www.arxiv.org/pdf/2602.20903
+- **Developed by:** Hanshen Zhu, Yuliang Liu, et al. (Huazhong University of Science & Technology and ByteDance)
+- **Model Type:** Multimodal Large Language Model (MLLM)
+- **Base Model:** [OpenGVLab/InternVL3-8B-Instruct](https://huggingface.co/OpenGVLab/InternVL3-8B-Instruct)
+- **Task:** Image-to-Text (Structural Anomaly Perception / OCR Evaluator)
+- **License:** Apache 2.0
+
+## Model Sources
+- **Repository:** [https://github.com/CIawevy/TextPecker](https://github.com/CIawevy/TextPecker)
+- **Paper:** [https://huggingface.co/papers/2602.20903](https://huggingface.co/papers/2602.20903)
+- **Dataset:** [CIawevy/TextPecker-1.5M](https://huggingface.co/datasets/CIawevy/TextPecker-1.5M)
 
 ## Uses
-To use our model, please following our official repo: [TextPecker_deploy](https://github.com/CIawevy/TextPecker/tree/main) and [TextPecker_demo](https://github.com/CIawevy/TextPecker/blob/main/eval/TextPecker_eval/demo.py)
+TextPecker can be used to evaluate text structural quality and semantic consistency for text generation or editing scenarios. It helps bridge the gap in Visual Text Rendering (VTR) optimization by providing reliable feedback on character-level structural fidelity.
+
+To use the model for deployment or evaluation, please follow the instructions in the official repository:
+- [TextPecker Deployment Guide](https://github.com/CIawevy/TextPecker/tree/main)
+- [TextPecker Evaluation Demo](https://github.com/CIawevy/TextPecker/blob/main/eval/TextPecker_eval/demo.py)
+
+## Citation
+If you find TextPecker useful in your research, please cite:
+
+```bibtex
+@article{zhu2026TextPecker,
+title = {TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering},
+author = {Zhu, Hanshen and Liu, Yuliang and Wu, Xuecheng and Wang, An-Lan and Feng, Hao and Yang, Dingkang and Feng, Chao and Huang, Can and Tang, Jingqun and Bai, Xiang},
+journal = {arXiv preprint arXiv:2602.20903},
+year = {2026}
+}
+```
+
+## Acknowledgement
+Training was conducted using the **ms-swift** framework. We thank the authors of InternVL and ms-swift for their excellent open-source contributions.
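Since the diff adds `library_name: transformers` and `pipeline_tag: image-text-to-text` to the metadata, the new Uses section could be paired with a minimal inference sketch. This is a hedged illustration, not part of the PR: the model repo id `CIawevy/TextPecker-8B-InternVL3`, the image path, and the scoring prompt are assumptions — the real prompt format should be taken from the official demo script linked above.

```python
# Hedged sketch: querying a TextPecker-style evaluator through the
# transformers "image-text-to-text" pipeline. The repo id and prompt
# below are assumptions; follow the official demo script for real usage.

def build_messages(image_path: str, prompt: str) -> list:
    """Build the chat-style input expected by image-text-to-text pipelines:
    one user turn containing an image entry followed by a text entry."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_path},
                {"type": "text", "text": prompt},
            ],
        }
    ]

if __name__ == "__main__":
    messages = build_messages(
        "rendered_text.png",  # hypothetical image produced by a T2I model
        "Rate the structural quality of the visual text in this image.",
    )
    # Heavy step (requires transformers and a GPU; downloads an 8B checkpoint):
    # from transformers import pipeline
    # evaluator = pipeline("image-text-to-text",
    #                      model="CIawevy/TextPecker-8B-InternVL3")
    # print(evaluator(text=messages, max_new_tokens=128))
```

The heavy pipeline call is left commented out so the message-building step can be inspected without pulling the checkpoint.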