Instructions for using CIawevy/TextPecker-8B-InternVL3 with libraries, inference providers, notebooks, and local apps. The sections below show how to get started.
- Libraries
- Transformers
How to use CIawevy/TextPecker-8B-InternVL3 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="CIawevy/TextPecker-8B-InternVL3", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("CIawevy/TextPecker-8B-InternVL3", trust_remote_code=True, dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use CIawevy/TextPecker-8B-InternVL3 with vLLM:
Install from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "CIawevy/TextPecker-8B-InternVL3"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "CIawevy/TextPecker-8B-InternVL3",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
```
Use Docker:
```shell
docker model run hf.co/CIawevy/TextPecker-8B-InternVL3
```
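The curl call above can also be made from Python. The sketch below builds the same OpenAI-style chat payload with the standard library only; it assumes a vLLM server is already running on `localhost:8000` as shown above, and the helper names (`build_request`, `query`) are illustrative, not part of any official API.

```python
# Minimal sketch: call a vLLM OpenAI-compatible endpoint from Python.
# Assumes a server started with: vllm serve "CIawevy/TextPecker-8B-InternVL3"
import json
import urllib.request


def build_request(model: str, image_url: str, question: str) -> dict:
    """Build an OpenAI-style chat payload with one text part and one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def query(base_url: str, payload: dict) -> str:
    """POST the payload to /v1/chat/completions and return the first reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


payload = build_request(
    "CIawevy/TextPecker-8B-InternVL3",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
    "Describe this image in one sentence.",
)
# print(query("http://localhost:8000", payload))  # requires the running server
```

Any OpenAI-compatible client (e.g. the `openai` Python package pointed at `base_url="http://localhost:8000/v1"`) works the same way.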
- SGLang
How to use CIawevy/TextPecker-8B-InternVL3 with SGLang:
Install from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "CIawevy/TextPecker-8B-InternVL3" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "CIawevy/TextPecker-8B-InternVL3",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
```
Use Docker images:
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "CIawevy/TextPecker-8B-InternVL3" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "CIawevy/TextPecker-8B-InternVL3",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
```
- Docker Model Runner
How to use CIawevy/TextPecker-8B-InternVL3 with Docker Model Runner:
```shell
docker model run hf.co/CIawevy/TextPecker-8B-InternVL3
```
TextPecker-8B-InternVL3
TextPecker-8B-InternVL3 is an evaluator model presented in the paper TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering.
While standard Multimodal LLMs often fail to notice fine-grained text errors like distortion or misalignment in generated images, TextPecker is specifically designed to perceive and quantify these structural anomalies to provide reliable reward signals for RL-based optimization of text-to-image models.
This checkpoint is based on the InternVL3-8B-Instruct architecture and was trained using the ms-swift framework on the TextPecker-1.5M dataset.
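To turn an evaluator's textual verdict into a scalar reward for RL-based optimization, the generated reply must be parsed and normalized. The sketch below is a hypothetical illustration only: the prompt format, the 0–10 scale, and the helper names are assumptions, not the paper's protocol (the official repository defines the actual one).

```python
# Hypothetical reward extraction from an evaluator's free-text reply.
# The 0-10 scale is an illustrative assumption, not TextPecker's actual scheme.
import re


def parse_score(reply: str, lo: float = 0.0, hi: float = 10.0) -> float:
    """Extract the first number in the reply and clamp it to [lo, hi]."""
    m = re.search(r"-?\d+(?:\.\d+)?", reply)
    if m is None:
        raise ValueError(f"no numeric score in reply: {reply!r}")
    return max(lo, min(hi, float(m.group())))


def to_reward(score: float, hi: float = 10.0) -> float:
    """Normalize a clamped score into [0, 1] for use as an RL reward."""
    return score / hi
```

For example, a reply like `"Structural fidelity: 7.5 / 10"` would parse to 7.5 and normalize to a reward of 0.75 under these assumptions.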
Model Details
- Developed by: Hanshen Zhu, Yuliang Liu, et al. (Huazhong University of Science and Technology and ByteDance)
- Model Type: Multimodal Large Language Model (MLLM)
- Base Model: OpenGVLab/InternVL3-8B-Instruct
- Task: Image-to-Text (Structural Anomaly Perception / OCR Evaluator)
- License: Apache 2.0
Model Sources
- Repository: https://github.com/CIawevy/TextPecker
- Paper: https://huggingface.co/papers/2602.20903
- Dataset: CIawevy/TextPecker-1.5M
Uses
TextPecker can be used to evaluate text structural quality and semantic consistency for text generation or editing scenarios. It helps bridge the gap in Visual Text Rendering (VTR) optimization by providing reliable feedback on character-level structural fidelity.
To use the model for deployment or evaluation, please follow the instructions in the official repository: https://github.com/CIawevy/TextPecker
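For contrast with what the evaluator provides, semantic consistency alone is often approximated by comparing a target string against OCR output. The naive character-level sketch below (not the paper's method; all names are illustrative) shows such a baseline; TextPecker goes beyond it by also quantifying structural anomalies, like distortion or misalignment, that plain string matching cannot see.

```python
# Naive character-level consistency baseline (NOT TextPecker's method):
# normalized Levenshtein similarity between target text and recognized text.
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                 # deletion
                curr[j - 1] + 1,             # insertion
                prev[j - 1] + (ca != cb),    # substitution (0 if chars match)
            ))
        prev = curr
    return prev[-1]


def consistency(target: str, recognized: str) -> float:
    """1.0 for an exact match, approaching 0.0 as the strings diverge."""
    if not target and not recognized:
        return 1.0
    return 1.0 - levenshtein(target, recognized) / max(len(target), len(recognized))
```

Such a score treats a structurally mangled but still-recognizable character as fully correct, which is exactly the gap in character-level structural fidelity the model card describes.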
Citation
If you find TextPecker useful in your research, please cite:
```bibtex
@article{zhu2026TextPecker,
  title   = {TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering},
  author  = {Zhu, Hanshen and Liu, Yuliang and Wu, Xuecheng and Wang, An-Lan and Feng, Hao and Yang, Dingkang and Feng, Chao and Huang, Can and Tang, Jingqun and Bai, Xiang},
  journal = {arXiv preprint arXiv:2602.20903},
  year    = {2026}
}
```
Acknowledgement
Training was conducted using the ms-swift framework. We thank the authors of InternVL and ms-swift for their excellent open-source contributions.