ChineseErrorCorrector4-4B (CSRP)

ChineseErrorCorrector4-4B is a high-precision Chinese Grammatical Error Correction (CGEC) and Chinese Spelling Check (CSC) model, presented in the paper CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards.

🔥 Recent Updates

Date	Update
2026-05	🎉 Paper "CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards" accepted as Oral at ACL 2026
2026-05	🚀 Released ChineseErrorCorrector4-4B, achieving new SOTA on both NACGEC and CSCD benchmarks

💡 Introduction

ChineseErrorCorrector4-4B is built on the CSRP (CPT → SFT → RL) three-stage training framework.

The Problem: Over-Correction Bias

LLM-based correction systems often suffer from over-correction bias: the model paraphrases text that is already correct instead of leaving it untouched. CSRP addresses this by calibrating the model's decision boundaries through a structured three-stage curriculum:

Stage	Name	Description
Phase I	Balanced Continued Pre-training (CPT)	Internalizes linguistic priors using 5.9M samples with an 8:2 mixture of general and correction-specific data
Phase II	Rationale-Augmented SFT	Distills Chain-of-Thought reasoning paths to guide the model in diagnosing error types before executing corrections
Phase III	Efficiency-Aware Policy Alignment	Uses GRPO with a novel Efficiency-Aware Reward (EAR) to penalize unnecessary edits and reward surgical precision
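
To make the Phase III idea concrete, here is a toy sketch. It is not the paper's EAR formulation, which this card does not spell out. It assumes an exact-match correctness signal, a per-character penalty for edits beyond those the reference needs, and the usual GRPO step of normalizing rewards within a group of sampled candidates. The function names, the `penalty` value, and the example sentences are all hypothetical.

```python
import difflib
from statistics import mean, pstdev


def edit_count(source: str, hypothesis: str) -> int:
    """Count characters touched by non-equal diff operations."""
    matcher = difflib.SequenceMatcher(None, source, hypothesis, autojunk=False)
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


def toy_reward(source: str, hypothesis: str, reference: str, penalty: float = 0.1) -> float:
    """Hypothetical reward: exact-match credit minus a penalty per excess edit."""
    correct = 1.0 if hypothesis == reference else 0.0
    excess = max(0, edit_count(source, hypothesis) - edit_count(source, reference))
    return correct - penalty * excess


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize rewards within one sampled group (mean 0, unit scale)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + 1e-6) for r in rewards]


source = "我跟我朋唷去法国。"
reference = "我跟我朋友去法国。"
candidates = [
    reference,                # minimal, correct edit
    source,                   # no edit: the typo is left in place
    "我和我的朋友去法国。",    # fixes the typo but also paraphrases
]
rewards = [toy_reward(source, c, reference) for c in candidates]
advantages = group_relative_advantages(rewards)
for c, r, a in zip(candidates, rewards, advantages):
    print(f"{c}\treward={r:+.2f}\tadvantage={a:+.2f}")
```

Under these assumptions the minimal correction gets the highest advantage, the untouched sentence sits in the middle, and the paraphrase is ranked last because its extra edits are penalized.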

📊 Benchmark Results

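Leaderboard 1: Chinese Grammatical Error Correction (CGEC), NACGEC benchmark

On native-speaker and learner text, CSRP (4B) sets a new SOTA with an $F_{0.5}$ of 50.99, clearly ahead of the previous best specialized LLM.

Model (Scale)	Precision	Recall	$F_{0.5}$ (primary metric)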
BART	34.67	41.88	35.91
HW-CGEC	50.95	32.29	45.26
ScholarGEC (14B)	45.08	59.33	47.35
CEC3 (4B)	54.20	34.75	48.74
CSRP (4B) [Ours] ✅	57.17	35.60	50.99
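
$F_{0.5}$ is the standard $F_\beta$ score with $\beta = 0.5$, which weights precision more heavily than recall. As a sanity check, the short sketch below recomputes the score from the Precision and Recall columns of the table above. The function name is illustrative, and a recomputed value can differ slightly from a reported one because the published precision and recall are themselves rounded.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score; beta < 1 weights precision more than recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall taken from the NACGEC table rows.
csrp = f_beta(57.17, 35.60)
cec3 = f_beta(54.20, 34.75)
print(f"CSRP (4B): {csrp:.2f}")  # 50.99
print(f"CEC3 (4B): {cec3:.2f}")  # 48.74
```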

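Leaderboard 2: Chinese Spelling Check (CSC), CSCD benchmark

CSRP also performs strongly on character-level correction F1, reaching 59.61 and outperforming few-shot GPT-4 (54.41).

Model	Correction F1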
BERT	25.49
SoftMask	44.48
SMBERT	44.67
MDCSpell+ARM	48.93
GPT-4 (Few-shot)	54.41
CSRP (4B) [Ours] ✅	59.61

🛠️ Quick Start

Requirements

pip install -U transformers torch

Note: Requires transformers >= 4.51.0 for Qwen3 architecture support.

Inference with Transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "twnlp/ChineseErrorCorrector4-4B"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

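# System prompt (kept in Chinese, since this exact text is what gets passed to the model).
# English gloss: "You are a professional error-correction expert. Analyze the grammatical
# error type of the input sentence and the reason for the fix, and output only the corrected
# sentence. Error types: typo, collocation error, part-of-speech error, word-order error,
# missing component, redundant component, misused connective, unclear reference,
# semantic/logical incoherence, no error."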
instruction = (
    "假如你是一名专业的纠错专家，请分析输入句子的语法错误类型和修改原因，"
    "并只输出纠正后的语句，错误类型如下：错别字、词语搭配错误、词性错误、"
    "语序错误、成分残缺、成分赘余、关联词使用错误、指代不明、语义逻辑不通、无误。"
)

text_input = "下个星期，我跟我朋唷打算去法国玩儿。"

messages = [
    {"role": "system", "content": instruction},
    {"role": "user",   "content": text_input}
]

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.1
)

response = tokenizer.decode(
    generated_ids[0][len(model_inputs.input_ids[0]):],
    skip_special_tokens=True
)
print(response)

📝 Output Example

Input:

下个星期，我跟我朋唷打算去法国玩儿。

Model Output:

<think>
错误类型：错别字
修改原因：原句中的"朋唷"是错误写法，正确应为"朋友"。
"唷"是语气助词，不能用于此处指代同伴。
正确句使用"朋友"准确表达了与说话者一同前往的人，避免了因错别字造成的语义误解。
</think>

下个星期，我跟我朋友打算去法国玩儿。
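
Downstream code usually wants only the corrected sentence, not the reasoning. A minimal sketch for separating the two, assuming the response follows the format shown above (one `<think>...</think>` block followed by the correction). The helper name `split_response` is hypothetical and not part of the model's API.

```python
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_response(response: str) -> tuple[str, str]:
    """Return (reasoning, corrected_sentence) from a raw model response."""
    match = THINK_BLOCK.search(response)
    if match is None:
        # No reasoning block: treat the whole response as the correction.
        return "", response.strip()
    reasoning = match.group(1).strip()
    corrected = response[match.end():].strip()
    return reasoning, corrected


# Shortened copy of the example output above.
raw = (
    "<think>\n"
    "错误类型：错别字\n"
    "修改原因：原句中的\"朋唷\"是错误写法，正确应为\"朋友\"。\n"
    "</think>\n"
    "\n"
    "下个星期，我跟我朋友打算去法国玩儿。"
)
reasoning, corrected = split_response(raw)
print(corrected)
```

If generation is cut off by `max_new_tokens` before `</think>` is emitted, the pattern will not match and the whole text is returned as the correction, so callers may want to check for an unclosed `<think>` tag.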

📜 License

This project is released under the Apache 2.0 License.

Citation

If this work is helpful to you, please consider citing it:

@misc{tian2026csrpchainofthoughtreasoningchinese,
      title={CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards}, 
      author={Wei Tian and Yuhao Zhou and Man Lan},
      year={2026},
      eprint={2606.00020},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2606.00020}, 
}