---
license: cc-by-nc-4.0
language:
- zh
base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-classification
datasets:
- zjunlp/ChineseHarm-bench
---
## 🚀Installation
1. Clone the repositories:
```bash
git clone https://github.com/zjunlp/ChineseHarm-bench
cd ChineseHarm-bench
git clone https://github.com/hiyouga/LLaMA-Factory
```
2. Install dependencies:
```bash
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
```
## 📚Inference
We release the following variants of our harmful content detection model:
### 🔹 Single Inference (Example)
Run single-input inference using the ChineseGuard-7B model:
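```bash
# Paths assume the current directory is LLaMA-Factory
# (the last `cd` in the installation steps).
SCRIPT_PATH="../infer/single_infer.py"
model_name="zjunlp/ChineseGuard-7B"
# Example input, roughly: "SMS sending service, contact me if interested, no deposit"
text="代发短信,有想做的联系我,无押金"
# Quote the variables so each value is passed as a single argument.
python "$SCRIPT_PATH" \
    --model_name "$model_name" \
    --text "$text"
```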
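
To try a handful of texts without preparing a benchmark file, the single-inference command above can be wrapped in a small loop. The following is a minimal sketch, not part of the repository: the function `classify_lines` and the variables `MODEL_NAME` and `PYTHON_BIN` are hypothetical names introduced here, and the only flags used are `--model_name` and `--text` from the example above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): classify each non-empty
# line of a text file by calling single_infer.py once per line.

# Defaults mirror the single-inference example; override via the environment.
SCRIPT_PATH="${SCRIPT_PATH:-../infer/single_infer.py}"
MODEL_NAME="${MODEL_NAME:-zjunlp/ChineseGuard-7B}"
PYTHON_BIN="${PYTHON_BIN:-python}"

classify_lines() {
    local input_file="$1"
    local line
    # `|| [ -n "$line" ]` keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip empty lines.
        [ -z "$line" ] && continue
        # Redirect stdin so the child process cannot consume the loop's input.
        "$PYTHON_BIN" "$SCRIPT_PATH" \
            --model_name "$MODEL_NAME" \
            --text "$line" < /dev/null
    done < "$input_file"
}

# Usage (texts.txt is a hypothetical file with one text per line):
#   classify_lines texts.txt
```

Each call starts a new Python process, so the model is presumably loaded again for every line. This is only reasonable for a few inputs; for anything larger, the batch script is the better fit.
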
### 🔸 Batch Inference (Multi-NPU or Multi-GPU)
To run inference on the entire ChineseHarm-Bench using ChineseGuard-7B and 8 NPUs:
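```bash
# Paths assume the current directory is LLaMA-Factory
# (the last `cd` in the installation steps).
SCRIPT_PATH="../infer/batch_infer.py"
model_name="zjunlp/ChineseGuard-7B"
file_name="../benchmark/bench.json"
output_file="../benchmark/bench_ChineseGuard-7B.json"
# Quote the variables so paths with spaces are passed intact.
python "$SCRIPT_PATH" \
    --model_name "$model_name" \
    --file_name "$file_name" \
    --output_file "$output_file" \
    --num_npus 8
```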
> For more configuration options (e.g., batch size, device selection, custom prompt templates), please refer to `single_infer.py` and `batch_infer.py`.
>
> **Note:** The inference scripts support both NPU and GPU devices.
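
Since the scripts support both device types, it can help to check which accelerator tooling is installed before choosing options. The sketch below is an assumption-laden convenience, not something the repository provides: `detect_accelerator` is a hypothetical function that only checks whether `npu-smi` (Ascend) or `nvidia-smi` (NVIDIA) is on `PATH`. It does not verify that a device is usable, and the device-selection options themselves should be taken from the scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which accelerator management tool is on PATH.
# Presence of the tool suggests, but does not prove, that a device is usable.
detect_accelerator() {
    if command -v npu-smi > /dev/null 2>&1; then
        echo "npu"
    elif command -v nvidia-smi > /dev/null 2>&1; then
        echo "gpu"
    else
        echo "none"
    fi
}

# Usage:
#   echo "Detected accelerator tooling: $(detect_accelerator)"
```
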
## 🚩Citation
Please cite our repository if you use ChineseGuard in your work. Thanks!
```bibtex
@misc{liu2025chineseharmbenchchineseharmfulcontent,
title={ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark},
author={Kangwei Liu and Siyuan Cheng and Bozhong Tian and Xiaozhuan Liang and Yuyang Yin and Meng Han and Ningyu Zhang and Bryan Hooi and Xi Chen and Shumin Deng},
year={2025},
eprint={2506.10960},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.10960},
}
```