Text Generation
PEFT
Safetensors
Chinese
English
lora
tool-selection
tool-call
guardrail
chinese
traditional-chinese
fine-tuned
qwen2
conversational
Instructions to use GOSHUNCLE/tool_call_validator_zh with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use GOSHUNCLE/tool_call_validator_zh with PEFT:
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
model = PeftModel.from_pretrained(base_model, "GOSHUNCLE/tool_call_validator_zh")
```
- Notebooks
- Google Colab
- Kaggle
Upload README.md
README.md
CHANGED
|
@@ -22,14 +22,16 @@ tags:
|
|
| 22 |
|
| 23 |
# tool_call_validator_zh
|
| 24 |
|
| 25 |
-
> LoRA fine-tune of Qwen2.5-3B-Instruct
|
| 26 |
> Traditional Chinese tool-call validator (guardrail) — LoRA fine-tune of Qwen2.5-3B-Instruct
|
| 27 |
|
| 28 |
---
|
| 29 |
|
| 30 |
## Chinese Description
|
| 31 |
|
| 32 |
-
This is a Traditional Chinese model fine-tuned for **Tool Call Validation**. It is trained with LoRA on top of [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) and can:
|
| 33 |
|
| 34 |
1. Read the user request (user prompt) and the descriptions of multiple candidate tools
|
| 35 |
2. Select the best-fitting tool through semantic matching, or refuse to match when no tool is suitable
|
|
@@ -170,6 +172,49 @@ Fallback when invalid: `{signal: "abstain", confidence: "low", selected_tool: nu
|
|
| 170 |
|
| 171 |
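The reasoning in the training samples leans toward a translated, formal written style (as in memory_2 IC Firewall), so responses to highly colloquial input may read as somewhat stiff.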
|
| 172 |
|
| 173 |
### Disclaimer
|
| 174 |
|
| 175 |
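The 8 tool names in the training data (web_search and others) are **synthetic and fictional**, used to demonstrate the methodology. All slot-pool content such as stock tickers, people, and places consists of examples drawn from public information and implies no commercial relationship.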
|
|
@@ -216,6 +261,45 @@ The base Qwen2.5-3B-Instruct achieves 57% tool accuracy and 48% confidence accur
|
|
| 216 |
| Training time | ~4.4 hours |
|
| 217 |
| Best eval_loss | 0.0051 |
|
| 218 |
|
| 219 |
### Methodology Inheritance
|
| 220 |
|
| 221 |
This model inherits the methodology from [GOSHUNCLE/ic_content_firewall_zh](https://huggingface.co/GOSHUNCLE/ic_content_firewall_zh) (IC design industry content firewall):
|
|
|
|
| 22 |
|
| 23 |
# tool_call_validator_zh
|
| 24 |
|
| 25 |
+
> Traditional Chinese tool-call validation / guardrail model · LoRA fine-tune of Qwen2.5-3B-Instruct
|
| 26 |
> Traditional Chinese tool-call validator (guardrail) — LoRA fine-tune of Qwen2.5-3B-Instruct
|
| 27 |
|
| 28 |
+
**🚀 [Try the live demo →](https://huggingface.co/spaces/GOSHUNCLE/tool_call_validator_zh_demo)** · **📦 [Methodology lineage: ic_content_firewall_zh](https://huggingface.co/GOSHUNCLE/ic_content_firewall_zh)**
|
| 29 |
+
|
| 30 |
---
|
| 31 |
|
| 32 |
## Chinese Description
|
| 33 |
|
| 34 |
+
This is a Traditional Chinese model fine-tuned for **Tool Call Validation / Guardrail** use. It is trained with LoRA on top of [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) and can:
|
| 35 |
|
| 36 |
1. Read the user request (user prompt) and the descriptions of multiple candidate tools
|
| 37 |
2. Select the best-fitting tool through semantic matching, or refuse to match when no tool is suitable
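Because the validator either picks a tool or declines to match, a caller usually wants to parse its answer defensively. The following is a minimal sketch under stated assumptions, not the card's actual post-processing: it assumes the model returns a JSON object with the `signal`, `confidence`, and `selected_tool` fields that appear in the fallback shown in this diff's hunk context, and it assumes the truncated fallback value for `selected_tool` is null. `parse_validator_output`, `FALLBACK`, and `REQUIRED_KEYS` are hypothetical names.

```python
import json

# Assumption: mirrors the fallback quoted in the README for invalid output.
# The quoted line is truncated after "selected_tool: nu", so None is a guess.
FALLBACK = {"signal": "abstain", "confidence": "low", "selected_tool": None}
REQUIRED_KEYS = ("signal", "confidence", "selected_tool")


def parse_validator_output(raw: str) -> dict:
    """Parse the model's raw text into a decision dict, or return a copy of FALLBACK."""
    try:
        parsed = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Not valid JSON (or not a string at all): treat as an invalid answer.
        return dict(FALLBACK)
    if not isinstance(parsed, dict) or any(key not in parsed for key in REQUIRED_KEYS):
        # Valid JSON but not the expected object shape.
        return dict(FALLBACK)
    # Keep only the expected fields so downstream code sees a stable shape.
    return {key: parsed[key] for key in REQUIRED_KEYS}
```

Returning a copy of the fallback rather than the shared dict keeps callers from mutating the default by accident.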
|
|
|
|
| 172 |
|
| 173 |
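The reasoning in the training samples leans toward a translated, formal written style (as in memory_2 IC Firewall), so responses to highly colloquial input may read as somewhat stiff.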
|
| 174 |
|
| 175 |
+
### Deployment Notes
|
| 176 |
+
|
| 177 |
+
#### Gradio + huggingface_hub compatibility shim
|
| 178 |
+
|
| 179 |
+
To integrate this model into a **Gradio app** (including HF Spaces), add the following monkey-patch before `import gradio` to avoid `ImportError: cannot import name 'HfFolder' from 'huggingface_hub'`:
|
| 180 |
+
|
| 181 |
+
```python
# Compat shim: huggingface_hub >= 1.0 removed HfFolder, but gradio (4.x and 5.x) still imports it
import huggingface_hub as _hf_hub

if not hasattr(_hf_hub, "HfFolder"):
    class _HfFolderShim:
        @staticmethod
        def get_token():
            try: return _hf_hub.get_token()
            except Exception: return None

        @staticmethod
        def save_token(token):
            try: _hf_hub.login(token=token)
            except Exception: pass

        @staticmethod
        def delete_token():
            try: _hf_hub.logout()
            except Exception: pass

    _hf_hub.HfFolder = _HfFolderShim

import gradio as gr  # safe now
```
|
| 202 |
+
|
| 203 |
+
For a complete example, see [Demo Space app.py](https://huggingface.co/spaces/GOSHUNCLE/tool_call_validator_zh_demo/blob/main/app.py).
|
| 204 |
+
|
| 205 |
+
#### Recommended deployment platforms
|
| 206 |
+
|
| 207 |
+
| Platform | Latency / sample | Use case |
|---|---|---|
| HF free CPU Space (2 vCPU, 16 GB) | 90-180 s | Demo / validation |
| HF T4 GPU Space (~$0.40/hr) | 1-3 s | Light production |
| Local NVIDIA GPU (RTX 3060+) | 1-2 s | Self-host |
| Local CPU (Intel Core Ultra 7+) | 30-60 s | Offline batch |
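The per-sample figures in the table translate directly into throughput and cost. As a rough worked example, the sketch below applies plain arithmetic to the table's numbers, assuming requests are processed one at a time with no batching. `throughput_per_hour` and `cost_per_1000` are hypothetical helper names.

```python
def throughput_per_hour(latency_s: float) -> float:
    """Samples processed per hour at a fixed per-sample latency, one at a time."""
    return 3600.0 / latency_s


def cost_per_1000(latency_s: float, usd_per_hour: float) -> float:
    """Compute cost in USD of 1000 sequential samples at an hourly rate."""
    return 1000 * latency_s / 3600.0 * usd_per_hour


# Free CPU Space: 90-180 s per sample
print(throughput_per_hour(180), throughput_per_hour(90))  # 20.0 40.0

# T4 GPU Space at ~$0.40/hr: 1-3 s per sample
print(round(cost_per_1000(1, 0.40), 2), round(cost_per_1000(3, 0.40), 2))  # 0.11 0.33
```

Under those assumptions, the free CPU Space handles roughly 20 to 40 requests per hour, while 1000 requests on the T4 Space cost on the order of $0.11 to $0.33 in compute time.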
|
| 213 |
+
|
| 214 |
+
#### GGUF quantization (not implemented, v2 backlog)
|
| 215 |
+
|
| 216 |
+
For faster CPU inference, consider merging the LoRA adapter and then converting to GGUF Q4; CPU inference is estimated to drop to ~5-10 s per sample.
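The merge step described above can be sketched with PEFT's `merge_and_unload()`. This is a minimal sketch rather than a tested pipeline for this model: `merge_lora` and the output directory are hypothetical, the tokenizer is assumed to come from the base model, and the later GGUF conversion and Q4 quantization are left out because the card lists them as not implemented.

```python
def merge_lora(base_id: str, adapter_id: str, out_dir: str) -> None:
    """Fold the LoRA weights into the base model and save a standalone checkpoint."""
    # Local imports keep this module importable without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()  # plain model with the LoRA deltas applied
    merged.save_pretrained(out_dir)
    # Assumption: the base model's tokenizer is the right one to ship alongside.
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)


# Example call (downloads both repositories; the output path is hypothetical):
# merge_lora(
#     "Qwen/Qwen2.5-3B-Instruct",
#     "GOSHUNCLE/tool_call_validator_zh",
#     "./tool_call_validator_zh_merged",
# )
```

The saved directory is a regular Transformers checkpoint, which is the usual starting point for GGUF conversion tools.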
|
| 217 |
+
|
| 218 |
### Disclaimer
|
| 219 |
|
| 220 |
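The 8 tool names in the training data (web_search and others) are **synthetic and fictional**, used to demonstrate the methodology. All slot-pool content such as stock tickers, people, and places consists of examples drawn from public information and implies no commercial relationship.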
|
|
|
|
| 261 |
| Training time | ~4.4 hours |
|
| 262 |
| Best eval_loss | 0.0051 |
|
| 263 |
|
| 264 |
+
### Deployment Notes
|
| 265 |
+
|
| 266 |
+
#### Gradio compatibility shim
|
| 267 |
+
|
| 268 |
+
If you integrate this model into a **Gradio app** (including HF Spaces), add this monkey-patch before `import gradio` to avoid `ImportError: cannot import name 'HfFolder' from 'huggingface_hub'`:
|
| 269 |
+
|
| 270 |
+
```python
# Compat shim: huggingface_hub >= 1.0 removed HfFolder, but gradio (4.x and 5.x) still imports it
import huggingface_hub as _hf_hub

if not hasattr(_hf_hub, "HfFolder"):
    class _HfFolderShim:
        @staticmethod
        def get_token():
            try: return _hf_hub.get_token()
            except Exception: return None

        @staticmethod
        def save_token(token):
            try: _hf_hub.login(token=token)
            except Exception: pass

        @staticmethod
        def delete_token():
            try: _hf_hub.logout()
            except Exception: pass

    _hf_hub.HfFolder = _HfFolderShim

import gradio as gr  # safe now
```
|
| 291 |
+
|
| 292 |
+
See the full example in [Demo Space app.py](https://huggingface.co/spaces/GOSHUNCLE/tool_call_validator_zh_demo/blob/main/app.py).
|
| 293 |
+
|
| 294 |
+
#### Inference latency by platform
|
| 295 |
+
|
| 296 |
+
| Platform | Latency / sample | Use case |
|---|---|---|
| HF free CPU Space (2 vCPU, 16 GB) | 90-180 s | Demo / validation |
| HF T4 GPU Space (~$0.40/hr) | 1-3 s | Light production |
| Local NVIDIA GPU (RTX 3060+) | 1-2 s | Self-host |
| Local CPU (Intel Core Ultra 7+) | 30-60 s | Offline batch |
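Latency depends heavily on hardware, so it can be worth measuring on your own setup before choosing a platform. The helper below is a generic timing sketch, not part of this repository: `measure_latency` is a hypothetical name, and `infer` stands for whatever callable wraps your own generation call.

```python
import statistics
import time
from typing import Callable, Iterable


def measure_latency(infer: Callable[[str], object], prompts: Iterable[str]) -> dict:
    """Time each call to `infer` and summarize per-sample latency in seconds."""
    timings = []
    for prompt in prompts:
        start = time.perf_counter()
        infer(prompt)
        timings.append(time.perf_counter() - start)
    if not timings:
        raise ValueError("at least one prompt is required")
    return {
        "n": len(timings),
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }
```

Reporting the median alongside the minimum and maximum makes a slow first call, which often includes warm-up, easy to spot.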
|
| 302 |
+
|
| 303 |
### Methodology Inheritance
|
| 304 |
|
| 305 |
This model inherits the methodology from [GOSHUNCLE/ic_content_firewall_zh](https://huggingface.co/GOSHUNCLE/ic_content_firewall_zh) (IC design industry content firewall):
|