Update README_zh
Browse files- README_zh.md +1 -1
README_zh.md
CHANGED
|
@@ -118,7 +118,7 @@ modelscope download --model cialtion/SimpleTool \
|
|
| 118 |
| RT-Qwen2.5-14B-AWQ | 14B | ~130ms | [🤗](https://huggingface.co/Cialtion/SimpleTool/tree/main/RT-Qwen2.5-14B-AWQ) | [链接](https://www.modelscope.cn/models/cialtion/SimpleTool/tree/master/RT-Qwen2.5-14B-AWQ) |
|
| 119 |
| RT-Qwen3-30B-A3B-AWQ | 30B(A3B) | ~ | [🤗](https://huggingface.co/Cialtion/SimpleTool/tree/main/RT-Qwen3-30B_awq_w4a16) | [链接](https://www.modelscope.cn/models/cialtion/SimpleTool/tree/master/RT-Qwen3-30B_awq_w4a16) |
|
| 120 |
|
| 121 |
-
> 延迟数据在 RTX 4090 上使用 vLLM 前缀缓存测得。v2 模型采用改进的提示
|
| 122 |
|
| 123 |
</details>
|
| 124 |
|
|
|
|
| 118 |
| RT-Qwen2.5-14B-AWQ | 14B | ~130ms | [🤗](https://huggingface.co/Cialtion/SimpleTool/tree/main/RT-Qwen2.5-14B-AWQ) | [链接](https://www.modelscope.cn/models/cialtion/SimpleTool/tree/master/RT-Qwen2.5-14B-AWQ) |
|
| 119 |
| RT-Qwen3-30B-A3B-AWQ | 30B(A3B) | ~ | [🤗](https://huggingface.co/Cialtion/SimpleTool/tree/main/RT-Qwen3-30B_awq_w4a16) | [链接](https://www.modelscope.cn/models/cialtion/SimpleTool/tree/master/RT-Qwen3-30B_awq_w4a16) |
|
| 120 |
|
| 121 |
+
> 延迟数据在 RTX 4090 上使用 vLLM 前缀缓存测得。v2 模型采用改进的更干净的提示词;v1 模型使用之前的多头指令头。
|
| 122 |
|
| 123 |
</details>
|
| 124 |
|