Update README.md
README.md CHANGED

@@ -10,7 +10,7 @@ base_model:
 base_model_relation: quantized
 ---
 # Step3-VL-10B-AWQ
-Base model: [stepfun-ai/Step3-VL-10B](https://
+Base model: [stepfun-ai/Step3-VL-10B](https://huggingface.co/stepfun-ai/Step3-VL-10B)
 
 I added a small injection at the end of the original `chat_template.jinja` to support a GLM-like switch: you can try to disable the “thinking/reasoning” mode in your chat completion request via `"chat_template_kwargs": {"enable_thinking": False}`.
 
@@ -66,8 +66,8 @@ vllm serve \
 
 ### 【Model Download】
 ```python
-from
-snapshot_download('
+from huggingface_hub import snapshot_download
+snapshot_download('QuantTrio/Step3-VL-10B-AWQ', cache_dir="your_local_path")
 ```
 
 ### 【Overview】
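For reference, the `enable_thinking` switch mentioned in the diff can be passed through the OpenAI-compatible endpoint that `vllm serve` exposes. A minimal sketch, assuming the server is running locally on the default port and serves the model under its repo name (the base URL and served model name are assumptions, not taken from this commit):

```python
# Sketch: call a vLLM OpenAI-compatible server and pass chat_template_kwargs
# via extra_body to disable the "thinking/reasoning" mode injected into the
# chat template. base_url and model name are assumptions; adjust them to
# match your own `vllm serve` invocation.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="QuantTrio/Step3-VL-10B-AWQ",  # assumed served model name
    messages=[{"role": "user", "content": "Describe AWQ quantization in one sentence."}],
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},  # the GLM-like switch
)
print(resp.choices[0].message.content)
```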