Update README.md

Base model: [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)

<i>Compared to earlier quantized versions, the new quantized model demonstrates better tokens/s efficiency. This improvement comes from setting desc_act=False in the quantization configuration.</i>
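The `desc_act` flag mentioned above lives in the model's GPTQ quantization configuration (commonly shipped as `quantize_config.json` in the repo). A minimal sketch of reading such a config with the standard library; the field values here are illustrative assumptions, not copied from this repo:

```python
import json

# Hypothetical excerpt of a GPTQ quantize_config.json; field names follow
# the common GPTQ convention, values are assumed for illustration.
quantize_config = json.loads("""
{
  "bits": 8,
  "group_size": 128,
  "desc_act": false,
  "sym": true
}
""")

# desc_act=False skips activation-order reordering during quantization,
# which enables faster inference kernels at a small accuracy cost.
print(quantize_config["desc_act"])  # prints: False
```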

```
vllm serve JunHowie/Qwen3-8B-GPTQ-Int8
```

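Once the model is serving, vLLM exposes an OpenAI-compatible API. A usage sketch, assuming the default port 8000 and an illustrative prompt:

```shell
# Query the OpenAI-compatible chat endpoint exposed by `vllm serve`
# (default port 8000 is an assumption; adjust with --port if changed).
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "JunHowie/Qwen3-8B-GPTQ-Int8",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```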
### 【Dependencies】