Update README.md

README.md (CHANGED)

@@ -17,7 +17,8 @@ base_model_relation: quantized
 Base model: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
 
 <i>This model is quantized to 4-bit with a group size of 128.</i>
-<
+<br>
+<i>Compared to earlier quantized versions, this quantized model achieves better tokens/s throughput. The improvement comes from setting desc_act=False in the quantization configuration.</i>
 
 ```
 vllm serve JunHowie/Qwen3-0.6B-GPTQ-Int4
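
For context, desc_act is a field in the GPTQ quantization config that ships with the model (the quantization_config section of config.json). A minimal sketch of the relevant fields is below; only bits, group_size, and desc_act come from the text above, and the quant_method field is an assumption based on common Transformers/AutoGPTQ naming:

```
{
  "bits": 4,
  "group_size": 128,
  "desc_act": false,
  "quant_method": "gptq"
}
```

With desc_act=False, weights are quantized in their original column order rather than by descending activation magnitude, so inference kernels can skip the extra index reordering that act-order quantization requires, which is consistent with the tokens/s improvement this commit describes.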