JunHowie committed on
Commit b9d8700 · verified · 1 Parent(s): af0b13a

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -17,7 +17,8 @@ base_model_relation: quantized
 Base model: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
 
 <i>This model is quantized to 4-bit with a group size of 128.</i>
-<i>Compared to earlier quantized versions, the new quantized model achieves better performance in tokens/s efficiency.</i>
+<br>
+<i>Compared to earlier quantized versions, the new quantized model demonstrates better tokens/s efficiency. This improvement comes from setting desc_act=False in the quantization configuration.</i>
 
 ```
 vllm serve JunHowie/Qwen3-0.6B-GPTQ-Int4
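
For context, the desc_act setting referenced in the diff lives in the model's GPTQ quantization config. A minimal sketch of such a config using the transformers GPTQConfig class; only bits=4, group_size=128, and desc_act=False are confirmed by the diff, the rest is illustrative:

```python
# Sketch: a GPTQ quantization config matching the settings described above.
# desc_act=False skips activation-order (act-order) reordering of columns,
# which typically trades a little accuracy for faster inference kernels.
from transformers import GPTQConfig  # requires the transformers package

quant_config = GPTQConfig(
    bits=4,          # 4-bit weights, as stated in the README
    group_size=128,  # group size of 128, as stated in the README
    desc_act=False,  # the change this commit documents
)
```

Serving-side tools such as vLLM read the equivalent fields from the checkpoint's quantization config, so no extra flag is needed in the `vllm serve` command shown above.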