</p>
Language: [中文](https://huggingface.co/YCWTG/Qwen3-Coder-Next-int2-mixed-AutoRound/blob/main/README_zh.md) | English
## Model Details
This model is a **mixed-bit INT2 quantized** version of [Qwen/Qwen3-Coder-Next](https://huggingface.co/Qwen/Qwen3-Coder-Next), generated by [intel/auto-round](https://github.com/intel/auto-round) with group_size 512 and symmetric quantization. Please follow the license of the original model.
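As a rough illustration of what symmetric group-wise quantization means here (a toy sketch, not AutoRound's actual algorithm), each group of weights shares a single scale, and every weight is rounded to a small signed integer range:

```python
def quantize_group(weights, bits=2):
    """Toy symmetric per-group quantization: one shared scale per group,
    integers clamped to the signed range of the given bit width."""
    qmax = 2 ** (bits - 1) - 1       # 1 for INT2
    qmin = -(2 ** (bits - 1))        # -2 for INT2
    scale = max(abs(w) for w in weights) / qmax or 1.0  # guard against all-zero groups
    ints = [min(max(round(w / scale), qmin), qmax) for w in weights]
    dequantized = [q * scale for q in ints]
    return ints, scale, dequantized

ints, scale, deq = quantize_group([0.4, -0.9, 0.05, 0.31])
```

In the real model each group holds 512 weights, and AutoRound additionally tunes the rounding and keeps some layers (such as `lm_head` above) at higher precision, which is why the card says "mixed-bit".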
| lm_head | Original | Excluded by AutoRound |
### Model Size
- **Original BF16**: ~160 GB
- **Mixed INT2**: ~25 GB (**~84% smaller**)
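The quoted saving follows directly from the two sizes above:

```python
original_gb, quantized_gb = 160, 25
reduction = 1 - quantized_gb / original_gb  # 0.84375
print(f"~{reduction:.0%} smaller")
```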
if __name__ == "__main__":
    model, tokenizer = load_model()
    chat_loop(model, tokenizer)
```
)
output_dir = "~/.cache/model/Qwen3-Coder-Next-int2-mixed-AutoRound"
autoround.quantize_and_save(output_dir, format="auto_round")
```
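A back-of-the-envelope estimate (assuming one 16-bit scale per group of 512 weights; the real on-disk format stores extra metadata and mixes bit widths across layers) shows why group_size 512 adds almost no overhead on top of the 2-bit payload:

```python
bits, group_size, scale_bits = 2, 512, 16
bits_per_weight = bits + scale_bits / group_size
print(bits_per_weight)  # 2.03125 bits per weight, vs 16 for BF16
```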
## Ethical Considerations and Limitations