Intel/GLM-5-int4-mixed-AutoRound


wenhuach committed on Feb 25

Commit 1de00d7 · verified · 1 Parent(s): fc62d05

Update README.md

Files changed (1)

README.md +6 -4


@@ -8,15 +8,17 @@ tags:
 This model is a mixed int4 model with group_size 128 and asymmetric quantization of [zai-org/GLM-5](https://huggingface.co/zai-org/GLM-5/) generated by [intel/auto-round](https://github.com/intel/auto-round). Please follow the license of the original model.
-### vllm inference
- pip install git+https://github.com/vllm-project/vllm.git@main
-```
+**The model is quantized with pure RTN mode**
+### vllm inference
+**Setup**
+~~~bash
+pip install git+https://github.com/vllm-project/vllm.git@main
 pip install git+https://github.com/huggingface/transformers.git
-```
+~~~bash
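
For readers of this change, the terms in the README description (int4, group_size 128, asymmetric, RTN) can be illustrated with a minimal sketch. This is a generic round-to-nearest example in plain Python, not auto-round's actual implementation. The function names (`quantize_group`, `dequantize_group`, `quantize_rtn`) are hypothetical, and extending each group's range to include zero is an assumption made here so the zero point always fits in the 4-bit range.

```python
import math


def quantize_group(values, bits=4):
    """Asymmetric round-to-nearest quantization of one group of floats.

    Returns (codes, scale, zero_point), with codes in [0, 2**bits - 1].
    """
    qmax = (1 << bits) - 1  # 15 for int4
    # Assumption: include 0.0 in the range so zero_point lands in [0, qmax].
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    # RTN: round each value to the nearest grid point, then clamp to the range.
    codes = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize_group(codes, scale, zero_point):
    """Map integer codes back to approximate float values."""
    return [(c - zero_point) * scale for c in codes]


def quantize_rtn(weights, group_size=128, bits=4):
    """Quantize a flat list of weights group by group.

    Each group of `group_size` values gets its own scale and zero point.
    """
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


# Illustrative data only: 256 synthetic "weights" form two groups of 128.
weights = [0.05 * math.sin(0.37 * i) for i in range(256)]
groups = quantize_rtn(weights)
codes, scale, zero_point = groups[0]
restored = dequantize_group(codes, scale, zero_point)
print(len(groups), scale, zero_point, restored[:3])
```

Each group stores 4-bit codes plus one scale and one zero point, which is why a smaller group size costs more metadata but tracks the local weight range more closely. RTN applies no calibration or tuning step: values are simply rounded to the nearest grid point.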