QuantTrio/GLM-5-AWQ

Text Generation

4-bit precision


JunHowie committed 10 days ago

Commit a838c55 (verified) · 1 parent: 98c1be3

Update README.md

Files changed (1)

README.md +5 -5

@@ -6,12 +6,12 @@ tags:
 - vLLM
 - AWQ
 base_model:
-  - ZhipuAI/GLM-5
 base_model_relation: quantized
 ---
 # GLM-5-AWQ
-Base model: [ZhipuAI/GLM-5](https://www.modelscope.cn/models/ZhipuAI/GLM-5)
 This repo quantizes the model using data-free quantization (no calibration dataset required).
@@ -47,7 +47,7 @@ export VLLM_USE_FLASHINFER_SAMPLER=0
 export OMP_NUM_THREADS=4
 vllm serve \
-    __YOUR_PATH__/tclf90/GLM-5-AWQ \
     --served-model-name MY_MODEL \
     --swap-space 16 \
     --max-num-seqs 32 \
@@ -77,8 +77,8 @@ vllm serve \
 ### 【Model Download】
 ```python
-from modelscope import snapshot_download
-snapshot_download('tclf90/GLM-5-AWQ', cache_dir="your_local_path")
 ```
 ### 【Overview】

 - vLLM
 - AWQ
 base_model:
+  - zai-org/GLM-5
 base_model_relation: quantized
 ---
 # GLM-5-AWQ
+Base model: [zai-org/GLM-5](https://huggingface.co/zai-org/GLM-5)
 This repo quantizes the model using data-free quantization (no calibration dataset required).
 export OMP_NUM_THREADS=4
 vllm serve \
+    __YOUR_PATH__/QuantTrio/GLM-5-AWQ \
     --served-model-name MY_MODEL \
     --swap-space 16 \
     --max-num-seqs 32 \
 ### 【Model Download】
 ```python
+from huggingface_hub import snapshot_download
+snapshot_download('QuantTrio/GLM-5-AWQ', cache_dir="your_local_path")
 ```
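
The updated download snippet passes `cache_dir`, which, as far as `huggingface_hub` behaves, stores files in the hub cache layout under that directory rather than directly at `your_local_path/QuantTrio/GLM-5-AWQ`. A minimal sketch of one way to get a directory that lines up with the `__YOUR_PATH__/QuantTrio/GLM-5-AWQ` argument of `vllm serve` is shown below. It assumes the `local_dir` parameter of `snapshot_download`; the helper names `local_model_dir` and `download_model` and the base path `/data/models` are hypothetical and not part of the README.

```python
from pathlib import Path

REPO_ID = "QuantTrio/GLM-5-AWQ"  # repo id taken from the updated README


def local_model_dir(base_dir, repo_id=REPO_ID):
    """Return base_dir/<org>/<name>, mirroring __YOUR_PATH__/QuantTrio/GLM-5-AWQ."""
    return Path(base_dir) / repo_id


def download_model(base_dir, repo_id=REPO_ID):
    """Download the repo into local_model_dir(base_dir) and return that path.

    Hypothetical helper: huggingface_hub is imported lazily so that the path
    logic above can be used without the package installed.
    """
    from huggingface_hub import snapshot_download

    target = local_model_dir(base_dir, repo_id)
    # local_dir places the repo files directly in `target`
    # instead of the nested cache layout used with cache_dir.
    return snapshot_download(repo_id=repo_id, local_dir=str(target))


if __name__ == "__main__":
    # "/data/models" is a placeholder for __YOUR_PATH__.
    print(local_model_dir("/data/models"))
```

Calling `download_model("/data/models")` would then return a path suitable for the first argument of `vllm serve`. If the README's `cache_dir` form is kept instead, the value returned by `snapshot_download` is the snapshot directory to serve from.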
 ### 【Overview】