Text Generation
Transformers
Safetensors
glm_moe_dsa
vLLM
compressed-tensors
INT4
INT8
W4A16
W8A16
conversational
Instructions to use QuantTrio/GLM-5.2-Int4-Int8Mix with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use QuantTrio/GLM-5.2-Int4-Int8Mix with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="QuantTrio/GLM-5.2-Int4-Int8Mix")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("QuantTrio/GLM-5.2-Int4-Int8Mix")
model = AutoModelForCausalLM.from_pretrained("QuantTrio/GLM-5.2-Int4-Int8Mix")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use QuantTrio/GLM-5.2-Int4-Int8Mix with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "QuantTrio/GLM-5.2-Int4-Int8Mix"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "QuantTrio/GLM-5.2-Int4-Int8Mix",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/QuantTrio/GLM-5.2-Int4-Int8Mix
- SGLang
How to use QuantTrio/GLM-5.2-Int4-Int8Mix with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "QuantTrio/GLM-5.2-Int4-Int8Mix" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "QuantTrio/GLM-5.2-Int4-Int8Mix",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "QuantTrio/GLM-5.2-Int4-Int8Mix" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "QuantTrio/GLM-5.2-Int4-Int8Mix",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use QuantTrio/GLM-5.2-Int4-Int8Mix with Docker Model Runner:
docker model run hf.co/QuantTrio/GLM-5.2-Int4-Int8Mix
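The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and the response shape (`choices[0].message.content`) is assumed from the OpenAI chat-completions convention that both servers advertise compatibility with.

```python
import json
import urllib.request

MODEL_ID = "QuantTrio/GLM-5.2-Int4-Int8Mix"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000", timeout=60):
    """Send the request and return the first choice's text.

    Assumes an OpenAI-style response body; adjust if the server differs.
    """
    req = build_chat_request(prompt, base_url=base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With a vLLM server running on its default port, `chat("What is the capital of France?")` would send the same request as the curl example; for the SGLang commands shown above, pass `base_url="http://localhost:30000"`.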
Add files using upload-large-folder tool
- .gitattributes +1 -0
- model-00001-of-00124.safetensors +3 -0
- model-00006-of-00124.safetensors +3 -0
- model-00007-of-00124.safetensors +3 -0
- model-00009-of-00124.safetensors +3 -0
- model-00010-of-00124.safetensors +3 -0
- model-00011-of-00124.safetensors +3 -0
- model-00012-of-00124.safetensors +3 -0
- model-00013-of-00124.safetensors +3 -0
- model-00014-of-00124.safetensors +3 -0
- model-00016-of-00124.safetensors +3 -0
- model-00017-of-00124.safetensors +3 -0
- model-00018-of-00124.safetensors +3 -0
- model-00020-of-00124.safetensors +3 -0
- model-00021-of-00124.safetensors +3 -0
- model-00023-of-00124.safetensors +3 -0
- model-00038-of-00124.safetensors +3 -0
- model-00040-of-00124.safetensors +3 -0
- model-00042-of-00124.safetensors +3 -0
- model-00043-of-00124.safetensors +3 -0
- model-00044-of-00124.safetensors +3 -0
- model-00045-of-00124.safetensors +3 -0
- model-00046-of-00124.safetensors +3 -0
- model-00058-of-00124.safetensors +3 -0
- model-00068-of-00124.safetensors +3 -0
- model-00070-of-00124.safetensors +3 -0
- model-00071-of-00124.safetensors +3 -0
- model-00073-of-00124.safetensors +3 -0
- model-00074-of-00124.safetensors +3 -0
- model-00082-of-00124.safetensors +3 -0
- model-00085-of-00124.safetensors +3 -0
- model-00086-of-00124.safetensors +3 -0
- model-00099-of-00124.safetensors +3 -0
- model-00108-of-00124.safetensors +3 -0
- model-00109-of-00124.safetensors +3 -0
- model-00113-of-00124.safetensors +3 -0
- model-00116-of-00124.safetensors +3 -0
- model-00120-of-00124.safetensors +3 -0
- model-00123-of-00124.safetensors +3 -0
- model-00124-of-00124.safetensors +3 -0
- model.safetensors.index.json +3 -0
- mtp-00003-of-00004.safetensors +3 -0
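The file list above covers only a subset of the 124 `model-*` shards, so it can be handy to check which shard indices a given set of filenames is missing. The sketch below parses the `<prefix>-<index>-of-<total>.safetensors` naming seen in the list; the helper names `parse_shard` and `missing_indices` are hypothetical, and whether the remaining shards arrive in other commits is an assumption this page does not confirm.

```python
import re

# Matches names such as "model-00001-of-00124.safetensors" or
# "mtp-00003-of-00004.safetensors" (pattern inferred from the file list).
SHARD_RE = re.compile(
    r"^(?P<prefix>[A-Za-z]+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.safetensors$"
)


def parse_shard(name):
    """Return (prefix, index, total) for a shard filename, else None."""
    m = SHARD_RE.match(name)
    if m is None:
        return None
    return m["prefix"], int(m["index"]), int(m["total"])


def missing_indices(names, prefix="model"):
    """List the 1-based shard indices absent from `names` for one prefix."""
    parsed = [p for p in map(parse_shard, names) if p and p[0] == prefix]
    if not parsed:
        return []
    total = parsed[0][2]
    present = {index for _, index, _ in parsed}
    return [i for i in range(1, total + 1) if i not in present]
```

Feeding the filenames from this commit to `missing_indices` would list the shard numbers that this commit does not touch.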
.gitattributes
CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
+model.safetensors.index.json filter=lfs diff=lfs merge=lfs -text
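The added `.gitattributes` rule routes `model.safetensors.index.json` through Git LFS, which is why that file appears in this commit as a three-line pointer rather than as JSON. As a rough illustration of how such rules select files, here is a sketch that checks a path against the slash-free patterns visible in the hunk. It is an approximation, not Git's matcher: it assumes that a pattern without a slash is compared against the file's basename, and it ignores directory patterns such as `saved_model/**/*`. The name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Slash-free patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = [
    "*.zst",
    "*tfevents*",
    "tokenizer.json",
    "model.safetensors.index.json",
]


def is_lfs_tracked(path, patterns=LFS_PATTERNS):
    """Approximate check: does the basename match any slash-free pattern?"""
    name = basename(path)
    return any(fnmatchcase(name, pattern) for pattern in patterns)
```

For an authoritative answer inside a real checkout, `git check-attr filter -- <path>` reports the attribute Git itself resolves.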
model-00001-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd26ce41a46adc0d601ad37961fd7ef5aa2c7028a5f590931e0fdd1b7dbb6ac7
+size 1903165568
model-00006-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:922b29787aa56609b3a374764cbb9b951578b6fdea69f9b0f6d802271e916984
+size 3216415536
model-00007-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0622bd62d84c2010df1b89417491863e7d25b4297ed2b6a37091c172776fa691
+size 3216414984
model-00009-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:23595766a8d6de10ac8fe6d955c9ede054658b3c525b00b1232567bd98c39bd2
+size 3216415160
model-00010-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1c0274c8f655a73ea1af33d577156250f35b106f016739dce57bbbdfa40b4ddb
+size 3212975680
model-00011-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1d3b31bc34f7a1d86479b9362d111278084358dddd9cf86579394cb091fccee9
+size 3220990504
model-00012-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ce50c5f4289141f1b98800bb82a714f2f0475ac0b276e8c5a5921afd0b7797e4
+size 3216414984
model-00013-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f6c0f198ed477a4310bd36cf87809c974eb24c1591d7c3677e1acf606d7c8f4d
+size 3218271688
model-00014-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2baecf730f2cf00c1de5b12af924321d0c519d490df52a05ec78b061a0d6ee47
+size 3216415408
model-00016-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e4c57ebef2c29bd4b6cb05c8c33e96bbd73116f0f8bc3a0c0fef8be77907e8d4
+size 3218272024
model-00017-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c6347fd3c9a05039bb49ec477d3b34e523d21ecde2ee880290a2a47cd00cbc60
+size 3215694112
model-00018-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f053aba3de63a95eacda3f6152ee4728b25c61d7b94b242e33cf7c592b6a34a1
+size 3218271464
model-00020-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d32bd29d87e1574ac00b656ecb9c561f83e4b58fa116705e97dce95fb11809bc
+size 3216414984
model-00021-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fadab5e897eb73ddcacc4a4b7cffb91676d479edf8f67708b51f606276890399
+size 3218271992
model-00023-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d5d710aea5b8d924b456526c6cf7ff2e4b633bce8432e7b94bc46119a1e93c81
+size 3218271424
model-00038-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ef900d776fc0773ebb54af9a6514994cd39db79ec87e23372b29d0726f3041a
+size 3215694048
model-00040-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:453423de73d13a9fc6d26b345633fbed8e41f456913d0eb98de17cc562b4c663
+size 3216415424
model-00042-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:162b0fc3bb6996f4a6335def564aabb7db87629182f29926b11b779cf5b879fc
+size 3218272024
model-00043-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b239cbdacc80b5be9f0ac7c7c7122cc68ccd9f64e0e7884aa3b260601ccc38ce
+size 3216415048
model-00044-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c905c360996e1900062a020b673d7f68bca70fab4c9309aa8a2f49062386cc69
+size 3218271440
model-00045-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:625969e5f88093587eb01a82219d1bc9a0e7631af571a13866950aad9cd216a1
+size 3215694720
model-00046-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c1fa69af8e77b8c2a73977abebeada6ceabeb014a72828911d1ecc22aff80c05
+size 3216414984
model-00058-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c25b7c5f23f639ab3f0a45cbbc58720ee687597940fc89648a4b2577587d5492
+size 3216808504
model-00068-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c9cb6d7e0c40af460807f82f929f7c08a50c1235caecf3e4b243aff3f1007dd
+size 3218271824
model-00070-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c79686bb277b43a4073f74f90e6b4de3de6a14dc027fea0915a2b5efc3e10f31
+size 3216414976
model-00071-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d84986d725ccf18aff6e4996ab2bde3d412b9be981c23f42e85985939687ab96
+size 3218270584
model-00073-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bc515844a7208c114c3d077c20e24e9dc1f7893b4865197a663bd976ea566071
+size 3218271592
model-00074-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:418324c5c43c506a3a731b2bfd7296e9492ed80cf063a212772c9315a3601d4d
+size 3215694576
model-00082-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c674a1ebd564d236023bb9216ffde195b0b6676bcd34c2319ada9de9ad0b2ff5
+size 3216415384
model-00085-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0043a859662528c2c8a369e32ac90499f745581357fde0b4db2109fda0d22073
+size 3216415008
model-00086-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:32b878c8a5c9c5be56bddc59cb25f00211215048910268664ba933b0be2aaa1d
+size 3218271488
model-00099-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cbe399fc941d8b9aa021b1924c808cb36569d8f523b0677dd19c05472adfcb2f
+size 3218271392
model-00108-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cab784431aa161e52d69b795c421d82ace34d8ca4b184ce35d7a9a23867726bf
+size 3216415392
model-00109-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8778256422d88ab1aaf8b9f2399ef32f88e36ba85174db51c2df4933920b59a1
+size 3215694056
model-00113-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fe0270a25af0f472d6d10b53b7d0845f40d002aa1e54ed67b34fd0c9b760034a
+size 3218149832
model-00116-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7b322366748c17f54b61c211c64ac64292d109583e41c13d82b5d942c47f4e53
+size 3215694496
model-00120-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1396b9e17ab3afdd4dbceb7a2f018b9a6cf051ab67eb84975f732eddee8b0e13
+size 3218271440
model-00123-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:567f992389679120d1cea62eae1b68f6b010b15d13b0b12508685eb4876a44b5
+size 3218270296
model-00124-of-00124.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4ca485c01f0b86794a53fb54c5d61b42e3145ecb549ef81d0bc817c38bdf19ab
+size 1146598432
model.safetensors.index.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:43298345833417b1ad2a8b76d012a83d4f2275d532e5ab38e118566f1ac7b12b
+size 17236698
mtp-00003-of-00004.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f508c90c88dc7f94c13464f8f10cb3dde47e44f498ca0c11b4e393c6482de1cc
+size 3210481440
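Each file added in this commit is a Git LFS pointer: three `key value` lines giving the spec version, a SHA-256 object ID, and the byte size of the real file. A downloaded shard can be checked against its pointer with a short script. The sketch below is one way to do that with the standard library; `parse_lfs_pointer` and `verify_file` are hypothetical helper names, and the example pointer text is copied from the `model-00001-of-00124.safetensors` entry above.

```python
import hashlib

# Pointer contents for model-00001-of-00124.safetensors, as shown in the diff.
EXAMPLE_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:bd26ce41a46adc0d601ad37961fd7ef5aa2c7028a5f590931e0fdd1b7dbb6ac7
size 1903165568
"""


def parse_lfs_pointer(text):
    """Parse 'key value' pointer lines into a dict with typed fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and hash both match the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Stream in chunks so multi-gigabyte shards never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

Checking the size first would be a cheap shortcut for obviously truncated downloads; the sketch compares both in one pass to keep the logic short.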