Instructions for using zai-org/GLM-5.2-FP8 with libraries, inference providers, notebooks, and local apps.

Libraries

How to use zai-org/GLM-5.2-FP8 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="zai-org/GLM-5.2-FP8")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("zai-org/GLM-5.2-FP8")
model = AutoModelForCausalLM.from_pretrained("zai-org/GLM-5.2-FP8", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
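
The last line above decodes only the newly generated tokens: for a causal language model, `generate` returns the prompt tokens followed by the continuation, so the slice drops the first `inputs["input_ids"].shape[-1]` positions. A minimal sketch of that slicing, with plain Python lists standing in for tensors (the helper name `strip_prompt` and the token IDs are made up for illustration):

```python
def strip_prompt(output_ids, prompt_len):
    """Return only the tokens generated after the prompt."""
    return output_ids[prompt_len:]


prompt_ids = [101, 7592, 2088]          # hypothetical prompt token IDs
output_ids = prompt_ids + [2023, 2003]  # generate() output: prompt followed by new tokens
new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)
```

Without the slice, the decoded string would repeat the chat-formatted prompt before the model's answer.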

Local Apps

vLLM

How to use zai-org/GLM-5.2-FP8 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "zai-org/GLM-5.2-FP8"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "zai-org/GLM-5.2-FP8",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
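
The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above; the helper names `build_chat_request` and `send` are illustrative, and the base URL assumes the server is listening on the `localhost:8000` address used in the curl example.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_chat_request(
    "http://localhost:8000", "zai-org/GLM-5.2-FP8", "What is the capital of France?"
)
# Uncomment once `vllm serve` is running:
# print(send(request))
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made.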

Use Docker

docker model run hf.co/zai-org/GLM-5.2-FP8

SGLang

How to use zai-org/GLM-5.2-FP8 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "zai-org/GLM-5.2-FP8" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "zai-org/GLM-5.2-FP8",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
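
The curl call returns a JSON document. Assuming the server follows the OpenAI chat completion response shape (a `choices` list whose entries carry a `message` with `content`), the assistant text can be pulled out as sketched below. The helper name `extract_reply` is illustrative, and `sample_response` is a hand-written stand-in, not real model output.

```python
def extract_reply(response):
    """Return the assistant text from a chat completion response dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample in the OpenAI chat completion shape, not real model output.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
}
print(extract_reply(sample_response))
```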

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "zai-org/GLM-5.2-FP8" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "zai-org/GLM-5.2-FP8",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use zai-org/GLM-5.2-FP8 with Docker Model Runner:

```
docker model run hf.co/zai-org/GLM-5.2-FP8
```

GLM-5.2-FP8

Commit History

add moe_router_dtype config ba978f7 (zRzRzRzRzRzRzR committed on Jul 2)

add Footnote 70311cf (zRzRzRzRzRzRzR committed on Jun 23)

Merge branch 'main' of hf.co:zai-org/GLM-5.2-FP8 31cba24 (zRzRzRzRzRzRzR committed on Jun 19)

add unsloth 73d6d18 (zRzRzRzRzRzRzR committed on Jun 19)

Update README.md a0b55e8 verified (davidlvxin committed on Jun 17)

update with Ascend support 3722e20 (zRzRzRzRzRzRzR committed on Jun 17)

update with Ascend support 0e60b39 (zRzRzRzRzRzRzR committed on Jun 17)

update readme 7d81a3f (zRzRzRzRzRzRzR committed on Jun 16)

update readme f51f869 (zRzRzRzRzRzRzR committed on Jun 16)

Add files using upload-large-folder tool fe9b63b verified (ZHANGYUXUAN-zR committed on Jun 16)

init readme e7c4515 (zRzRzRzRzRzRzR committed on Jun 16)

Delete .tmpUMarya 974885c verified (ZHANGYUXUAN-zR committed on Jun 16)

Add files using upload-large-folder tool 587d728 verified (ZHANGYUXUAN-zR committed on Jun 16)

Add files using upload-large-folder tool cbfc9d2 verified (ZHANGYUXUAN-zR committed on Jun 16)

Add files using upload-large-folder tool 9d36295 verified (ZHANGYUXUAN-zR committed on Jun 16)

initial commit e9ac0db verified (ZHANGYUXUAN-zR committed on Jun 16)
