Instructions to use zai-org/GLM-4.1V-9B-Thinking with libraries, inference providers, notebooks, and local apps.

Libraries

How to use zai-org/GLM-4.1V-9B-Thinking with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="zai-org/GLM-4.1V-9B-Thinking")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("zai-org/GLM-4.1V-9B-Thinking")
model = AutoModelForMultimodalLM.from_pretrained("zai-org/GLM-4.1V-9B-Thinking")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use zai-org/GLM-4.1V-9B-Thinking with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "zai-org/GLM-4.1V-9B-Thinking"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "zai-org/GLM-4.1V-9B-Thinking",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
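
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `localhost:8000` and returns the usual OpenAI-style response shape. The helper names `build_payload` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL = "zai-org/GLM-4.1V-9B-Thinking"
# Assumption: same host and port as the curl call above.
URL = "http://localhost:8000/v1/chat/completions"


def build_payload(prompt: str, image_url: str) -> dict:
    """Build the same chat-completions body that the curl example sends."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(prompt: str, image_url: str, url: str = URL) -> str:
    """POST the payload and return the text of the first choice."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires a running server; the image URL is the one from the curl example.
    print(chat(
        "Describe this image in one sentence.",
        "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
    ))
```

Pointing `url` at port 30000 should work the same way against the SGLang server described further down, since both expose the same endpoint path.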

Use Docker

docker model run hf.co/zai-org/GLM-4.1V-9B-Thinking

SGLang

How to use zai-org/GLM-4.1V-9B-Thinking with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "zai-org/GLM-4.1V-9B-Thinking" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "zai-org/GLM-4.1V-9B-Thinking",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "zai-org/GLM-4.1V-9B-Thinking" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "zai-org/GLM-4.1V-9B-Thinking",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use zai-org/GLM-4.1V-9B-Thinking with Docker Model Runner:
```
docker model run hf.co/zai-org/GLM-4.1V-9B-Thinking
```

Is GLM4V officially supported by vLLM?

by ReaperWL19 - opened Jul 2, 2025

Discussion

ReaperWL19

Jul 2, 2025

Hi,
I'm trying to load GLM4V using vLLM, but it crashes with:
TypeError: arange() received an invalid combination of arguments - got (int, Glm4vConfig, int, dtype=torch.dtype)
The issue occurs in:

inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float) / dim))

Where dim seems to be the full Glm4vConfig instead of an int like config.hidden_size.
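
To illustrate the failure mode, here is a pure-Python analogue of that line (a sketch, not the actual vLLM or Transformers code). `FakeConfig` and `inv_freq` are hypothetical names; `range` stands in for `torch.arange`, which likewise rejects a non-integer `dim`.

```python
from numbers import Integral


class FakeConfig:
    """Hypothetical stand-in for a config object such as Glm4vConfig."""
    hidden_size = 128


def inv_freq(dim, theta=10000.0):
    """Pure-Python analogue of 1.0 / (theta ** (arange(0, dim, 2) / dim))."""
    if not isinstance(dim, Integral):
        # Same class of error as in the report: a config object where an int is expected.
        raise TypeError(f"dim must be an int, got {type(dim).__name__}")
    return [1.0 / (theta ** (i / dim)) for i in range(0, dim, 2)]


# Passing an int works and yields dim / 2 frequencies.
frequencies = inv_freq(8)

# Passing the whole config object fails, as in the error above.
try:
    inv_freq(FakeConfig())
except TypeError as exc:
    message = str(exc)
```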

Is GLM4V officially supported in vLLM? If not, is there a recommended workaround?

Thanks!

ZHANGYUXUAN-zR

Z.ai org Jul 3, 2025

edited Jul 3, 2025

Please install vLLM from source (main branch), not from any release version.

yuchenxie

Jul 3, 2025

I am using vLLM v0.9.1 and I am getting implementation errors:

Process EngineCore_0:                                                                                                                                                                                                                                                                                                                                                     
Traceback (most recent call last):                                                                                                                                                                                                                                                                                                                                        
  File "/root/anaconda3/envs/trainer/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap                                                                                                                                                                                                                                                                  
    self.run()                                                                                                                                                                                                                                                                                                                                                            
  File "/root/anaconda3/envs/trainer/lib/python3.11/multiprocessing/process.py", line 108, in run                                                                                                                                                                                                                                                                         
    self._target(*self._args, **self._kwargs)                                                                                                                                                                                                                                                                                                                             
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/v1/engine/core.py", line 519, in run_engine_core                                                                                                                                                                                                                                                   
    raise e                                                                                                                                                                                                                                                                                                                                                               
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/v1/engine/core.py", line 506, in run_engine_core                                                                                                                                                                                                                                                   
    engine_core = EngineCoreProc(*args, **kwargs)                                                                                                                                                                                                                                                                                                                         
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                                         
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/v1/engine/core.py", line 390, in __init__                                                                                                                                                                                                                                                          
    super().__init__(vllm_config, executor_class, log_stats,                                                                                                                                                                                                                                                                                                              
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/v1/engine/core.py", line 76, in __init__                                                                                                                                                                                                                                                           
    self.model_executor = executor_class(vllm_config)                                                                                                                                                                                                                                                                                                                     
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                                     
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 53, in __init__                                                                                                                                                                                                                                                   
    self._init_executor()                                                                                                                                                                                                                                                                                                                                                 
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/executor/uniproc_executor.py", line 48, in _init_executor                                                                                                                                                                                                                                          
    self.collective_rpc("load_model")                                                                                                                                                                                                                                                                                                                                     
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/executor/uniproc_executor.py", line 57, in collective_rpc                                                                                                                                                                                                                                          
    answer = run_method(self.driver_worker, method, args, kwargs)                                                                                                                                                                                                                                                                                                         
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                         
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/utils.py", line 2671, in run_method                                                                                                                                                                                                                                                                
    return func(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                          
           ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                                                          
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/v1/worker/gpu_worker.py", line 180, in load_model                                                                                                                                                                                                                                                  
    self.model_runner.load_model()                                                                                                                                                                                                                                                                                                                                        
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1601, in load_model                                                                                                                                                                                                                                           
    self.model = model_loader.load_model(                                                                                                                                                                                                                                                                                                                                 
                 ^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                                                 
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/model_executor/model_loader/base_loader.py", line 38, in load_model                                                                                                                                                                                                                                
    model = initialize_model(vllm_config=vllm_config,                                                                                                                                                                                                                                                                                                                     
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                                     
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/model_executor/model_loader/utils.py", line 62, in initialize_model                                                                                                                                                                                                                                
    return model_class(vllm_config=vllm_config, prefix=prefix)                                                                                                                                                                                                                                                                                                            
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                            
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/compilation/decorators.py", line 152, in __init__                                                                                                                                                                                                                                                  
    old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)                                                                                                                                                                                                                                                                                                      
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/model_executor/models/transformers.py", line 443, in __init__                                                                                                                                                                                                                                      
    self.model = TransformersModel(vllm_config=vllm_config, prefix=prefix)                                                                                                                                                                                                                                                                                                
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/model_executor/models/transformers.py", line 205, in __init__                                                                                                                                                                                                                                      
    self.init_buffers(self.model)                                                                                                                                                                                                                                                                                                                                         
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/model_executor/models/transformers.py", line 353, in init_buffers                                                                                                                                                                                                                                  
    self.init_buffers(child)                                                                                                                                                                                                                                                                                                                                              
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/model_executor/models/transformers.py", line 353, in init_buffers                                                                                                                                                                                                                                  
    self.init_buffers(child)                                                                                                                                                                                                                                                                                                                                              
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/vllm/model_executor/models/transformers.py", line 350, in init_buffers                                                                                                                                                                                                                                  
    new_buffer = getattr(type(module)(self.config), name)                                                                                                                                                                                                                                                                                                                 
                         ^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                                        
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/transformers/models/glm4v/modeling_glm4v.py", line 138, in __init__                                                                                                                                                                                                                                     
    self.image_size = config.image_size                                                                                                                                                                                                                                                                                                                                   
                      ^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                                                   
  File "/root/anaconda3/envs/trainer/lib/python3.11/site-packages/transformers/configuration_utils.py", line 209, in __getattribute__                                                                                                                                                                                                                                     
    return super().__getattribute__(key)                                                                                                                                                                                                                                                                                                                                  
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                                                                                                                  
AttributeError: 'Glm4vConfig' object has no attribute 'image_size'
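
One plausible reading of this traceback is that vLLM's generic Transformers fallback (`models/transformers.py`) re-creates a submodule with the top-level config (`type(module)(self.config)`), while that submodule expects a nested sub-config that actually holds `image_size`. The sketch below reproduces that pattern with hypothetical classes. The class names, the `vision_config` field, and the value 336 are assumptions for illustration, not the real Glm4v definitions.

```python
from dataclasses import dataclass, field


@dataclass
class VisionConfig:
    """Hypothetical vision sub-config that owns image_size (illustrative value)."""
    image_size: int = 336


@dataclass
class TopLevelConfig:
    """Hypothetical composite config; image_size lives one level down."""
    vision_config: VisionConfig = field(default_factory=VisionConfig)


class VisionEmbeddings:
    """Hypothetical submodule that expects the vision sub-config."""

    def __init__(self, config):
        self.image_size = config.image_size


config = TopLevelConfig()

# Constructing the submodule with the sub-config works.
embeddings = VisionEmbeddings(config.vision_config)

# Constructing it with the top-level config raises the same kind of AttributeError.
try:
    VisionEmbeddings(config)
except AttributeError as exc:
    error_message = str(exc)
```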

ZHANGYUXUAN-zR

Z.ai org Jul 3, 2025

You need to build from source; 0.9.2 has not been released yet.
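
A quick way to see which vLLM build is installed is to read the package metadata. This is a rough sketch: `parse_release` and `is_newer_than` are hypothetical helpers, the comparison is a simple heuristic rather than a full PEP 440 parser, and the 0.9.1 baseline is taken from this thread. A newer version string does not by itself prove that GLM4V support is present.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '0.9.2.dev1+g123' -> (0, 9, 2)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at suffixes such as 'dev1'
        parts.append(int(match.group()))
    return tuple(parts)


def is_newer_than(version: str, baseline: str = "0.9.1") -> bool:
    """Heuristic check that `version` is past the baseline release."""
    return parse_release(version) > parse_release(baseline)


def installed_vllm_version():
    """Return the installed vLLM version string, or None if vLLM is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vLLM is not installed")
    else:
        print(f"vLLM {found}; newer than 0.9.1: {is_newer_than(found)}")
```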

tiandaye

Jul 7, 2025

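I tried downloading the latest vLLM source and building and installing it, but that did not work either. Better to wait for the 0.9.2 release.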
