Instructions for using google/gemma-4-31B-it with libraries, inference providers, notebooks, and local apps.

Libraries

How to use google/gemma-4-31B-it with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/gemma-4-31B-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("google/gemma-4-31B-it")
model = AutoModelForImageTextToText.from_pretrained("google/gemma-4-31B-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use google/gemma-4-31B-it with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "google/gemma-4-31B-it"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/gemma-4-31B-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
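
The same request can also be built from Python. The following is a minimal sketch using only the standard library; the helper name `build_chat_request` is hypothetical, and the response shape in the commented-out section is an assumption based on the usual OpenAI-style chat completion format, not something confirmed on this page.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt, image_url):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same host, port, model, prompt, and image as the curl call above.
request = build_chat_request(
    "http://localhost:8000",
    "google/gemma-4-31B-it",
    "Describe this image in one sentence.",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)

# Sending needs a running server, so it is left commented out.
# The response keys below are assumed, not taken from this page.
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```

Keeping the request construction separate from the network call makes the payload easy to inspect before a server is available.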

Use Docker

docker model run hf.co/google/gemma-4-31B-it

SGLang

How to use google/gemma-4-31B-it with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "google/gemma-4-31B-it" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/gemma-4-31B-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "google/gemma-4-31B-it" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown above for the pip install (port 30000).

Docker Model Runner
How to use google/gemma-4-31B-it with Docker Model Runner:
```
docker model run hf.co/google/gemma-4-31B-it
```

Jobayer

#85, by Jy0018, opened 29 days ago

base: refs/heads/main ← from: refs/pr/85

This PR is in draft mode

Files changed (3): +11 -37

.eval_results/mmmu_pro.yaml +0 -8
README.md +1 -3
chat_template.jinja +10 -26

.eval_results/mmmu_pro.yaml DELETED

@@ -1,8 +0,0 @@
-- dataset:
-    id: MMMU/MMMU_Pro
-    task_id: mmmu_pro_vision
-  value: 76.9
-  date: '2026-05-12'
-  source:
-    url: https://huggingface.co/google/gemma-4-31B-it
-    name: Model Card

README.md CHANGED

@@ -3,8 +3,6 @@ library_name: transformers
 license: apache-2.0
 license_link: https://ai.google.dev/gemma/docs/gemma_4_license
 pipeline_tag: image-text-to-text
-base_model:
-- google/gemma-4-31B
 ---
 <div align="center">
@@ -512,4 +510,4 @@ The development of vision-language models (VLMs) raises several ethical concerns
 ### **Benefits**
-At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models.
+At the time of release, this family of models provides high-performance open vision-language model implementations designed from the ground up for responsible AI development compared to similarly sized models.

chat_template.jinja CHANGED

@@ -1,9 +1,9 @@
-{%- macro format_parameters(properties, required, filter_keys=false) -%}
+{%- macro format_parameters(properties, required) -%}
     {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
     {%- set ns = namespace(found_first=false) -%}
     {%- for key, value in properties | dictsort -%}
         {%- set add_comma = false -%}
-        {%- if not filter_keys or key not in standard_keys -%}
+        {%- if key not in standard_keys -%}
             {%- if ns.found_first %},{% endif -%}
             {%- set ns.found_first = true -%}
             {{ key }}:{
@@ -65,7 +65,7 @@
                 {%- elif value is mapping -%}
                     {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
                     properties:{
-                    {{- format_parameters(value, value['required'] | default([]), filter_keys=true) -}}
+                    {{- format_parameters(value, value['required'] | default([])) -}}
                     }
                 {%- endif -%}
                 {%- if value['required'] -%}
@@ -178,21 +178,18 @@
 {#- Handle System/Tool Definitions Block -#}
 {%- if (enable_thinking is defined and enable_thinking) or tools or messages[0]['role'] in ['system', 'developer'] -%}
     {{- '<|turn>system\n' -}}
     {#- Inject Thinking token at the very top of the FIRST system turn -#}
     {%- if enable_thinking is defined and enable_thinking -%}
         {{- '<|think|>\n' -}}
         {%- set ns.prev_message_type = 'think' -%}
     {%- endif -%}
     {%- if messages[0]['role'] in ['system', 'developer'] -%}
-        {%- if messages[0]['content'] is string -%}
-            {{- messages[0]['content'] | trim -}}
-        {%- elif messages[0]['content'] is sequence -%}
-            {%- for item in messages[0]['content'] -%}
-                {{- item['text'] | trim + ' '-}}
-            {%- endfor -%}
-        {%- endif -%}
+        {{- messages[0]['content'] | trim -}}
         {%- set loop_messages = messages[1:] -%}
     {%- endif -%}
     {%- if tools -%}
         {%- for tool in tools %}
             {{- '<|tool>' -}}
@@ -201,6 +198,7 @@
         {%- endfor %}
         {%- set ns.prev_message_type = 'tool' -%}
     {%- endif -%}
     {{- '<turn|>\n' -}}
 {%- endif %}
@@ -295,15 +293,6 @@
                                 {%- endif -%}
                             {%- endfor -%}
                             {{- format_tool_response_block(ns_tname.name, ns_txt.s) -}}
-                            {%- for part in tool_body -%}
-                                {%- if part.get('type') == 'image' -%}
-                                    {{- '<|image|>' -}}
-                                {%- elif part.get('type') == 'audio' -%}
-                                    {{- '<|audio|>' -}}
-                                {%- elif part.get('type') == 'video' -%}
-                                    {{- '<|video|>' -}}
-                                {%- endif -%}
-                            {%- endfor -%}
                         {%- else -%}
                             {{- format_tool_response_block(ns_tname.name, tool_body) -}}
                         {%- endif -%}
@@ -313,7 +302,6 @@
                 {%- endfor -%}
             {%- endif -%}
-            {%- set captured_content -%}
             {%- if message['content'] is string -%}
                 {%- if role == 'model' -%}
                     {{- strip_thinking(message['content']) -}}
@@ -340,14 +328,10 @@
                     {%- endif -%}
                 {%- endfor -%}
             {%- endif -%}
-            {%- endset -%}
-            {{- captured_content -}}
-            {%- set has_content = captured_content | trim | length > 0 -%}
         {%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%}
             {{- '<|tool_response>' -}}
-        {%- elif not (ns_tr_out.flag and not has_content) -%}
+        {%- elif not (ns_tr_out.flag and not message.get('content')) -%}
             {{- '<turn|>\n' -}}
         {%- endif -%}
     {%- endif -%}
@@ -360,4 +344,4 @@
             {{- '<|channel>thought\n<channel|>' -}}
         {%- endif -%}
     {%- endif -%}
-{%- endif -%}
+{%- endif -%}
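
Two of the template changes alter behavior in ways that are easy to miss in the diff: how a system message with list content is rendered, and when a turn is closed after a tool response. The sketch below mimics both conditions in plain Python rather than running the template. All four function names are hypothetical, and it assumes Jinja's `trim` filter converts a non-string value to its string form before stripping whitespace.

```python
def render_system_old(content):
    """Mimic the removed branch: trim strings, join the text of list parts."""
    if isinstance(content, str):
        return content.strip()
    # Each part's text is trimmed and followed by a single space.
    return "".join(part["text"].strip() + " " for part in content)


def render_system_new(content):
    """Mimic `messages[0]['content'] | trim` (assumed: stringify, then strip)."""
    return str(content).strip()


def closes_turn_old(tool_response_emitted, rendered_content):
    """Old `elif` test: has_content is judged on the rendered, trimmed text."""
    has_content = len(rendered_content.strip()) > 0
    return not (tool_response_emitted and not has_content)


def closes_turn_new(tool_response_emitted, message):
    """New `elif` test: truthiness of the raw `content` value."""
    return not (tool_response_emitted and not message.get("content"))


parts = [{"type": "text", "text": "Be brief."}]
print(render_system_old(parts))  # text of the parts, joined
print(render_system_new(parts))  # string form of the whole list

# Whitespace-only content counts as empty in the old test but not in the new one.
print(closes_turn_old(True, "   "))
print(closes_turn_new(True, {"content": "   "}))
```

Under these assumptions, plain string system prompts render identically before and after the change, while list-style system content and whitespace-only message content do not.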