Possible chat_template.jinja issue: nullable $ref tool schemas are rendered as empty types
I think there may be a bug in the Gemma chat template's tool declaration rendering.
I tested this through llama.cpp/llama-server with unsloth/gemma-4-31B-it-GGUF, but the issue looks like it comes from the model chat template itself rather than from llama.cpp specifically.
The short version: tool parameters using the common JSON Schema pattern anyOf: [$ref, null] are rendered into the prompt as empty type fields. This strips the useful schema information before the model sees it.
Example schema shape
Many typed tool systems generate schemas like this for nullable structured fields:
{
"type": "object",
"properties": {
"some_assertion_field": {
"anyOf": [
{"$ref": "#/$defs/FeatureAssertion"},
{"type": "null"}
],
"default": null
}
},
"$defs": {
"FeatureAssertion": {
"type": "object",
"additionalProperties": false,
"properties": {
"value": {
"type": "string",
"enum": ["positive", "negative", "not_mentioned", "not_assessable"]
},
"ref_ids": {
"type": "array",
"items": {"type": "string"}
}
},
"required": ["value"]
}
}
}
The expected tool argument is a nested object:
{
"some_assertion_field": {
"value": "positive",
"ref_ids": ["s001"]
}
}
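To make the expected shape concrete, here is a minimal Python sketch that follows the `$ref` by hand and prints what the model would need to know to build that nested object. The helper name `resolve_local_ref` is hypothetical and only handles same-document references of the form `#/...`; it is an illustration, not part of any template or library.

```python
from typing import Any


def resolve_local_ref(root: dict[str, Any], ref: str) -> Any:
    """Follow a same-document reference such as '#/$defs/FeatureAssertion'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled: {ref!r}")
    node: Any = root
    for token in ref[2:].split("/"):
        # Undo JSON Pointer escaping: "~1" means "/", "~0" means "~".
        node = node[token.replace("~1", "/").replace("~0", "~")]
    return node


# Same schema as in the post, trimmed to the parts used here.
schema = {
    "type": "object",
    "properties": {
        "some_assertion_field": {
            "anyOf": [{"$ref": "#/$defs/FeatureAssertion"}, {"type": "null"}],
            "default": None,
        }
    },
    "$defs": {
        "FeatureAssertion": {
            "type": "object",
            "properties": {
                "value": {
                    "type": "string",
                    "enum": ["positive", "negative", "not_mentioned", "not_assessable"],
                },
                "ref_ids": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["value"],
        }
    },
}

branches = schema["properties"]["some_assertion_field"]["anyOf"]
target = next(resolve_local_ref(schema, b["$ref"]) for b in branches if "$ref" in b)
print(target["required"])                     # ['value']
print(target["properties"]["value"]["enum"])  # the four allowed strings
```

Everything printed here lives behind the `$ref`, so none of it reaches the model unless the rendered declaration carries both the `anyOf` branch and the `$defs` entry.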
Actual rendered declaration
In the rendered Gemma tool declaration, the field becomes:
some_assertion_field:{type:<|"|><|"|>}
So the model sees the field name, but not:
- anyOf
- $ref
- the FeatureAssertion definition
- the required value field
- the ref_ids field
- the enum values
Observed behavior
With that rendered declaration, the model called the tool with primitive or invented values:
some_assertion_field:true
or after validation feedback:
some_assertion_field:{assertion:true}
This makes sense if the model was never shown the nested object schema.
What seems to cause it
The current/default chat_template.jinja has a format_parameters macro that appears to assume a property has a direct type:
{%- if value['type'] | upper == 'STRING' -%}
...
{%- elif value['type'] | upper == 'ARRAY' -%}
...
{%- endif -%}
...
type:<|"|>{{ value['type'] | upper }}<|"|>
But for this common schema pattern:
{
"anyOf": [
{"$ref": "#/$defs/FeatureAssertion"},
{"type": "null"}
],
"default": null
}
there is no property-level type. The type information is inside anyOf, and the object shape is in $defs.
The template also does not appear to emit root $defs in format_function_declaration, so even if a field references #/$defs/FeatureAssertion, the definition is not included in the rendered tool declaration.
I also noticed the current template's format_argument does not special-case null/None values, so default: null can be rendered as a blank value in some contexts.
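A rough Python model of that lookup shows why the rendered type ends up empty. This is a sketch, not the template itself: it assumes a missing `type` key renders as an empty string, which is consistent with the `type:<|"|><|"|>` output shown above. The function names are hypothetical.

```python
from typing import Any


def render_type_field(prop: dict[str, Any]) -> str:
    """Model `type:<|"|>{{ value['type'] | upper }}<|"|>` for one property.

    Assumption: a missing key behaves like an empty string, which
    `.get("type", "")` stands in for here.
    """
    return 'type:<|"|>' + str(prop.get("type", "")).upper() + '<|"|>'


def untyped_properties(parameters: dict[str, Any]) -> list[str]:
    """Return the top-level property names that have no direct `type`."""
    props = parameters.get("properties", {})
    return [name for name, prop in props.items() if "type" not in prop]


parameters = {
    "type": "object",
    "properties": {
        "some_assertion_field": {
            "anyOf": [{"$ref": "#/$defs/FeatureAssertion"}, {"type": "null"}],
            "default": None,
        }
    },
}

print(render_type_field(parameters["properties"]["some_assertion_field"]))
# type:<|"|><|"|>
print(untyped_properties(parameters))
# ['some_assertion_field']
```

A check like `untyped_properties` can also be run over existing tool schemas to find out which fields would be affected before sending them through the template.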
Workaround tested
I made a small local patch to the Gemma Jinja template while keeping the native Gemma format.
The patch:
- renders anyOf when present
- renders $ref when present
- emits root $defs
- only emits type: when a direct type exists
- renders null/None as null
After this, the rendered declaration preserved the schema:
some_assertion_field:{
anyOf:[
{$ref:<|"|>#/$defs/FeatureAssertion<|"|>},
{type:<|"|>null<|"|>}
]
}
defs:{
FeatureAssertion:{
properties:{
value:{
enum:[
<|"|>positive<|"|>,
<|"|>negative<|"|>,
<|"|>not_mentioned<|"|>,
<|"|>not_assessable<|"|>
],
type:<|"|>string<|"|>
},
ref_ids:{
items:{type:<|"|>string<|"|>},
type:<|"|>array<|"|>
}
},
required:[<|"|>value<|"|>],
type:<|"|>object<|"|>
}
}
With that patched template, the same model generated the correct nested tool argument:
some_assertion_field:{ref_ids:[<|"|>s001<|"|>],value:<|"|>positive<|"|>}
and the tool call succeeded.
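For readers who want the patch rules in executable form, here is a Python model of them. It is not the Jinja patch itself. The set of rendered keys, the sorted key order, and the `defs:` label are assumptions inferred from the rendered output shown above, and all function names are hypothetical.

```python
from typing import Any

QUOTE = '<|"|>'  # string delimiter as it appears in the rendered declarations

# Assumption: only these schema keys are emitted, judging by the output above.
RENDERED_KEYS = {"$ref", "anyOf", "enum", "items", "properties", "required", "type"}


def render_value(value: Any) -> str:
    """Render a plain JSON value; None becomes the literal `null`."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, str):
        return f"{QUOTE}{value}{QUOTE}"
    if isinstance(value, list):
        return "[" + ",".join(render_value(v) for v in value) + "]"
    return str(value)


def render_schema(schema: dict[str, Any]) -> str:
    """Render one schema node, emitting `type:` only when the key exists."""
    parts = []
    for key in sorted(schema):
        if key not in RENDERED_KEYS:
            continue
        val = schema[key]
        if key == "properties":
            inner = ",".join(f"{n}:{render_schema(s)}" for n, s in val.items())
            body = "{" + inner + "}"
        elif key == "anyOf":
            body = "[" + ",".join(render_schema(s) for s in val) + "]"
        elif key == "items":
            body = render_schema(val)
        else:  # $ref, type, enum, required hold plain values
            body = render_value(val)
        parts.append(f"{key}:{body}")
    return "{" + ",".join(parts) + "}"


def render_defs(root: dict[str, Any]) -> str:
    """Render the root `$defs` block so referenced definitions stay visible."""
    defs = root.get("$defs", {})
    return "defs:{" + ",".join(f"{n}:{render_schema(s)}" for n, s in defs.items()) + "}"


field = {
    "anyOf": [{"$ref": "#/$defs/FeatureAssertion"}, {"type": "null"}],
    "default": None,
}
print("some_assertion_field:" + render_schema(field))
```

Apart from whitespace, the printed line matches the patched declaration for `some_assertion_field` shown above.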
Note on the recent template update
I saw that chat_template.jinja was updated recently, but the current file still appears to have the same issue: format_parameters still primarily branches on value['type'], still emits type:<|"|>{{ value['type'] | upper }}<|"|>, and I do not see handling for property-level anyOf, property-level $ref, or root $defs in the tool declaration.
Suggested fix
Please consider updating the tool declaration rendering in chat_template.jinja to preserve common JSON Schema constructs:
- render property-level anyOf
- render property-level $ref
- include root $defs
- avoid emitting type: when type is missing
- render null/None as null
This would make Gemma's advertised tool support work better with real-world typed schemas, especially schemas generated by Pydantic and similar libraries.
I went ahead and opened a PR with the template fix. Sorry for rambling here.