prompt template
I fine-tuned Gemma-3 and Gemma-4 for machine translation, and the results of Gemma-4-31-it are much worse than those of Gemma-3-12B.
I am wondering whether the cause is the chat template. When I apply the chat template with the Gemma-4 tokenizer to a list of messages, I get the following:
msg = [
    {
        "role": "user",
        "content": "Translate the text below from English to German:\nRegulation"
    },
    {
        "role": "assistant",
        "content": "Regulierung"
    }
]
tokenizer.apply_chat_template(msg, add_generation_prompt=True, enable_thinking=False, tokenize=False)
The output is:
'<|turn>user\nUsing the formal tone, translate the text below from English to German:\nRegulation<turn|>\n<|turn>model\nRegulierung<turn|>\n<|turn>model\n<|channel>thought\n<channel|>'
It adds the assistant (model) turn twice.
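A possible explanation, assuming the template follows the usual Transformers convention: `add_generation_prompt=True` appends the opening of a new model turn after the last message, so when the conversation already ends with an assistant message, a second model turn appears. The sketch below illustrates this with `render_turns`, a hypothetical stand-in written for this example. It is not the real Gemma-4 template, it only reuses the turn markers from the output above, and it leaves out the thought channel.

```python
# Hypothetical stand-in for a chat template; NOT the real Gemma-4 template.
# The turn markers are copied from the output shown in the post.
ROLE_NAMES = {"user": "user", "assistant": "model"}


def render_turns(messages, add_generation_prompt):
    """Render each message as one turn block, optionally opening a new model turn."""
    text = ""
    for message in messages:
        role = ROLE_NAMES[message["role"]]
        text += f"<|turn>{role}\n{message['content']}<turn|>\n"
    if add_generation_prompt:
        # Opens a new, empty model turn for the model to complete.
        text += "<|turn>model\n"
    return text


msg = [
    {"role": "user", "content": "Translate the text below from English to German:\nRegulation"},
    {"role": "assistant", "content": "Regulierung"},
]

# Training example: the assistant answer is already present, so no generation prompt.
training_text = render_turns(msg, add_generation_prompt=False)
# Same messages with a generation prompt: a second model turn is opened.
with_prompt = render_turns(msg, add_generation_prompt=True)

print(training_text.count("<|turn>model"))  # 1
print(with_prompt.count("<|turn>model"))    # 2
```

If the real template behaves the same way, passing `add_generation_prompt=False` when formatting training examples that already contain the assistant reply would avoid the duplicated turn.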
For inference, I see a similar issue. For example:
from transformers import AutoTokenizer

msg = [
    {
        "role": "user",
        "content": "Translate the text below from English to German:\nRegulation"
    }
]
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
tokenizer.apply_chat_template(
    msg,
    add_generation_prompt=True,
    enable_thinking=False,
    tokenize=False
)
The output is:
'<|turn>user\nTranslate the text below from English to German:\nRegulation<turn|>\n<|turn>model\n<|channel>thought\n<channel|>'
The model then generates garbage, but if I remove <|channel>thought\n<channel|> from the prompt, it generates text that makes sense.
Could the chat template be the reason that fine-tuning Gemma-4 gives much worse results than Gemma-3?
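The workaround mentioned above (dropping the empty thought channel from the end of the prompt) can be sketched as plain string handling. `strip_empty_thought` is a hypothetical helper written for this example, and the marker string is copied from the output above. Whether removing it is the right fix for a model fine-tuned without that marker is an assumption, not something the template documentation confirms.

```python
# Marker copied from the rendered prompt shown in the post.
EMPTY_THOUGHT = "<|channel>thought\n<channel|>"


def strip_empty_thought(prompt):
    """Remove a trailing empty thought channel from a rendered prompt, if present."""
    if prompt.endswith(EMPTY_THOUGHT):
        return prompt[: -len(EMPTY_THOUGHT)]
    return prompt


prompt = (
    "<|turn>user\nTranslate the text below from English to German:\nRegulation"
    "<turn|>\n<|turn>model\n<|channel>thought\n<channel|>"
)

cleaned = strip_empty_thought(prompt)
print(repr(cleaned))
```

Whichever form is used, the prompt at inference time should match the format the model saw during fine-tuning, including the presence or absence of the thought channel.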