Embed chat_template in tokenizer_config.json

#8
by piero-atelico - opened

What

Adds the chat template directly to tokenizer_config.json so that tokenizer.apply_chat_template() works out of the box without needing to separately download and load chat_template.jinja.

Why

The chat template currently ships only as a separate file, chat_template.jinja. The transformers library auto-loads that file when it is present in the model directory, but many third-party tools and deployment pipelines copy only the standard tokenizer files (tokenizer.json and tokenizer_config.json). When the .jinja file is missing, tokenizer.chat_template is None and apply_chat_template() fails with:

ValueError: Cannot use chat template functions because tokenizer.chat_template is not set
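To check whether a copied model directory would hit this error, a quick diagnostic along these lines may help. It is a minimal sketch, not a reproduction of the transformers loading logic: the helper name find_chat_template is hypothetical, and it only looks at the two file names mentioned in this PR.

```python
import json
from pathlib import Path


def find_chat_template(model_dir):
    """Report where a chat template can be found in model_dir, if anywhere.

    Hypothetical helper for illustration only. Returns the name of the file
    that carries the template, or None when neither location provides one.
    """
    model_dir = Path(model_dir)

    # Case 1: the template is embedded in tokenizer_config.json.
    config_path = model_dir / "tokenizer_config.json"
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        if config.get("chat_template"):
            return "tokenizer_config.json"

    # Case 2: the template only exists as a separate .jinja file.
    if (model_dir / "chat_template.jinja").is_file():
        return "chat_template.jinja"

    # Neither location has a template.
    return None
```

A result of "chat_template.jinja" means the template would be lost by any pipeline that copies only tokenizer.json and tokenizer_config.json; a result of None means no template was found in either place.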

Other Gemma models (Gemma 2, Gemma 3) embed the template in tokenizer_config.json, so this seems like an oversight in the Gemma 4 release.

What changed

Embedded the contents of chat_template.jinja into the chat_template field of tokenizer_config.json. The template itself is unchanged; this only makes it accessible through the standard API.
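For anyone who needs the same change on a local copy, the embedding step can be sketched roughly as follows. This is an illustration under stated assumptions, not necessarily how this PR was produced: the function name embed_chat_template is hypothetical, and the two-space JSON indentation is a guess at the existing file style.

```python
import json
from pathlib import Path


def embed_chat_template(model_dir):
    """Copy chat_template.jinja into the chat_template field of tokenizer_config.json.

    Hypothetical helper for illustration only. Rewrites tokenizer_config.json
    in place and returns the updated config as a dict.
    """
    model_dir = Path(model_dir)

    # Read the template verbatim so the embedded string matches the file.
    template = (model_dir / "chat_template.jinja").read_text(encoding="utf-8")

    config_path = model_dir / "tokenizer_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))

    # Only this one field is set; every other key is left untouched.
    config["chat_template"] = template

    # Assumption: two-space indentation; ensure_ascii=False keeps non-ASCII
    # characters readable instead of escaping them.
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return config
```

After running it, loading the tokenizer from a directory that contains only tokenizer.json and tokenizer_config.json should leave tokenizer.chat_template set rather than None.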

Related issue: https://github.com/huggingface/transformers/issues/45205

