[Bug] `enable_thinking` doesn't trigger thinking mode for E4B/E2B

#26

by meeyeong11 - opened Apr 24

Problem

enable_thinking=True works correctly on gemma-4-31B-it and gemma-4-26B-A4B-it, but has no effect on E4B and E2B.

Root cause: The 31B and 26B-A4B models generate <|channel>thought\n on their own after <|turn>model\n, but E4B/E2B do not. For E4B/E2B, the chat template needs to add this prefix explicitly when enable_thinking=True.

Reproduction

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E4B-it")
messages = [{"role": "user", "content": "What is 2+2?"}]

result = tokenizer.apply_chat_template(
    messages,
    enable_thinking=True,
    add_generation_prompt=True,
    tokenize=False
)
print(result)

The prompt ends with <|turn>model\n. Without the <|channel>thought\n prefix, the model jumps straight into the response and skips the thinking block entirely.
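To make the check less dependent on eyeballing the printed prompt, a small helper can test how the rendered string ends. This is only a sketch: the helper name thinking_prefix_present and the stand-in prompt strings are hypothetical, and only the two marker strings are taken from this report.

```python
# Marker strings as quoted in this report.
MODEL_TURN = "<|turn>model\n"
THOUGHT_PREFIX = "<|channel>thought\n"


def thinking_prefix_present(prompt: str) -> bool:
    """Return True if the prompt ends with the model turn followed by the thought prefix."""
    return prompt.endswith(MODEL_TURN + THOUGHT_PREFIX)


# Stand-in prompts for illustration only; a real prompt would be the
# `result` string returned by apply_chat_template in the reproduction.
without_prefix = "[earlier turns]" + MODEL_TURN
with_prefix = "[earlier turns]" + MODEL_TURN + THOUGHT_PREFIX

print(thinking_prefix_present(without_prefix))  # False
print(thinking_prefix_present(with_prefix))     # True
```

Passing the `result` string from the reproduction to this helper should return False for the behavior described here, assuming the template renders as reported.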

Proposed Fix

In chat_template.jinja, add the thought channel prefix conditionally:

{%- if add_generation_prompt -%}
    ...
    {{- '<|turn>model\n' -}}
    {%- if enable_thinking is defined and enable_thinking -%}
        {{- '<|channel>thought\n' -}}
    {%- endif -%}
{%- endif -%}
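For anyone who wants to reason about the branch without rendering the template, here is a plain-Python mirror of the conditional above. It is a sketch, not the template itself: the function name generation_suffix is hypothetical, None stands in for Jinja's "undefined", and whatever the elided "..." emits is deliberately left out.

```python
from typing import Optional

# Marker strings as quoted in this report.
MODEL_TURN = "<|turn>model\n"
THOUGHT_PREFIX = "<|channel>thought\n"


def generation_suffix(add_generation_prompt: bool,
                      enable_thinking: Optional[bool] = None) -> str:
    """Mirror the proposed template branch in plain Python (illustrative only)."""
    if not add_generation_prompt:
        return ""
    suffix = MODEL_TURN
    # Jinja: `enable_thinking is defined and enable_thinking`.
    # None models the "not defined" case and is falsy, like False.
    if enable_thinking:
        suffix += THOUGHT_PREFIX
    return suffix


# Walk through the combinations the template has to handle.
for thinking in (None, False, True):
    print(repr(thinking), "->", repr(generation_suffix(True, thinking)))
```

Only enable_thinking=True appends the thought prefix; leaving it unset or False keeps the current behavior, so the change should not affect callers who never pass the flag.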

Verification

Tested with vLLM:

Before fix: Model outputs a direct answer without a thinking block.
After fix: Model correctly generates a thinking block starting with <|channel>thought\n.

Note

The same issue exists in E2B.

meeyeong11 changed discussion title from [Bug] `enable_thinking` doesn't trigger thinking mode for E4B/E2B - missing <|channel>thought prefix to [Bug] `enable_thinking` doesn't trigger thinking mode for E4B/E2B Apr 24

osanseviero

Google org May 27

Hi! I'm closing this as it's fixed

osanseviero changed discussion status to closed May 27