Chat Template Logic Issue: Ambiguous Default Thinking Mode
Problem
When the thinking variable is not explicitly defined by the client, the template defaults to thinking mode ON, causing OpenAI-compatible clients to be unable to distinguish reasoning_content from regular content.
Root Cause
In the add_generation_prompt section:
{%- if thinking is defined and thinking is false -%}
<think></think>
{%- else -%}
<think>
{%- endif -%}
- When
thinking=false: outputs<think></think>β - When
thinking=trueOR undefined: outputs<think>(unclosed) β
When thinking is undefined:
- Template ends with
<think> - Model generates:
reasoning_content</think>actual_content - Client receives merged output but cannot determine if thinking was enabled
</think>appears concatenated with content without clear separation
Fix
Invert the default behavior to thinking OFF when undefined:
-{%- if thinking is defined and thinking is false -%}
-<think></think>
-{%- else -%}
+{%- if thinking is defined and thinking is true -%}
<think>
+{%- else -%}
+<think></think>
{%- endif -%}
Behavior After Fix
thinking value |
Output | Meaning |
|---|---|---|
true |
<think> |
Expect model to generate reasoning |
false |
<think></think> |
No reasoning |
| undefined | <think></think> |
No reasoning (safe default) |
This ensures clients can reliably parse responses by checking for <think></think> (thinking off) vs <think>content...</think> (thinking on).
Hi, we set thinking mode as default to make it compatible with our official API behavior. The problem you describe is likely a bug in the reasoning parser. For example, sglang fixed a similar issue recently: https://github.com/sgl-project/sglang/pull/17901
It seems to be true. π€―
That's not problem. but suffix index seems wrong.
current chat_template includes:
{# split all messages into history & suffix, reasoning_content in suffix should be reserved.#}
{%- set hist_msgs = messages[:ns.last_non_tool_call_assistant_msg+1] -%}
{%- set suffix_msgs = messages[ns.last_non_tool_call_assistant_msg+1:] -%}
But it should be
{# split all messages into history & suffix, reasoning_content in suffix should be reserved.#}
{%- set hist_msgs = messages[:ns.last_non_tool_call_assistant_msg] -%}
{%- set suffix_msgs = messages[ns.last_non_tool_call_assistant_msg:] -%}