Chat Template Logic Issue: Ambiguous Default Thinking Mode

#27
by QIN2DIM - opened

Problem

When the client does not explicitly define the thinking variable, the template defaults to thinking mode ON, so OpenAI-compatible clients cannot distinguish reasoning_content from regular content.

Root Cause

In the add_generation_prompt section:

{%- if thinking is defined and thinking is false -%}
<think></think>
{%- else -%}
<think>
{%- endif -%}
  • When thinking=false: outputs <think></think> ✓
  • When thinking=true OR undefined: outputs <think> (unclosed) ✗

When thinking is undefined:

  1. Template ends with <think>
  2. Model generates: reasoning_content</think>actual_content
  3. Client receives merged output but cannot determine if thinking was enabled
  4. </think> appears concatenated with content without clear separation
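
The branch above can be mirrored in plain Python to see which prompt suffix each case produces. This is only a minimal sketch, not the template itself: `UNDEFINED` and `generation_prompt_suffix` are hypothetical names standing in for a variable the client never passed and for the add_generation_prompt branch.

```python
# Sentinel standing in for a Jinja variable the client never passed (hypothetical name).
UNDEFINED = object()


def generation_prompt_suffix(thinking=UNDEFINED):
    """Mirror of `{%- if thinking is defined and thinking is false -%}`."""
    if thinking is not UNDEFINED and thinking is False:
        return "<think></think>"  # thinking explicitly disabled
    return "<think>"  # thinking=True, or not passed at all


for label, value in [("true", True), ("false", False), ("undefined", UNDEFINED)]:
    print(f"thinking={label}: {generation_prompt_suffix(value)}")
```

With the current logic, the undefined case falls into the same branch as thinking=true, so the prompt ends with an unclosed <think>.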

Fix

Invert the default behavior to thinking OFF when undefined:

-{%- if thinking is defined and thinking is false -%}
-<think></think>
-{%- else -%}
+{%- if thinking is defined and thinking is true -%}
 <think>
+{%- else -%}
+<think></think>
 {%- endif -%}

Behavior After Fix

thinking value | Output          | Meaning
true           | <think>         | Expect model to generate reasoning
false          | <think></think> | No reasoning
undefined      | <think></think> | No reasoning (safe default)

This ensures clients can reliably parse responses by checking for <think></think> (thinking off) vs <think>content...</think> (thinking on).

Moonshot AI org

Hi, we set thinking mode as default to make it compatible with our official API behavior. The problem you describe is likely a bug in the reasoning parser. For example, sglang fixed a similar issue recently: https://github.com/sgl-project/sglang/pull/17901
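
To illustrate what a reasoning parser has to handle when the prompt already ends with <think>, here is a rough sketch. It is not the sglang implementation; `split_reasoning` and its `prompt_opened_think` flag are hypothetical names, and the example assumes non-streaming output.

```python
def split_reasoning(text, prompt_opened_think=True):
    """Split raw model output into (reasoning_content, content).

    prompt_opened_think: True when the chat template already ended the prompt
    with "<think>", so the output starts inside the reasoning block.
    """
    open_tag, close_tag = "<think>", "</think>"
    started_with_tag = text.startswith(open_tag)
    if started_with_tag:
        text = text[len(open_tag):]
    if not (started_with_tag or prompt_opened_think):
        return None, text  # no reasoning block at all
    reasoning, sep, content = text.partition(close_tag)
    if not sep:
        # Closing tag never generated (e.g. truncated output): all of it is reasoning.
        return reasoning, ""
    return reasoning, content


reasoning, content = split_reasoning("2 + 2 is 4</think>The answer is 4.")
print(repr(reasoning))
print(repr(content))
```

The point of the flag is that the parser, not the client, needs to know whether the template opened the reasoning block, since the opening tag never appears in the generated text.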

It seems to be true. 🤯

QIN2DIM changed discussion status to closed

That's not a problem, but the suffix index seems wrong.

The current chat_template includes:

{# split all messages into history & suffix, reasoning_content in suffix should be reserved.#}
{%- set hist_msgs = messages[:ns.last_non_tool_call_assistant_msg+1] -%}
{%- set suffix_msgs = messages[ns.last_non_tool_call_assistant_msg+1:] -%}

But it should be:

{# split all messages into history & suffix, reasoning_content in suffix should be reserved.#}
{%- set hist_msgs = messages[:ns.last_non_tool_call_assistant_msg] -%}
{%- set suffix_msgs = messages[ns.last_non_tool_call_assistant_msg:] -%}
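
The difference between the two slicings can be seen on a toy message list. This is only a sketch: the roles and the index value are made up, and it assumes ns.last_non_tool_call_assistant_msg holds the index of that assistant message itself.

```python
# Toy conversation; roles only, contents omitted (illustrative data).
messages = ["system", "user", "assistant", "user"]
last = 2  # assumed value of ns.last_non_tool_call_assistant_msg


def split_messages(msgs, boundary):
    """Return (hist_msgs, suffix_msgs) split at the given boundary index."""
    return msgs[:boundary], msgs[boundary:]


current = split_messages(messages, last + 1)  # slicing in the current template
proposed = split_messages(messages, last)     # slicing suggested above

print("current: ", current)
print("proposed:", proposed)
```

Under this assumption, the current slicing places the assistant message at that index in the history, while the proposed slicing places it in the suffix, where reasoning_content is kept.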
