Keeps emitting <tool_call> json into the context

#23
by theliphant - opened

Ever since updating to the merged 3.5/6 version, the model randomly stops and emits JSON for a tool_call.
Using unsloth/Qwen3.6-27B-MTP-GGUF:Q5_K_XL in llama-server with q8 k/v cache.


 <tool_call>
 <function=bash>
 <parameter=command>
 find Library/PackageCache -name "UniversalResourceData.cs" 2>/dev/null
 </parameter>
 </function>
 </tool_call>

Although this is in pi so I wonder if something broke there 🙃

Looks like it believed it was a thinking block

{
  "type": "message",
  "id": "ccbda428",
  "parentId": "3318fa1e",
  "timestamp": "2026-05-18T16:32:49.150Z",
  "message": {
    "role": "assistant",
    "content": [
      {
        "type": "thinking",
        "thinking": "<tool_call>\n<function=bash>\n<parameter=command>\nfind Library/PackageCache -name \"UniversalResourceData.cs\" 2>/dev/null\n</parameter>\n</function>\n</tool_call>",
        "thinkingSignature": "reasoning_content"
      }
    ],
    "api": "openai-completions",
    "provider": "llama-cpp-mac",
    "model": "Qwen3.6-27B",
    "usage": {
      "input": 671,
      "output": 43,
      "cacheRead": 58402,
      "cacheWrite": 0,
      "totalTokens": 59116,
      "cost": {
        "input": 0,
        "output": 0,
        "cacheRead": 0,
        "cacheWrite": 0,
        "total": 0
      }
    },
    "stopReason": "stop",
    "timestamp": 1779121949860,
    "responseId": "chatcmpl-xVLlDQ1nqLjtHoIgiuaBwOcwrNYRto06"
  }
}
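To check whether other turns in a session have the same problem, one option is to scan the log for thinking blocks that contain tool-call markup. This is only a minimal sketch, assuming every entry has the same shape as the JSON above; the helper name `find_leaked_tool_calls` is made up for illustration and is not part of pi or llama-server.

```python
import json


def find_leaked_tool_calls(entry: dict) -> list[str]:
    """Return the text of every thinking block that contains <tool_call> markup."""
    message = entry.get("message", {})
    if message.get("role") != "assistant":
        return []
    leaked = []
    for block in message.get("content", []):
        # Skip anything that is not a structured content block.
        if not isinstance(block, dict):
            continue
        if block.get("type") == "thinking" and "<tool_call>" in block.get("thinking", ""):
            leaked.append(block["thinking"])
    return leaked


# Trimmed copy of the session entry shown above.
entry = json.loads(r'''
{
  "type": "message",
  "message": {
    "role": "assistant",
    "content": [
      {
        "type": "thinking",
        "thinking": "<tool_call>\n<function=bash>\n<parameter=command>\nfind Library/PackageCache -name \"UniversalResourceData.cs\" 2>/dev/null\n</parameter>\n</function>\n</tool_call>",
        "thinkingSignature": "reasoning_content"
      }
    ],
    "stopReason": "stop"
  }
}
''')

print(len(find_leaked_tool_calls(entry)))
```

Any entry for which this returns a non-empty list is a turn where the call landed in reasoning content instead of being executed.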

Hi @theliphant, if you want, you can give my just-released template rebuild project a try.

https://huggingface.co/StableQuant/Qwen-Templates-Rebuild-Project/

If v1.0 doesn't fix it for you, please tell me. If it works, tell me too :-)

It will then be added into the research pipeline

Regards


In v20, I rewrote the tool formatting instructions to be much stricter about keeping XML structures and avoiding JSON drift. Upgrading to the latest template should fix this behavior.

I just checked: with v20 it still happens, and then generation stops.

<tool_call>
<function=bash>
.....
</function>
</tool_call>

💀
The spiritbuun template isn't affected. However, the model repeatedly makes duplicate tool calls when using Chrome DevTools in pi, with both your new template and spiritbuun's.

Taming this thing is really annoying. Qwen wants us to pay for the best.

Yes. Unfortunately, since this is enforced through a system prompt rather than algorithmically, it is not 100% reliable.
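For anyone who wants an algorithmic fallback on the client side, here is a rough sketch of recovering a structured call from the XML-style text posted earlier in this thread. It assumes the output always follows the `<tool_call>` / `<function=...>` / `<parameter=...>` layout shown above. The function name `parse_tool_calls` and the returned dict shape are my own choices, and this is not how llama-server or any of the templates parses tool calls.

```python
import re

# One <tool_call> wrapping a single <function=NAME> ... </function> body.
TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<function=([^>\s]+)>(.*?)</function>\s*</tool_call>",
    re.DOTALL,
)
# One <parameter=KEY> ... </parameter>; the newline padding around the value is dropped.
PARAM_RE = re.compile(r"<parameter=([^>\s]+)>\n?(.*?)\n?</parameter>", re.DOTALL)


def parse_tool_calls(text: str) -> list[dict]:
    """Extract every well-formed tool call found in the given text."""
    calls = []
    for name, body in TOOL_CALL_RE.findall(text):
        arguments = {key: value for key, value in PARAM_RE.findall(body)}
        calls.append({"name": name, "arguments": arguments})
    return calls


leaked = (
    "<tool_call>\n<function=bash>\n<parameter=command>\n"
    'find Library/PackageCache -name "UniversalResourceData.cs" 2>/dev/null\n'
    "</parameter>\n</function>\n</tool_call>"
)
print(parse_tool_calls(leaked))
```

Truncated output, such as a call cut off before `</tool_call>`, simply yields no match, so the caller can decide whether to retry the turn.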

I'm not sure:
auto_disable_thinking_with_tools = false still produces the stray tool_call 💀
auto_disable_thinking_with_tools = true doesn't produce the error, but sometimes causes a duplicate call.
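As a stopgap for the duplicates, a client could drop a call that is identical to the one immediately before it. This is just a sketch, assuming calls are plain dicts with hypothetical `name` and `arguments` keys. A repeated call can also be intentional, so whether filtering is safe depends on the tool.

```python
import json


def drop_consecutive_duplicates(calls: list[dict]) -> list[dict]:
    """Keep a call only if it differs from the call immediately before it."""
    kept = []
    previous_key = None
    for call in calls:
        # Serialize with sorted keys so argument order does not matter.
        key = json.dumps(
            {"name": call["name"], "arguments": call["arguments"]},
            sort_keys=True,
        )
        if key != previous_key:
            kept.append(call)
        previous_key = key
    return kept


list_files = {"name": "bash", "arguments": {"command": "ls"}}
show_dir = {"name": "bash", "arguments": {"command": "pwd"}}
print(drop_consecutive_duplicates([list_files, list_files, show_dir, list_files]))
```

Only back-to-back repeats are removed; the same call appearing again later in the session is kept.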
