v19 released with major improvements

#22

by froggeric - opened May 16

Owner May 16

I think I have finally solved the frequent stops in v19. So far it has been flawless in 3 long agentic tests in a row. Previously, I had it happen in around 80% of my sessions.

This has been a tough one to crack. To fix it I had to resort to better prompt engineering:

<IMPORTANT>
Reminder:
- You can use the <think></think> block to plan your next tool call OR to synthesize data and formulate your final response to the user.
- ALL explanation and reasoning MUST be placed strictly inside the <think></think> block.
- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags.
- If you choose to call a tool, you MUST output the <tool_call> block IMMEDIATELY after closing </think>. Do NOT output any conversational text before the tool call.
- The <tool_call> and <function> tags MUST be at the very beginning of a new line, with NO spaces or indentation before them.
- To call multiple functions, output a separate, completely closed <tool_call></tool_call> block for EACH function. Do NOT nest <tool_call> blocks.
- If you have gathered all necessary data and do not need to call a tool, answer the question like normal and provide your final response to the user IMMEDIATELY after closing </think>.
</IMPORTANT>
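The rules in the reminder above can be checked mechanically on a raw model output. The following is a minimal sketch, not part of the template: the function name `check_tool_call_format` and the wording of the violation messages are hypothetical, and it only covers the tag placement, nesting, and "no text before the tool call" rules.

```python
import re


def check_tool_call_format(output: str) -> list[str]:
    """Return the rule violations found in one raw model output."""
    problems = []

    # Rule: <tool_call> and <function=...> must start a line, with no indentation.
    for line in output.splitlines():
        stripped = line.lstrip(" \t")
        if stripped != line and stripped.startswith(("<tool_call>", "<function=")):
            problems.append(f"indented tag: {stripped!r}")

    # Rule: every <tool_call> block is closed, and blocks are never nested.
    depth = 0
    for tag in re.findall(r"</?tool_call>", output):
        if tag == "<tool_call>":
            depth += 1
            if depth > 1:
                problems.append("nested <tool_call>")
        else:
            depth -= 1
            if depth < 0:
                problems.append("</tool_call> without opening <tool_call>")
                depth = 0
    if depth > 0:
        problems.append("unclosed <tool_call>")

    # Rule: only whitespace may sit between </think> and the first <tool_call>.
    if "</think>" in output:
        after_think = output.split("</think>", 1)[1]
        if "<tool_call>" in after_think:
            before_call = after_think.split("<tool_call>", 1)[0]
            if before_call.strip():
                problems.append("text between </think> and <tool_call>")

    return problems


good = "<think>\nplan\n</think>\n\n<tool_call>\n<function=ls>\n</function>\n</tool_call>"
bad = "<think>\nplan\n</think>\nLet me look.\n  <tool_call>\n<function=ls>\n</function>\n</tool_call>"
print(check_tool_call_format(good))
print(check_tool_call_format(bad))
```

A check like this only tells you which rule an output broke. It says nothing about why the model broke it.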

It helped a bit, but did not solve it. What I think finally did it was a complete rewrite of the KV cache handling: preserve_thinking now defaults to true, and the empty think injection, which was poisoning the model's in-context learning, has been abolished.
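To illustrate the difference between preserving thinking and injecting an empty think block, here is a minimal sketch. It is an assumption-based illustration, not the v19 template: `render_past_turn` is a hypothetical helper, and the ChatML-style markers are borrowed from the template quoted later in this thread.

```python
def render_past_turn(reasoning: str, content: str, preserve_thinking: bool) -> str:
    """Render one earlier assistant turn back into the prompt (illustrative only)."""
    if preserve_thinking:
        # The turn is replayed as it was generated, reasoning included.
        think = "<think>\n" + reasoning + "\n</think>\n\n"
    else:
        # Empty think injection: the reasoning is dropped and an empty block
        # is left in its place.
        think = "<think>\n\n</think>\n\n"
    return "<|im_start|>assistant\n" + think + content + "<|im_end|>\n"


reasoning = "The user wants the file list, so I should call ls."
content = "Here are the files."

# What the model produced when this turn was generated.
generated = "<think>\n" + reasoning + "\n</think>\n\n" + content

kept = render_past_turn(reasoning, content, preserve_thinking=True)
stripped = render_past_turn(reasoning, content, preserve_thinking=False)

print(generated in kept)      # True: the history matches what was generated
print(generated in stripped)  # False: the history was rewritten after the fact
```

In the second case the history holds examples of empty think blocks that the model never wrote, and the re-rendered text no longer matches what was generated. Those two effects are one plausible reading of the "poisoning" and the KV cache trouble described above.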

xMASEx

May 16

Will give it a shot.

Thank you so much for your time and effort 🌹

slepkaviba

May 17

•

edited May 17

Hey, I did check, and it works well, but the issue I have observed since v18 still persists with opencode: tool calls bleed into the message.

Exploring the input and move control system now.
<function=aft_outline>
<parameter=target>
src/features/input
</parameter>
</function>
</tool_call>

This leaks into the prompt, and the LLM stops...
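One way to see why such an output ends up in the message: the opening `<tool_call>` tag is missing, so a parser that expects complete blocks finds nothing to extract. The sketch below uses a hypothetical strict parser (`split_output` and both regular expressions are assumptions, not opencode's actual parsing code).

```python
import re

# Hypothetical strict parser: a call counts only if the whole block is present.
TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<function=([^>\n]+)>(.*?)</function>\s*</tool_call>", re.DOTALL
)
PARAM_RE = re.compile(r"<parameter=([^>\n]+)>\n?(.*?)\n?</parameter>", re.DOTALL)


def split_output(text: str):
    """Split raw model output into (visible message, list of parsed tool calls)."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        params = dict(PARAM_RE.findall(match.group(2)))
        calls.append((match.group(1), params))
    # Whatever the parser did not recognise stays in the visible message.
    message = TOOL_CALL_RE.sub("", text).strip()
    return message, calls


leaked = (
    "Exploring the input and move control system now.\n"
    "<function=aft_outline>\n<parameter=target>\nsrc/features/input\n</parameter>\n"
    "</function>\n</tool_call>"
)
# The same output with the opening tag restored.
well_formed = leaked.replace("<function=", "<tool_call>\n<function=", 1)

print(split_output(leaked))
print(split_output(well_formed))
```

With the leaked text the parser returns no calls and the tags stay in the message. With the opening tag restored it returns one `aft_outline` call and a clean message.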

froggeric

Owner May 17

Strange, I have had 0 instances of it across hundreds of tool calls since then. I am using F16, so maybe this is related to the loss of intelligence in lower quants.

slepkaviba

May 17

•

edited May 17

It happens only with specific plugins..

The ones that modify messages (dcp, magic context).

And it happens with the first message...

Without those plugins, it works amazing.

EDIT: OK, this is for sure some context shenanigans... Asking the same question with the same prompt, just without the context manipulation plugin, works fine.

With the plugins on, it bleeds either tools, thinking or both... And it dies after the first message if the second is a tool call...

I could provide you messages to compare - one with plugin enabled, and other without - if that is of any help?

froggeric

Owner May 17

•

edited May 17

In that case, I suspect those plugins manipulate the context history and KV cache, which confuses the model about how to think, how to use tools, how to transition between states, etc. I would recommend not using any such plugin with any kind of model. It's akin to us humans suddenly having thoughts and memories disappear throughout the day as we are working...
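The cache side of this can be sketched as follows. A prefix cache can reuse only the part of the new prompt that is identical to the previous one, so editing an early message throws away everything after the edit. This is a simplified illustration: `render` and `shared_prefix_len` are hypothetical helpers, and characters stand in for tokens.

```python
import os


def render(messages):
    """Flatten a message list into a ChatML-style prompt string (illustrative)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


def shared_prefix_len(old_prompt: str, new_prompt: str) -> int:
    """Length of the leading part both prompts have in common."""
    return len(os.path.commonprefix([old_prompt, new_prompt]))


history = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "Refactor the input module."},
    {"role": "assistant", "content": "I will start by outlining src/features/input."},
]

# Normal operation: a new turn is appended, so the old prompt is reused in full.
appended = history + [{"role": "user", "content": "Go ahead."}]

# A context-pruning plugin rewrites an early message before the next request.
pruned = [dict(m) for m in appended]
pruned[1]["content"] = "[pruned]"

old = render(history)
print(shared_prefix_len(old, render(appended)) == len(old))
print(shared_prefix_len(old, render(pruned)) < len(render(history[:2])))
```

In the appended case the whole previous prompt is shared. In the pruned case the shared part ends inside the second message, so everything from that point on has to be processed again, and the model sees a history that differs from the one it produced its earlier turns against.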

slepkaviba

May 17

Yea...

They manipulate messages a lot.

But that's "a" way to manage context

0x4tomic

May 17

I'd definitely pin that on OpenCode - I've had decent success with version 19. https://github.com/NousResearch/hermes-agent/issues/27339 is one such related issue (but for hermes-agent as opposed to OpenCode), so that's why I'm leaning towards the issue being the harness. Everyone is still learning the quirks and the "dos and don'ts".

I was using either v15 or v18 before with Hermes-Agent 27b (Opus 4.7 distilled) and it was losing its train of thought and getting into reasoning loops. I've been working through the kanban and it has been mostly reliable - no evidence of bad tool calls and it seems to behave well. Keep an eye on your harness's issue trackers!

slepkaviba

May 17

That might be. There are so many moving parts in this world..

mcr-ksh

May 18

Claude Code is still unusable as of v19. Continuous breakups and stalls.

slepkaviba

May 18

In opencode, I switched from the openai provider to the anthropic provider... And with magic context it's stable. It still bleeds with dcp.

froggeric

Owner May 18

It works perfectly for me in Claude Code, using llama-server as an anthropic provider. I am using the Qwen 3.6 27b F16 gguf I published, with the chat template v19.

slepkaviba

May 18

I think the anthropic provider is important.

It became very stable (minus the dcp injections).

mcr-ksh

May 18

•

edited May 18

I'm using vLLM. I tried with and without on cyburn/Qwopus3.6-35B-A3B-v1-PrismaSCOUT-Blackwell-NVFP4-BF16-vllm-4.75bits with MTP and 256k context.
--default-chat-template-kwargs '{"enable_thinking": false, "auto_disable_thinking_with_tools": true, "max_tool_response_chars": 8192}'
While looking for a solution to my hangs, I came across the default-chat-template-kwargs setting.

This is just an example of my hangs:

the comes up. However,

UPDATED: another random stop.

ManyOtherFunctions

May 18

I'm actually getting a weird model stop in Claude Code on ls (running this on Windows in PowerShell).
It happened both with native PowerShell and with the default ls.

And I'm not even sure if it fixes the thing where it reads the contents of a PE / binary file and then permacrashes because its thinking contains Unicode tags... Still running that and hoping for the best.

froggeric

Owner May 18

Please post issues in separate threads.

froggeric

Owner May 18

I strongly recommend leaving preserve_thinking set to true (the default in this template) and leaving thinking on (the default as well). The models perform and reason a lot better.
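For setups that pass template options on the command line, like the vLLM `--default-chat-template-kwargs` flag quoted earlier in this thread, the recommended values could be built as in the sketch below. It rests on assumptions: that the template in use reads the keys `enable_thinking` and `preserve_thinking` under exactly these names, and that the server forwards them unchanged.

```python
import json
import shlex

# Recommended settings from the post above; key names are assumptions about
# what the chat template in use actually reads.
template_kwargs = {"enable_thinking": True, "preserve_thinking": True}

# json.dumps produces valid JSON (lowercase true/false), and shlex.quote keeps
# the braces and double quotes intact when the value goes through a POSIX shell.
flag = "--default-chat-template-kwargs " + shlex.quote(json.dumps(template_kwargs))
print(flag)
```

Both values are described above as the template defaults, so passing them explicitly mainly matters when an earlier configuration turned them off.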

mcr-ksh

May 19

I've enabled preserve_thinking and thinking. The result is the same: breakups.

slepkaviba

May 19

I also have them on.

hampsonw

May 20

It seems like people don't realize how many contradictory instructions harnesses like hermes/claude-code/opencode inject, and then they get surprised when they get bad results. If you're using a model and want to understand how much things like the chat template affect it, you really should just be using the pi coding agent. No hidden instructions get injected outside your control, so it's much more reproducible and scientific than all these random complaints that x, y, z doesn't work in my harness.

theliphant

May 20

fwiw I am having these issues occasionally in a pretty barebones PI actually :D

mcr-ksh

May 21

switching back to your template as of today:

I found another one while digging around, and I can only confirm that yours breaks up my system while the other one, "barubary", works great for me.
Maybe it is insightful to see what is done differently here.

{%- set template_version = "qwen3.5-barubary-attuned-v1" %}

{#- ── Config with safe defaults ─────────────────────────────── #}
{%- set image_count = namespace(value=0) %}
{%- set video_count = namespace(value=0) %}
{%- set add_vision_id = add_vision_id if add_vision_id is defined else false %}
{%- set enable_thinking = enable_thinking if enable_thinking is defined else true %}
{%- set auto_disable_thinking_with_tools = auto_disable_thinking_with_tools if auto_disable_thinking_with_tools is defined else false %}
{%- set max_tool_arg_chars = max_tool_arg_chars if max_tool_arg_chars is defined else 0 %}
{%- set max_tool_response_chars = max_tool_response_chars if max_tool_response_chars is defined else 0 %}

{#- FIX21: precompute _has_tools ─────────────────────────────── #}
{%- set _has_tools = (tools is defined and tools and tools is iterable and tools is not mapping) %}

{#- FIX19: auto-disable thinking when tools active ─────────── #}
{%- set _thinking = enable_thinking %}
{%- if auto_disable_thinking_with_tools and _has_tools %}
    {%- set _thinking = false %}
{%- endif %}

{#- ── render_content macro ──────────────────────────────────── #}
{%- macro render_content(content, do_vision_count, is_system_content=false) %}
    {%- if content is string %}
        {{- content }}
    {%- elif content is iterable and content is not mapping %}
        {%- for item in content %}
            {%- if item is mapping %}
                {%- if item.type == 'image' or 'image' in item or 'image_url' in item %}
                    {%- if is_system_content %}
                        {{- raise_exception('System message cannot contain images.') }}
                    {%- endif %}
                    {%- if do_vision_count %}
                        {%- set image_count.value = image_count.value + 1 %}
                    {%- endif %}
                    {%- if add_vision_id %}
                        {{- 'Picture ' ~ image_count.value ~ ': ' }}
                    {%- endif %}
                    {{- '<|vision_start|><|image_pad|><|vision_end|>' }}
                {%- elif item.type == 'video' or 'video' in item %}
                    {%- if is_system_content %}
                        {{- raise_exception('System message cannot contain videos.') }}
                    {%- endif %}
                    {%- if do_vision_count %}
                        {%- set video_count.value = video_count.value + 1 %}
                    {%- endif %}
                    {%- if add_vision_id %}
                        {{- 'Video ' ~ video_count.value ~ ': ' }}
                    {%- endif %}
                    {{- '<|vision_start|><|video_pad|><|vision_end|>' }}
                {%- elif 'text' in item %}
                    {{- item.text }}
                {%- else %}
                    {{- raise_exception('Unexpected item type in content.') }}
                {%- endif %}
            {%- else %}
                {{- item | string }}
            {%- endif %}
        {%- endfor %}
    {%- elif content is none or content is undefined %}
        {{- '' }}
    {%- else %}
        {{- raise_exception('Unexpected content type.') }}
    {%- endif %}
{%- endmacro %}

{#- ── Guard ──────────────────────────────────────────────────── #}
{%- if not messages %}
    {{- raise_exception('No messages provided.') }}
{%- endif %}

{#- ── Extract leading system/developer message ────────────────── #}
{%- set _first_role = messages[0].role %}
{%- if _first_role == 'system' or _first_role == 'developer' %}
    {%- set _sys_msg = messages[0] %}
    {%- set _msgs = messages[1:] %}
{%- else %}
    {%- set _sys_msg = none %}
    {%- set _msgs = messages %}
{%- endif %}

{#- ── Tools block OR plain system block ─────────────────────── #}
{%- if _has_tools %}
    {{- '<|im_start|>system\n' }}
    {{- '# Tools\n\nYou have access to the following functions:\n\n<tools>' }}
    {%- for tool in tools %}
        {{- '\n' }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- '\n</tools>' }}
    {{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n' }}
    {{- '<tool_call>\n<function=example_function_name>\n' }}
    {{- '<parameter=example_parameter_1>\nvalue_1\n</parameter>\n' }}
    {{- '<parameter=example_parameter_2>\nThis is the value for the second parameter\n' }}
    {{- 'that can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>' }}
    {{- '\n\n<IMPORTANT>\nReminder:\n' }}
    {{- '- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n' }}
    {{- '- Required parameters MUST be specified\n' }}
    {{- '- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n' }}
    {{- '- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n' }}
    {{- '- When making multiple function calls, emit each <tool_call> block separated by a blank line\n' }}
    {{- '</IMPORTANT>' }}
    {%- if _sys_msg is not none %}
        {%- set _sc = render_content(_sys_msg.content, false, true) | trim %}
        {%- if _sc %}
            {{- '\n\n' + _sc }}
        {%- endif %}
    {%- endif %}
    {{- '<|im_end|>\n' }}
{%- else %}
    {%- if _sys_msg is not none %}
        {%- set _sc = render_content(_sys_msg.content, false, true) | trim %}
        {{- '<|im_start|>system\n' + _sc + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}

{#- ── Compute last real-user-query index ────────────────────── #}
{#- FIX2 #}
{%- set _last_idx = _msgs | length - 1 %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=_last_idx) %}
{%- for message in _msgs[::-1] %}
    {%- set index = (_msgs | length - 1) - loop.index0 %}
    {%- if ns.multi_step_tool and message.role == 'user' %}
        {%- set _rc = render_content(message.content, false) | trim %}
        {%- if not (_rc.startswith('<tool_response>') and _rc.endswith('</tool_response>')) %}
            {%- set ns.multi_step_tool = false %}
            {%- set ns.last_query_index = index %}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{#- FIX17: deep agent loop fallback ────────────────────────── #}
{%- if ns.multi_step_tool %}
    {%- if _last_idx > 50 %}
        {%- set ns.last_query_index = _last_idx %}
    {%- else %}
        {%- set ns.last_query_index = 0 %}
    {%- endif %}
{%- endif %}

{#- ── Main render loop ──────────────────────────────────────── #}
{%- set ns2 = namespace(prev_role='') %}
{%- for message in _msgs %}
    {%- set content = render_content(message.content, true) | trim %}

    {%- if message.role == 'user' %}
        {{- '<|im_start|>user\n' + content + '<|im_end|>\n' }}

    {%- elif message.role == 'assistant' %}
        {#- ── Extract reasoning_content ── #}
        {%- set reasoning_content = '' %}
        {%- if message.reasoning_content is defined and message.reasoning_content is not none %}
            {%- if message.reasoning_content is string %}
                {%- set reasoning_content = message.reasoning_content %}
            {%- else %}
                {%- set reasoning_content = message.reasoning_content | string %}
            {%- endif %}
        {%- else %}
            {%- if '</think>' in content %}
                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
            {%- endif %}
        {%- endif %}
        {%- set reasoning_content = reasoning_content | trim %}

        {#- ── Render assistant turn ── #}
        {%- if loop.index0 > ns.last_query_index %}
            {%- if _thinking %}
                {{- '<|im_start|>assistant\n<think>\n' + reasoning_content + '\n</think>\n\n' + content }}
            {%- else %}
                {{- '<|im_start|>assistant\n<think>\n\n</think>\n\n' + content }}
            {%- endif %}
        {%- else %}
            {{- '<|im_start|>assistant\n' + content }}
        {%- endif %}

        {#- ── Render tool_calls ── #}
        {%- if message.tool_calls is defined and message.tool_calls
               and message.tool_calls is iterable
               and message.tool_calls is not mapping %}
            {%- for tool_call in message.tool_calls %}
                {#- FIX10 #}
                {%- if tool_call.function is defined and tool_call.function is not none %}
                    {%- set tc = tool_call.function %}
                {%- else %}
                    {%- set tc = tool_call %}
                {%- endif %}

                {#- FIX15+FIX18 #}
                {%- if loop.first %}
                    {%- if content | trim %}
                        {{- '\n\n<tool_call>\n<function=' + tc.name + '>\n' }}
                    {%- else %}
                        {{- '<tool_call>\n<function=' + tc.name + '>\n' }}
                    {%- endif %}
                {%- else %}
                    {{- '\n\n<tool_call>\n<function=' + tc.name + '>\n' }}
                {%- endif %}

                {#- ── Render arguments ── #}
                {%- if tc.arguments is defined and tc.arguments is not none %}
                    {%- if tc.arguments is mapping %}
                        {#- FIX6 #}
                        {%- for args_name, args_value in tc.arguments.items() %}
                            {{- '<parameter=' + args_name + '>\n' }}
                            {#- FIX7+FIX8 #}
                            {%- if args_value is mapping or (args_value is sequence and args_value is not string) %}
                                {%- set _av = args_value | tojson %}
                            {%- else %}
                                {%- set _av = args_value | string %}
                            {%- endif %}
                            {#- FIX16 #}
                            {%- if max_tool_arg_chars > 0 and _av | length > max_tool_arg_chars %}
                                {{- _av[:max_tool_arg_chars] + '\n[TRUNCATED — original length ' ~ (_av | length | string) ~ ' chars]' }}
                            {%- else %}
                                {{- _av }}
                            {%- endif %}
                            {{- '\n</parameter>\n' }}
                        {%- endfor %}
                    {%- elif tc.arguments is string and tc.arguments %}
                        {#- FIX9 #}
                        {{- tc.arguments }}
                    {%- endif %}
                {%- endif %}
                {%- if loop.last %}
                    {{- '</function>\n</tool_call>\n' }}
                {%- else %}
                    {{- '</function>\n</tool_call>' }}
                {%- endif %}
            {%- endfor %}
        {%- endif %}
        {{- '<|im_end|>\n' }}

    {%- elif message.role == 'tool' %}
        {#- FIX11 #}
        {%- if ns2.prev_role != 'tool' %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {#- FIX16 #}
        {%- if max_tool_response_chars > 0 and content | length > max_tool_response_chars %}
            {%- set content = content[:max_tool_response_chars] + '\n[TRUNCATED — original length ' ~ (content | length | string) ~ ' chars]' %}
        {%- endif %}
        {{- '\n<tool_response>\n' + content + '\n</tool_response>' }}
        {#- FIX11 #}
        {%- if loop.last %}
            {{- '<|im_end|>\n' }}
        {%- else %}
            {%- set _next_role = _msgs[loop.index0 + 1].role %}
            {%- if _next_role != 'tool' %}
                {{- '<|im_end|>\n' }}
            {%- endif %}
        {%- endif %}

    {%- elif message.role == 'system' or message.role == 'developer' %}
        {#- FIX4: skip — already rendered above #}

    {%- else %}
        {#- FIX20 #}
        {{- '<|im_start|>user\n[' + message.role + ']: ' + content + '<|im_end|>\n' }}
    {%- endif %}

    {%- set ns2.prev_role = message.role %}
{%- endfor %}

{#- ── Generation prompt ──────────────────────────────────────── #}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
    {%- if not _thinking %}
        {{- '<think>\n\n</think>\n\n' }}
    {%- else %}
        {{- '<think>\n' }}
    {%- endif %}
{%- endif %}
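One difference that stands out in the template above is how it treats reasoning in earlier assistant turns. The Python sketch below mirrors only that part (the last-query index and the assistant branch of the render loop). It is a simplified reading, not a full port: tool_calls rendering and content lists are left out, message content is assumed to be a plain string, and the function names are hypothetical.

```python
def last_query_index(msgs):
    """Index of the last user message that is not a wrapped tool response."""
    last_idx = len(msgs) - 1
    for index in range(last_idx, -1, -1):
        msg = msgs[index]
        if msg["role"] == "user":
            text = msg["content"].strip()
            wrapped = text.startswith("<tool_response>") and text.endswith("</tool_response>")
            if not wrapped:
                return index
    # FIX17 fallback: no real user query was found.
    return last_idx if last_idx > 50 else 0


def render_assistant(index, msg, lqi, thinking):
    """Assistant branch of the render loop, without tool_calls."""
    head = "<|im_start|>assistant\n"
    if index > lqi:
        # After the last real query: reasoning is kept, or an empty think
        # block is emitted when thinking is off.
        reasoning = msg.get("reasoning_content", "").strip() if thinking else ""
        return head + "<think>\n" + reasoning + "\n</think>\n\n" + msg["content"] + "<|im_end|>\n"
    # At or before the last real query: no think block at all.
    return head + msg["content"] + "<|im_end|>\n"


msgs = [
    {"role": "user", "content": "List the files."},
    {"role": "assistant", "content": "", "reasoning_content": "Need ls."},
    {"role": "tool", "content": "a.txt"},
    {"role": "user", "content": "Now open a.txt."},
    {"role": "assistant", "content": "Opening it.", "reasoning_content": "Use read."},
]
lqi = last_query_index(msgs)
print(lqi)
print(render_assistant(1, msgs[1], lqi, thinking=True))
print(render_assistant(4, msgs[4], lqi, thinking=True))
```

Under this reading, the barubary template drops reasoning from every assistant turn before the last real user query and keeps it only for turns after it. That is a different policy from the preserve_thinking default described at the top of this thread. Whether this is what makes it behave better in this particular vLLM setup is not something the sketch can show.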

Interpause

May 25

If you choose to call a tool, you MUST output the <tool_call> block IMMEDIATELY after closing </think>. Do NOT output any conversational text before the tool call.

Pardon my ignorance, but why is it bad for the agent to converse before a tool call? I personally like it when the agent explains what it's about to do before doing it. It's much more succinct than reading the thinking block.

froggeric

Owner Jun 5

Obsoleted by the v20 release.

froggeric changed discussion status to closed Jun 5
