Tool Calling (jinja template) Issues
Just a heads up, LM Studio is not picking up this model's chat template correctly.
⎿ API Error: 400 {"error":{"message":"Error from provider(lmstudio,kat-dev@q4_k_m: 400): {\"error\":\"Error rendering prompt with jinja template: \\\"Unknown test:
sequence\\\".\\n\\nThis is usually an issue with the model's prompt template. If you are using a popular model, you can try to search the model under lmstudio-community, which will
have fixed prompt templates. If you cannot find one, you are welcome to post this issue to our discord or issue tracker on GitHub. Alternatively, if you know how to write jinja
templates, you can override the prompt template in My Models > model settings > Prompt Template.\"}Error: Error from provider(lmstudio,kat-dev@q4_k_m: 400): {\"error\":\"Error
rendering prompt with jinja template: \\\"Unknown test: sequence\\\".\\n\\nThis is usually an issue with the model's prompt template. If you are using a popular model, you can try
to search the model under lmstudio-community, which will have fixed prompt templates. If you cannot find one, you are welcome to post this issue to our discord or issue tracker on
GitHub. Alternatively, if you know how to write jinja templates, you can override the prompt template in My Models > model settings > Prompt Template.\"}\n at nt
(C:\\Users\\Marshall\\AppData\\Roaming\\npm\\node_modules\\@musistudio\\claude-code-router\\dist\\cli.js:77001:11)\n at l0
(C:\\Users\\Marshall\\AppData\\Roaming\\npm\\node_modules\\@musistudio\\claude-code-router\\dist\\cli.js:77059:11)\n at process.processTicksAndRejections
(node:internal/process/task_queues:105:5)\n at async a0
(C:\\Users\\Marshall\\AppData\\Roaming\\npm\\node_modules\\@musistudio\\claude-code-router\\dist\\cli.js:77026:96)","type":"api_error","code":"provider_response_error"}}
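For context, `sequence` is a built-in test in full Jinja2, but the minimal Jinja engines embedded in local runtimes do not always implement it. If you go the template-override route the error message suggests, a common workaround is rewriting the unsupported test into tests the engine does implement. The template fragment below is an assumption for illustration, not the model's actual template:

```python
# Hypothetical fragment of a chat template that trips minimal Jinja
# engines; the real template text in the model may differ.
template_fragment = "{% if content is sequence %}"

# Rewrite the unsupported `sequence` test into a pair of tests
# (`iterable`, `string`) that minimal engines more commonly support.
fixed = template_fragment.replace(
    "is sequence",
    "is iterable and content is not string",
)

print(fixed)  # {% if content is iterable and content is not string %}
```

Whether this particular rewrite suffices depends on which tests the runtime's Jinja engine actually ships; the lmstudio-community repacks mentioned below do this kind of fix upstream.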
Looks like a lack of support by LM Studio for this model. Have you tried a current llama.cpp version?
Tried it with llama.cpp compiled from source: I don't get the same error, but it does not parse OpenAI-type tool calls. Looking at the original template for the model, it appears to be a vLLM chat template for Qwen3 Coder.
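For reference, this is the kind of OpenAI-style request the server would need to answer with a structured `tool_calls` field; the port, model name, and tool schema here are assumptions for illustration:

```python
import json

# OpenAI-compatible chat completion request carrying a tool definition.
# Model name, endpoint, and the example tool are illustrative assumptions.
payload = {
    "model": "kat-dev",
    "messages": [{"role": "user", "content": "Read the file README.md"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read a file from disk",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    }],
    "tool_choice": "auto",
}

body = json.dumps(payload).encode()
# To actually send it (requires a running server on an assumed port):
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:8080/v1/chat/completions", data=body,
#     headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```

When tool-call parsing works, the response's `choices[0].message` carries a `tool_calls` array instead of plain text; here the model emitted its tool calls as raw text that the server never converted.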
You'll probably have to wait until LM Studio supports the template, then (although it's not clear what you mean by "original" template: there is only the one template; there aren't different versions for a specific model).
Hello, I see calls fail with Cline, Roo Code, and Kilo Code.
Is this the system prompt problem, or something else?
Right, by "original" I mean that I did not override the chat template for the model in LM Studio or in llama.cpp.
@akierum I'm pretty sure that Cline & Roo Code do not use OpenAI-type tool calls, but instead work by parsing the model output directly. Thus, it is likely a separate issue. Which quant are you running?
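To illustrate the difference: instead of relying on the server's `tool_calls` field, Cline-style clients scan the raw completion text for XML-like tool tags. The tag format below is a simplified assumption, not the exact schema those tools use:

```python
import re

# Simulated raw model output containing an XML-style tool invocation,
# the way Cline / Roo Code expect it embedded in plain text.
raw_output = (
    "I'll read that file.\n"
    "<read_file>\n<path>src/main.py</path>\n</read_file>"
)

# Pull the tool call out of the text; a real client also validates the
# tag name against its known tool set and handles malformed tags.
match = re.search(r"<read_file>\s*<path>(.*?)</path>\s*</read_file>",
                  raw_output, re.S)
tool_call = {"tool": "read_file", "path": match.group(1)} if match else None
print(tool_call)  # {'tool': 'read_file', 'path': 'src/main.py'}
```

Because this path never touches the server's template-driven tool-call parsing, a failure here points at the model's output or the runtime, not the Jinja template.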
@akierum I think your issue is related to KV cache quantization and/or flash attention. I see the same kinds of issues with Qwen3 Coder, Gemma 3, and GPT-OSS when enabling flash attention. I suspect an issue in llama.cpp v1.52, but I've noticed similar issues since v1.50 when using flash attention.
No, this happens whether it is on or off.
Hey folks, looks like the LM Studio community has resolved the template with this: https://huggingface.co/lmstudio-community/KAT-Dev-GGUF
Closing ticket.
Well, I tried the new DevQuasar Kwaipilot.KAT-Dev.Q8_0.gguf.
Now there are no more errors in Cline. Let's see if the output is worth it, as the previous version was way worse than Qwen3 Coder 30B.
Update:
LM Studio still errors out:
2025-10-08 23:50:30 [INFO]
[LM STUDIO SERVER] Running chat completion on conversation with 30 messages.
2025-10-08 23:50:30 [INFO]
[LM STUDIO SERVER] Streaming response...
2025-10-08 23:51:06 [ERROR]
The model has crashed without additional information. (Exit code: 18446744072635812000). Error Data: n/a, Additional Data: n/a
2025-10-08 23:51:07 [INFO]
[JIT] Requested model (kwaipilot.kat-dev) is not loaded. Loading "DevQuasar/Kwaipilot.KAT-Dev-GGUF/Kwaipilot.KAT-Dev.Q8_0.gguf" now...
2025-10-08 23:52:38 [INFO]
[LM STUDIO SERVER] Running chat completion on conversation with 30 messages.
2025-10-08 23:52:38 [INFO]
[LM STUDIO SERVER] Streaming response...
2025-10-08 23:56:07 [ERROR]
The model has crashed without additional information. (Exit code: 18446744072635812000). Error Data: n/a, Additional Data: n/a
2025-10-08 23:56:09 [INFO]
[JIT] Requested model (kwaipilot.kat-dev) is not loaded. Loading "DevQuasar/Kwaipilot.KAT-Dev-GGUF/Kwaipilot.KAT-Dev.Q8_0.gguf" now...
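As an aside, that enormous exit code is almost certainly a negative 32-bit Windows status value printed as an unsigned 64-bit integer and then rounded through a JavaScript double (hence the trailing zeros), so the exact NTSTATUS cannot be recovered. A quick sketch of the interpretation:

```python
# Reinterpret the reported unsigned 64-bit exit code as signed.
reported = 18446744072635812000
signed64 = reported - 2**64  # two's-complement view

# Low 32 bits in hex, the form Windows NTSTATUS codes are written in.
# Precision was lost upstream, so only the rough range is meaningful.
print(signed64)                    # -1073739616
print(hex(signed64 & 0xFFFFFFFF))  # 0xc00008a0
```

The value sits in the `0xC0000xxx` range Windows uses for fatal process errors, consistent with a hard crash in the runtime rather than a graceful exit.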
Jan.ai also errors out:
Invalid API Response: The provider returned an empty or unparsable response. This is a provider-side issue where the model failed to generate valid output or returned tool calls that Cline cannot process. Retrying the request may help resolve this issue.
API Request Failed ($0.0000)
502 Proxy request to model failed: error sending request for url (http://127.0.0.1:3643/chat/completions): error trying to connect: tcp connect error: No connection could be made because the target machine actively refused it. (os error 10061)
[22:04:56]
ERROR
Proxy request to model failed: error sending request for url (http://127.0.0.1:3643/chat/completions): error trying to connect: tcp connect error: No connection could be made because the target machine actively refused it. (os error 10061)
[22:04:57]
DEBUG
Handling POST request to /chat/completions requiring model lookup in body
[22:04:57]
DEBUG
Extracted model_id: Kwaipilot.KAT-Dev.Q8_0
[22:04:57]
DEBUG
Found session for model_id Kwaipilot.KAT-Dev.Q8_0
[22:04:57]
DEBUG
Adding session Authorization header
[22:04:57]
DEBUG
Sending buffered body (221820 bytes)
[22:04:59]
ERROR
Proxy request to model failed: error sending request for url (http://127.0.0.1:3643/chat/completions): error trying to connect: tcp connect error: No connection could be made because the target machine actively refused it. (os error 10061)