Why do most custom models stop in the middle, as if cut off too early?

#2304
by Unpredicted - opened

I tried the model mradermacher/gemma4-E4B-it-disinhibited-GGUF to create a plan from my prompt, but every time a tool call occurred, it would stop in the middle, as if the chat had been cut off. My image as an example:
Screenshot 2026-05-01 215230
So as you can see, right where the model should use mkdir, it just gets cut off...
I also tested it on llama.cpp in the pi-coding-agent CLI.

No idea what pi-coding-agent is, but try increasing max output tokens.

So far I'm already at 65k tokens (if I set it to the model's default of 131072 tokens, my VRAM dies, i.e. it generates tokens very slowly). But the problem occurs even below 5k... so I assume the token limit wasn't the cause; I suspect tool calling instead.

Possibly it emits an end token at the tool call, so you need to handle it somehow, e.g. auto-continue after the tool call or something.
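To illustrate the auto-continue idea above: a minimal sketch of a chat loop that, instead of treating the model's tool-call stop as the end of the turn, executes the tool and feeds the result back so generation continues. Everything here (`run_model`, `dispatch_tool`, the message and stop-reason format) is a hypothetical stand-in, not the actual pi-coding-agent or llama.cpp API.

```python
import json

def run_model(messages):
    # Stub standing in for a real chat-completion call (e.g. to a
    # llama.cpp server). It "stops" with a tool call the first time,
    # then finishes normally once it sees the tool result.
    if not any(m["role"] == "tool" for m in messages):
        return {"stop_reason": "tool_call",
                "tool_call": {"name": "mkdir", "arguments": {"path": "project"}}}
    return {"stop_reason": "end", "content": "Created the directory, plan continues."}

def dispatch_tool(call):
    # Execute the requested tool; here we just echo it back as JSON.
    return json.dumps({"ok": True, "tool": call["name"]})

def chat_with_autocontinue(prompt, max_rounds=5):
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_rounds):
        reply = run_model(messages)
        if reply["stop_reason"] != "tool_call":
            return reply["content"]  # normal end of turn: we're done
        # The model stopped at its tool-call end token; instead of
        # ending the chat there, run the tool and loop to continue.
        result = dispatch_tool(reply["tool_call"])
        messages.append({"role": "tool", "content": result})
    return "(gave up after too many tool rounds)"

print(chat_with_autocontinue("make a project folder"))
```

The key point is the loop: the tool-call end token is a signal to the *agent*, not the end of the conversation, so the agent has to re-invoke the model after appending the tool output.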

Do you by any chance know how? If you don't, I understand, ngl, but if you do, let me know 👍

Just ask some AI to help you with it; I'm not sure what you're using or how.

OK, thanks for the response btw 👍 good luck there
