LongCat-Next does not emit opening tag during tool calling
Hi,
I'm integrating LongCat-Next into mlx-lm (I also worked on the dgram implementation for LongCat-Flash-Lite) and noticed the model doesn't emit the <longcat_tool_call> opening token (id 42) when making tool calls after <longcat_assistant>.
Instead, greedy decoding produces {function-name} (the literal template placeholder), followed by the tool call body using <longcat_arg_value> / <longcat_arg_key> tags, then a correct </longcat_tool_call> closing tag.
Top-2 logits at the first generation position (with tools in the system prompt):
| Token | Logit |
|---|---|
| `{` (id 314) | ~21 |
| `<longcat_tool_call>` (id 42) | ~17 |
This is consistent across bf16 (full precision) and 4/6/8-bit quantizations, ruling out quantization artifacts. When I manually prefill <longcat_tool_call> into the prompt, the model produces well-formed output (func_name\n<longcat_arg_key>...).
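For reference, the top-2 numbers above come from sorting the raw logits at the first generation position. A self-contained sketch of that check (the `logits` vector here is a toy stand-in with the observed values, not real model output):

```python
# Hypothetical sketch: rank token ids by logit at a generation position.
# The toy vocabulary below mirrors the observed case: id 314 = "{",
# id 42 = "<longcat_tool_call>".

def top_k(logits, k=2):
    """Return (token_id, logit) pairs for the k highest logits."""
    return sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:k]

logits = [-10.0] * 1000   # stand-in for the model's logit vector
logits[314] = 21.0        # "{" wins greedy decoding
logits[42] = 17.0         # opening tag is only second

print(top_k(logits))  # → [(314, 21.0), (42, 17.0)]
```

The same check against LongCat-Flash-Lite's logits at that position is how the comparison below was made.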
For comparison, LongCat-Flash-Lite with the same chat template correctly emits <longcat_tool_call> as the clear top-1 prediction (logit ~29 vs `{` at ~23).
Could the tool-calling behavior have been affected by the multimodal fine-tuning?
Thanks again for this model and your hard work!
Hi @kernelpool ,
Thank you so much for your work on integrating LongCat-Next into mlx-lm!
Using the test example in the README.md (greedy decoding), the output looks normal:
output_text='<longcat_tool_call>func_add\n<longcat_arg_key>x1</longcat_arg_key>\n<longcat_arg_value>125679</longcat_arg_value>\n<longcat_arg_key>x2</longcat_arg_key>\n<longcat_arg_value>234519</longcat_arg_value>\n</longcat_tool_call>\n'
parsed_message={'role': 'assistant', 'content': '', 'tool_calls': [{'id': 'tool-call-7f10f23c-0f60-40d5-875c-d0ffabc57b18', 'type': 'function', 'function': {'name': 'func_add', 'arguments': {'x1': 125679, 'x2': 234519}}}]}
Could you provide your test cases? Let's take a deeper look together.