Hunterx
/

MinimaxM2.7FixedTemplate

Model card Files Files and versions

xet

Community

Hunterx commited on May 6

Commit

4cb7dc5

verified ·

1 Parent(s): 2e77311

Update README.md

Browse files

Files changed (1) hide show

README.md +88 -3

README.md CHANGED Viewed

@@ -1,3 +1,88 @@
----
-license: mit
----

+# MiniMax M2.7 — Fixed Chat Template
+A community-fixed Jinja2 chat template for **MiniMax M2.7** (all quant formats — GGUF, MLX, etc.) that resolves tool calling failures, thinking block corruption, and improves multi-turn agent reliability.
+Drop-in replacement for the default `tokenizer_config.json` chat template.
+## Why This Fix?
+The stock MiniMax M2.7 chat template has several issues that cause tool calling to silently break, especially in agentic coding workflows (Kilo Code, Continue, OpenClaw, etc.). These failures don't produce errors — the model just generates malformed output that the harness can't parse.
+This fixed template resolves all known issues while maintaining full backward compatibility with the original MiniMax M2.7 special token format.
+## Changes from Stock Template
+### Critical Fixes
+| Issue | Stock Behavior | Fixed Behavior |
+|---|---|---|
+| **Unclosed `<think>` before tool calls** | Model reasoning leaks into `<minimax:tool_call>` XML, corrupting structured output | Auto-detects unclosed `<think>` tags and inserts `</think>` before the first tool call |
+| **No tool calling instructions** | System prompt only shows format example, no behavioral rules | Injects strict rules: close thinking first, reasoning before not after tool calls, include all required params, preserve user strings verbatim |
+| **String-form tool arguments** | Assumes `arguments` is always a dict — breaks with `TypeError` when model outputs JSON string | New `render_tool_arguments` macro parses string-form JSON into proper key-value parameters |
+| **Only `</think>` recognized** | `</thinking>` variant is silently ignored, leaving reasoning content in the response body | Both `</think>` and `</thinking>` are recognized as valid close tags |
+### Quality of Life Improvements
+| Feature | Description |
+|---|---|
+| **Think toggles** | `<\|think_on\|>` / `<\|think_off\|>` in system, user, or assistant messages to control thinking per-message |
+| **Developer role** | Supports `developer` role as first message (same as `system` but distinct for frameworks that use it) |
+| **Historical reasoning hidden** | Previous turns' `<think>` blocks are stripped from context by default, saving tokens. Only the current generation's thinking is shown. Set `preserve_thinking=true` to show all. |
+| **LM Studio compatible** | Uses `is iterable` instead of `is sequence` for compatibility with LM Studio's limited Jinja runtime |
+| **Better error messages** | Clearer exceptions for unexpected argument types and role ordering |
+### Preserved from Original
+- MiniMax special tokens: `]~!b[`, `]~b]`, `[e~[` (BOS, role start, role end)
+- `<minimax:tool_call>` / `</minimax:tool_call>` tool call delimiters
+- `<invoke>` / `<parameter>` XML structure for tool calls
+- `<response>` wrapping for tool results with multi-response grouping
+- Default model identity string
+- `current_date` and `current_location` system message fields
+- `ensure_ascii=False` for non-ASCII safe JSON rendering
+## Installation
+### Option 1: Replace in tokenizer_config.json
+Copy the contents of `minimax_m2.7_fixed_template.jinja` into the `chat_template` field of your model's `tokenizer_config.json`.
+### Option 2: LM Studio
+Go to **My Models → model settings → Prompt Template** and paste the template contents.
+### Option 3: llama.cpp / llama-server
+```bash
+llama-server -m your-model.gguf --chat-template-file minimax_m2.7_fixed_template.jinja
+```
+### Option 4: ik_llama.cpp
+```bash
+ik_llama_server -m your-model.gguf --chat-template-file minimax_m2.7_fixed_template.jinja
+```
+## Tested With
+- **MiniMax M2.7 GGUF Q8** (Unsloth) via llama.cpp / ik_llama.cpp on M3 Ultra 512GB — 21 t/s generation, clean tool calling
+- **Kilo Code** (VS Code) — repeated code edits with no tool call corruption
+- **LM Studio** — template loads and renders without errors
+- **Hermes agent framework** — multi-turn agentic workflows stable
+## Credits
+Template fixes by [Hunterx](https://huggingface.co/Hunterx), based on the community fix pattern established by:
+- [fakezeta's merged Qwen 3.6 template](https://gist.github.com/fakezeta/9e8e039c60332fcb143c6e805558afe0)
+- [allanchan339's vLLM Qwen chat template fix](https://github.com/allanchan339/vLLM-Qwen3-3.5-3.6-chat-template-fix)
+- [froggeric's Qwen Fixed Chat Templates](https://huggingface.co/froggeric/Qwen-Fixed-Chat-Templates)
+## Related
+- [MiniMax M2.7 official model](https://huggingface.co/MiniMax/MiniMax-M2.7)
+- [Unsloth GGUF quants](https://huggingface.co/unsloth/MiniMax-M2.7-GGUF)
+## License
+Same license as the original MiniMax M2.7 model.