Hunterx commited on
Commit
4cb7dc5
·
verified ·
1 Parent(s): 2e77311

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +88 -3
README.md CHANGED
@@ -1,3 +1,88 @@
1
- ---
2
- license: mit
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # MiniMax M2.7 — Fixed Chat Template
2
+
3
+ A community-fixed Jinja2 chat template for **MiniMax M2.7** (all quant formats — GGUF, MLX, etc.) that resolves tool calling failures, thinking block corruption, and improves multi-turn agent reliability.
4
+
5
+ Drop-in replacement for the default `tokenizer_config.json` chat template.
6
+
7
+ ## Why This Fix?
8
+
9
+ The stock MiniMax M2.7 chat template has several issues that cause tool calling to silently break, especially in agentic coding workflows (Kilo Code, Continue, OpenClaw, etc.). These failures don't produce errors — the model just generates malformed output that the harness can't parse.
10
+
11
+ This fixed template resolves all known issues while maintaining full backward compatibility with the original MiniMax M2.7 special token format.
12
+
13
+ ## Changes from Stock Template
14
+
15
+ ### Critical Fixes
16
+
17
+ | Issue | Stock Behavior | Fixed Behavior |
18
+ |---|---|---|
19
+ | **Unclosed `<think>` before tool calls** | Model reasoning leaks into `<minimax:tool_call>` XML, corrupting structured output | Auto-detects unclosed `<think>` tags and inserts `</think>` before the first tool call |
20
+ | **No tool calling instructions** | System prompt only shows format example, no behavioral rules | Injects strict rules: close thinking first, reasoning before not after tool calls, include all required params, preserve user strings verbatim |
21
+ | **String-form tool arguments** | Assumes `arguments` is always a dict — breaks with `TypeError` when model outputs JSON string | New `render_tool_arguments` macro parses string-form JSON into proper key-value parameters |
22
+ | **Only `</think>` recognized** | `</thinking>` variant is silently ignored, leaving reasoning content in the response body | Both `</think>` and `</thinking>` are recognized as valid close tags |
23
+
24
+ ### Quality of Life Improvements
25
+
26
+ | Feature | Description |
27
+ |---|---|
28
+ | **Think toggles** | `<\|think_on\|>` / `<\|think_off\|>` in system, user, or assistant messages to control thinking per-message |
29
+ | **Developer role** | Supports `developer` role as first message (same as `system` but distinct for frameworks that use it) |
30
+ | **Historical reasoning hidden** | Previous turns' `<think>` blocks are stripped from context by default, saving tokens. Only the current generation's thinking is shown. Set `preserve_thinking=true` to show all. |
31
+ | **LM Studio compatible** | Uses `is iterable` instead of `is sequence` for compatibility with LM Studio's limited Jinja runtime |
32
+ | **Better error messages** | Clearer exceptions for unexpected argument types and role ordering |
33
+
34
+ ### Preserved from Original
35
+
36
+ - MiniMax special tokens: `]~!b[`, `]~b]`, `[e~[` (BOS, role start, role end)
37
+ - `<minimax:tool_call>` / `</minimax:tool_call>` tool call delimiters
38
+ - `<invoke>` / `<parameter>` XML structure for tool calls
39
+ - `<response>` wrapping for tool results with multi-response grouping
40
+ - Default model identity string
41
+ - `current_date` and `current_location` system message fields
42
+ - `ensure_ascii=False` for non-ASCII safe JSON rendering
43
+
44
+ ## Installation
45
+
46
+ ### Option 1: Replace in tokenizer_config.json
47
+
48
+ Copy the contents of `minimax_m2.7_fixed_template.jinja` into the `chat_template` field of your model's `tokenizer_config.json`.
49
+
50
+ ### Option 2: LM Studio
51
+
52
+ Go to **My Models → model settings → Prompt Template** and paste the template contents.
53
+
54
+ ### Option 3: llama.cpp / llama-server
55
+
56
+ ```bash
57
+ llama-server -m your-model.gguf --chat-template-file minimax_m2.7_fixed_template.jinja
58
+ ```
59
+
60
+ ### Option 4: ik_llama.cpp
61
+
62
+ ```bash
63
+ ik_llama_server -m your-model.gguf --chat-template-file minimax_m2.7_fixed_template.jinja
64
+ ```
65
+
66
+ ## Tested With
67
+
68
+ - **MiniMax M2.7 GGUF Q8** (Unsloth) via llama.cpp / ik_llama.cpp on M3 Ultra 512GB — 21 t/s generation, clean tool calling
69
+ - **Kilo Code** (VS Code) — repeated code edits with no tool call corruption
70
+ - **LM Studio** — template loads and renders without errors
71
+ - **Hermes agent framework** — multi-turn agentic workflows stable
72
+
73
+ ## Credits
74
+
75
+ Template fixes by [Hunterx](https://huggingface.co/Hunterx), based on the community fix pattern established by:
76
+
77
+ - [fakezeta's merged Qwen 3.6 template](https://gist.github.com/fakezeta/9e8e039c60332fcb143c6e805558afe0)
78
+ - [allanchan339's vLLM Qwen chat template fix](https://github.com/allanchan339/vLLM-Qwen3-3.5-3.6-chat-template-fix)
79
+ - [froggeric's Qwen Fixed Chat Templates](https://huggingface.co/froggeric/Qwen-Fixed-Chat-Templates)
80
+
81
+ ## Related
82
+
83
+ - [MiniMax M2.7 official model](https://huggingface.co/MiniMax/MiniMax-M2.7)
84
+ - [Unsloth GGUF quants](https://huggingface.co/unsloth/MiniMax-M2.7-GGUF)
85
+
86
+ ## License
87
+
88
+ Same license as the original MiniMax M2.7 model.