Note that inf/NaN values are present in the **original model** during inference as well — both the quantized and original models produce NaN perplexity. This appears to be caused by numerically unstable expert weights that produce overflow during the forward pass, not by the quantizer itself. The same layer (`blk.61.ffn_down_exps`) [has been identified](https://www.reddit.com/r/LocalLLaMA/comments/1slk4di/) as causing NaN perplexity across GGUF quantizations by multiple providers.
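To confirm this kind of instability yourself, a minimal sketch of a non-finite-value check over a dequantized weight tensor (assuming you have it loaded as a NumPy array; the layer name and sample values here are illustrative):

```python
import numpy as np

def check_finite(name, tensor):
    """Report how many entries of a weight tensor are inf or NaN."""
    bad = tensor.size - np.count_nonzero(np.isfinite(tensor))
    if bad:
        print(f"{name}: {bad} non-finite value(s)")
    return bad == 0

# Synthetic example with an overflow artifact in one weight:
w = np.array([0.5, -1.2, np.inf, 3.0], dtype=np.float32)
check_finite("blk.61.ffn_down_exps", w)  # flags one non-finite value
```

Running this over each tensor in the unquantized checkpoint is one way to verify that the bad values originate in the model weights rather than in the quantizer.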
</details>
### Setup tool calling for TabbyAPI
Add `tool_format: minimax_m2` to your `config.yml` or per-model `tabby_config.yml`. Also enable `reasoning: true` to properly separate thinking blocks from output:
```yaml
model:
  tool_format: minimax_m2
  reasoning: true
```