NeuroSenko committed on
Commit 747f2b9 · verified · 1 Parent(s): 9272d3d

Update README.md

Files changed (1):
  1. README.md +10 -0
README.md CHANGED
@@ -38,3 +38,13 @@ This causes Cholesky decomposition to fail during quantization, as the Hessian m
 
  Note that inf/NaN values are present in the **original model** during inference as well — both the quantized and original models produce NaN perplexity. This appears to be caused by numerically unstable expert weights that overflow during the forward pass, not by the quantizer itself. The same layer (`blk.61.ffn_down_exps`) [has been identified](https://www.reddit.com/r/LocalLLaMA/comments/1slk4di/) as causing NaN perplexity across GGUF quantizations by multiple providers.
  </details>
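The failure mode described above can be sketched in a few lines of NumPy: a single overflowed activation poisons a Hessian of the form H = XᵀX, after which Cholesky decomposition cannot produce a finite factor. The 2×2 matrix below is illustrative only, not the quantizer's actual code:

```python
import numpy as np

# Illustrative activation matrix with one overflowed (inf) entry;
# GPTQ-style quantizers accumulate a Hessian of the form H = X^T X.
X = np.array([[1.0, 2.0],
              [np.inf, 1.0]])

H = X.T @ X  # the single inf poisons the accumulated Hessian
print("Hessian finite?", bool(np.isfinite(H).all()))  # -> False

# Cholesky either raises or silently returns non-finite factors,
# depending on the BLAS/LAPACK build.
try:
    L = np.linalg.cholesky(H)
    print("factor finite?", bool(np.isfinite(L).all()))
except np.linalg.LinAlgError as err:
    print("Cholesky failed:", err)
```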
+
+ ### Setup tool calling for TabbyAPI
+
+ Add `tool_format: minimax_m2` to your `config.yml` or per-model `tabby_config.yml`. Also enable `reasoning: true` to properly separate thinking blocks from the output:
+
+ ```yaml
+ model:
+   tool_format: minimax_m2
+   reasoning: true
+ ```
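With the config above in place, tool calls go through TabbyAPI's OpenAI-compatible `/v1/chat/completions` endpoint. A hedged sketch of a request body follows; the tool schema, model name, and endpoint details are placeholders for your own deployment, not values mandated by TabbyAPI:

```python
import json

# Illustrative tool-calling request body in the OpenAI chat-completions
# shape. The get_weather tool is hypothetical.
payload = {
    "model": "MiniMax-M2",
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(payload, indent=2))
# POST this JSON to your TabbyAPI host's /v1/chat/completions with your
# API key; with tool_format: minimax_m2 set, the model's tool invocations
# come back as structured tool_calls rather than raw text.
```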