# MiniMax M2.5 (SWAN, MLX)

Part of a collection of MINT & SWAN quantized versions of MiniMax-M2.5 (MLX & GGUF).
A mixed-precision quantized version of MiniMaxAI/MiniMax-M2.5 using SWAN. SWAN beats uniform 4-bit quantization on both axes: -1.9% WikiText-2 perplexity and -1.7% file size.
| Metric | Value |
|---|---|
| Size | 118 GB |
| Average bits | 3.77 |
| Framework | MLX |
| WikiText-2 PPL | 8.787 (mean) |
| Uniform 4-bit PPL | 8.957 |
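The headline improvement can be checked directly from the table above. A minimal sketch (perplexity values are from the table; the uniform 4-bit size is not listed, so only the PPL delta is computed here):

```python
# Relative PPL delta behind the "-1.9% PPL" claim, using the table values.
swan_ppl = 8.787     # SWAN mixed-precision, WikiText-2 (mean)
uniform_ppl = 8.957  # uniform 4-bit baseline

ppl_delta = (swan_ppl - uniform_ppl) / uniform_ppl
print(f"PPL change vs uniform 4-bit: {ppl_delta:+.1%}")  # → -1.9%
```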
Quality vs size trade-off from MINT MCKP allocator. ★ = optimal knee point.
| Budget | Actual Size | Avg Bits | Loss | Knee |
|---|---|---|---|---|
| 95 GB | 95.0 GB | 3.1 | 155.2783 | |
| 114 GB | 114.2 GB | 4.0 | 22.3760 | ★ |
| 133 GB | 133.4 GB | 4.5 | 14.9543 | |
| 153 GB | 152.6 GB | 5.2 | 10.2674 | |
| 172 GB | 171.8 GB | 6.0 | 5.5744 | |
| 191 GB | 191.1 GB | 6.8 | 3.1894 | |
| 210 GB | 210.3 GB | 7.6 | 1.9911 | |
| 230 GB | 229.5 GB | 8.1 | 1.0657 | |
| 249 GB | 247.8 GB | 8.9 | 0.9171 | |
| 268 GB | 267.8 GB | 9.7 | 0.8132 | |
| 287 GB | 286.8 GB | 10.4 | 0.7152 | |
| 306 GB | 305.8 GB | 11.2 | 0.6174 | |
| 326 GB | 324.7 GB | 11.9 | 0.5197 | |
| 345 GB | 343.7 GB | 12.7 | 0.4221 | |
| 364 GB | 363.8 GB | 13.5 | 0.3191 | |
| 383 GB | 382.8 GB | 14.3 | 0.2216 | |
| 402 GB | 401.7 GB | 15.0 | 0.1242 | |
| 422 GB | 420.7 GB | 15.8 | 0.0269 | |
| 441 GB | 426.0 GB | 16.0 | 0.0000 | |
| 460 GB | 426.0 GB | 16.0 | 0.0000 | |
Generated by MINT rate-distortion optimization.
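MCKP here means the multiple-choice knapsack problem: each tensor picks exactly one bit-width from a menu of options, minimizing total quantization loss under a size budget. MINT's actual allocator is not reproduced here; the sketch below uses a generic greedy upgrade strategy (best loss reduction per extra byte) with illustrative tensor names and numbers:

```python
import heapq

def allocate_bits(tensors, budget_bytes):
    """Greedy MCKP sketch: tensors maps name -> [(bits, size_bytes, loss), ...]
    with options sorted by ascending bits (so size grows with each upgrade).
    Start every tensor at its lowest bit-width, then repeatedly apply the
    upgrade with the best loss reduction per extra byte that still fits."""
    choice = {name: 0 for name in tensors}          # index of chosen option
    size = sum(opts[0][1] for opts in tensors.values())
    heap = []

    def push(name):
        # Queue the next possible upgrade for this tensor, ranked by
        # (loss saved) / (bytes added); heapq is a min-heap, so negate.
        i = choice[name]
        opts = tensors[name]
        if i + 1 < len(opts):
            d_loss = opts[i][2] - opts[i + 1][2]
            d_size = opts[i + 1][1] - opts[i][1]
            heapq.heappush(heap, (-d_loss / d_size, name))

    for name in tensors:
        push(name)
    while heap:
        _, name = heapq.heappop(heap)
        i = choice[name]
        extra = tensors[name][i + 1][1] - tensors[name][i][1]
        if size + extra > budget_bytes:
            continue  # this upgrade no longer fits; skip it
        size += extra
        choice[name] = i + 1
        push(name)
    return {n: tensors[n][i][0] for n, i in choice.items()}, size

# Toy demo: two tensors, each with a 4-bit and an 8-bit option.
options = {
    "attn": [(4, 10, 5.0), (8, 20, 1.0)],
    "mlp":  [(4, 10, 2.0), (8, 20, 1.5)],
}
bits, used = allocate_bits(options, budget_bytes=40)
print(bits, used)  # both tensors fit at 8-bit within the 40-byte budget
```

A Lagrangian or dynamic-programming solver would find exact optima; the greedy form above is just the simplest way to see how a budget sweep like the table's Pareto curve arises.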
```python
from mlx_lm import load, generate

model, tokenizer = load("baa-ai/MiniMax-M2.5-SWAN-4bit-MLX")
response = generate(model, tokenizer, prompt="Hello!", max_tokens=256)
print(response)
```
SWAN uses data-free per-tensor sensitivity analysis with composite scoring to allocate bit-widths across model layers.
Quantized by baa.ai.

- Precision: 4-bit (mixed, 3.77 average bits)
- Base model: MiniMaxAI/MiniMax-M2.5