# MiniMax-M2.5 4-bit MLX
This is a 4-bit quantized MLX version of MiniMaxAI/MiniMax-M2.5, converted using mlx-lm v0.29.1.
MiniMax-M2.5 is a 229B-parameter Mixture-of-Experts model (10B parameters active per token) that scores 80.2% on SWE-Bench Verified and reports state-of-the-art results in coding, agentic tool use, and search tasks.
## Requirements
- Apple Silicon Mac (M3 Ultra or later recommended)
- At least 256GB of unified memory
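To confirm a machine meets the memory requirement before downloading the weights, total physical memory can be checked from Python. This is a minimal sketch using POSIX `sysconf`, which works on macOS (and Linux):

```python
import os

# Total physical memory in GiB, via POSIX sysconf.
page_size = os.sysconf("SC_PAGE_SIZE")
num_pages = os.sysconf("SC_PHYS_PAGES")
total_gib = page_size * num_pages / 2**30

print(f"Unified memory: {total_gib:.0f} GiB")
if total_gib < 256:
    print("Warning: below the recommended 256 GiB for this model")
```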
## Quick Start
Install mlx-lm:
```bash
pip install -U mlx-lm
```
### CLI
```bash
mlx_lm.generate \
  --model ahoybrotherbear/MiniMax-M2.5-4bit-MLX \
  --prompt "Hello, how are you?" \
  --max-tokens 256 \
  --temp 0.7
```
### Python
```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("ahoybrotherbear/MiniMax-M2.5-4bit-MLX")

messages = [{"role": "user", "content": "Hello, how are you?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Recent mlx-lm versions take sampling parameters via a sampler
# object rather than a bare `temp` keyword argument.
response = generate(
    model, tokenizer,
    prompt=prompt,
    max_tokens=256,
    sampler=make_sampler(temp=0.7),
    verbose=True,
)
print(response)
```
## Conversion Details
- Source model: MiniMaxAI/MiniMax-M2.5 (FP8)
- Converted with: mlx-lm v0.29.1
- Quantization: 4-bit
- Original parameters: 229B total / 10B active (MoE)
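As a rough sanity check on the 256GB recommendation, the quantized weights alone occupy on the order of 120 GiB. This is a back-of-envelope sketch assuming mlx-lm's default quantization settings (group size 64 with fp16 scales and biases); KV cache and activations come on top:

```python
# 4-bit weights plus one fp16 scale and one fp16 bias per group of
# 64 weights adds 32 bits per 64 params = 0.5 extra bits/param.
total_params = 229e9
bits_per_param = 4 + 32 / 64  # effective 4.5 bits/param

est_gib = total_params * bits_per_param / 8 / 2**30
print(f"Estimated 4-bit weight footprint: {est_gib:.0f} GiB")  # ~120 GiB
```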
## Original Model
MiniMax-M2.5 was created by MiniMaxAI. See the original model card for full details on capabilities, benchmarks, and license terms.