How to use from
Hermes Agent
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "shieldstackllc/MiniMax-M2.5-REAP-29-mlx-4bit"
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default shieldstackllc/MiniMax-M2.5-REAP-29-mlx-4bit
Run Hermes
hermes
Quick Links

vMLX

MiniMax-M2.5 REAP-29 — MLX 4-bit

MLX 4-bit quantized version of Akicou/MiniMax-M2-5-REAP-29 for efficient local inference on Apple Silicon.

  • Quantization: 4-bit (group size 64, affine mode; router gates at 8-bit)
  • Architecture: MiniMax M2.5 MoE — 62 layers, 180 experts (REAP-pruned from 256), 8 active per token
  • Context: 196K tokens
  • Size: ~85 GB
  • Pruning: 29% of experts removed via REAP (Router Expert Activation Pruning)

Usage

from mlx_lm import load, generate

model, tokenizer = load("shieldstackllc/MiniMax-M2.5-REAP-29-mlx-4bit")
response = generate(model, tokenizer, prompt="Hello!", verbose=True)

Or with vMLX for native macOS inference.

About

MiniMax-M2.5 is a large Mixture-of-Experts language model by MiniMax AI. This variant was pruned to 29% fewer experts by Akicou using REAP (Router Expert Activation Pruning), reducing model size and memory footprint while maintaining strong performance. MLX quantization by vMLX.

Also Available

Made for vMLX

This model was converted and optimized for vMLX — a free, open source macOS native MLX inference engine for Apple Silicon. Download vMLX to run this model locally with zero configuration.

Credits

Contact

For questions, issues, or collaboration: admin@vmlx.net

Downloads last month
34
Safetensors
Model size
162B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for shieldstackllc/MiniMax-M2.5-REAP-29-mlx-4bit

Quantized
(1)
this model