---
base_model: MiniMaxAI/MiniMax-M2.5
library_name: mlx
tags:
  - mlx
  - quantized
  - 4bit
  - minimax_m2
  - text-generation
  - conversational
  - apple-silicon
license: other
license_name: modified-mit
license_link: https://huggingface.co/MiniMaxAI/MiniMax-M2.5/blob/main/LICENSE
pipeline_tag: text-generation
---

# MiniMax-M2.5 4-bit MLX

This is a 4-bit quantized MLX version of [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5), converted using mlx-lm v0.29.1.

MiniMax-M2.5 is a 229B-parameter Mixture-of-Experts model (10B active parameters per token) that scores 80.2% on SWE-Bench Verified, with state-of-the-art results in coding, agentic tool use, and search tasks.

## Requirements

- Apple Silicon Mac (M3 Ultra or later recommended)
- At least 256 GB of unified memory
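As a rough sanity check on the memory requirement, the quantized weights alone occupy on the order of 120 GB, before the KV cache and activation overhead (all experts of an MoE model stay resident in memory even though only 10B parameters are active per token). The sketch below assumes roughly 4.5 effective bits per weight, i.e. 4-bit values plus per-group scale/bias overhead from group-wise quantization; the exact figure depends on the quantization settings used:

```python
# Back-of-envelope estimate of weight memory for the 4-bit model.
# Assumption: ~4.5 effective bits per weight (4-bit values plus
# per-group scale/bias overhead from group quantization).
params = 229e9          # total parameters (MoE: all experts resident)
bits_per_weight = 4.5   # assumed effective bits after quantization
weight_gb = params * bits_per_weight / 8 / 1024**3
print(f"~{weight_gb:.0f} GB of weights")
```

This is why 256 GB of unified memory is recommended: it leaves headroom for the KV cache, activations, and the rest of the system.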

## Quick Start

Install mlx-lm:

```bash
pip install -U mlx-lm
```

### CLI

```bash
mlx_lm.generate \
  --model ahoybrotherbear/MiniMax-M2.5-4bit-MLX \
  --prompt "Hello, how are you?" \
  --max-tokens 256 \
  --temp 0.7
```

### Python

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("ahoybrotherbear/MiniMax-M2.5-4bit-MLX")

messages = [{"role": "user", "content": "Hello, how are you?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Recent mlx-lm versions take sampling settings via a sampler object
# rather than a `temp` keyword argument on generate().
response = generate(
    model, tokenizer,
    prompt=prompt,
    max_tokens=256,
    sampler=make_sampler(temp=0.7),
    verbose=True,
)
print(response)
```

## Conversion Details

- Source model: MiniMaxAI/MiniMax-M2.5 (FP8)
- Converted with: mlx-lm v0.29.1
- Quantization: 4-bit
- Original parameters: 229B total / 10B active (MoE)
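A conversion like the one described above can be reproduced with mlx-lm's convert entry point. The flags below are a sketch, not the exact command used for this repo; check `mlx_lm.convert --help` for the options available in your installed version:

```shell
# Quantize the source weights to 4-bit MLX format.
# Output path is illustrative; defaults for group size etc. may
# differ between mlx-lm versions.
mlx_lm.convert \
  --hf-path MiniMaxAI/MiniMax-M2.5 \
  -q --q-bits 4 \
  --mlx-path ./MiniMax-M2.5-4bit-MLX
```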

## Original Model

MiniMax-M2.5 was created by [MiniMaxAI](https://huggingface.co/MiniMaxAI). See the [original model card](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) for full details on capabilities, benchmarks, and license terms.