Tags: Text Generation · MLX · Safetensors · minimax_m2 · minimax · m2.7 · Mixture of Experts · quantized · rotorquant · kv-cache-quantization · conversational · custom_code · 5-bit
Instructions for using majentik/MiniMax-M2.7-RotorQuant-MLX-5bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use majentik/MiniMax-M2.7-RotorQuant-MLX-5bit with MLX:
```python
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("majentik/MiniMax-M2.7-RotorQuant-MLX-5bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
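For long generations you may want tokens printed as they are produced instead of returned in one batch. A minimal streaming sketch, assuming the `stream_generate` helper in recent `mlx-lm` releases (its chunk format has changed across versions, so check your installed one):

```python
# Stream tokens as they are generated; assumes a recent mlx-lm, where
# stream_generate yields chunks carrying the new text in a .text field.
from mlx_lm import load, stream_generate

model, tokenizer = load("majentik/MiniMax-M2.7-RotorQuant-MLX-5bit")

messages = [{"role": "user", "content": "Write a story about Einstein"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

for chunk in stream_generate(model, tokenizer, prompt=prompt, max_tokens=512):
    print(chunk.text, end="", flush=True)  # print each segment as it arrives
print()
```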
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi
How to use majentik/MiniMax-M2.7-RotorQuant-MLX-5bit with Pi:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "majentik/MiniMax-M2.7-RotorQuant-MLX-5bit"
```
Configure the model in Pi
```
# Install Pi:
npm install -g @mariozechner/pi-coding-agent

# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "majentik/MiniMax-M2.7-RotorQuant-MLX-5bit" }
      ]
    }
  }
}
```

Run Pi
```bash
# Start Pi in your project directory:
pi
```
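Before starting Pi, it can help to confirm the server is up and advertising the model id you configured. A minimal check using only the Python standard library, assuming `mlx_lm.server` exposes the usual OpenAI-style `/v1/models` listing on its default port 8080:

```python
# List the model ids the local server advertises; the /v1/models route
# is assumed here as part of the OpenAI-style API surface.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:8080/v1/models") as resp:
    data = json.load(resp)

for entry in data.get("data", []):
    print(entry.get("id"))
```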
- Hermes Agent
How to use majentik/MiniMax-M2.7-RotorQuant-MLX-5bit with Hermes Agent:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "majentik/MiniMax-M2.7-RotorQuant-MLX-5bit"
```
Configure Hermes
```bash
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default majentik/MiniMax-M2.7-RotorQuant-MLX-5bit
```
Run Hermes
```bash
hermes
```
- MLX LM
How to use majentik/MiniMax-M2.7-RotorQuant-MLX-5bit with MLX LM:
Generate or start a chat session
```bash
# Install MLX LM
uv tool install mlx-lm

# Interactive chat REPL
mlx_lm.chat --model "majentik/MiniMax-M2.7-RotorQuant-MLX-5bit"
```
Run an OpenAI-compatible server
```bash
# Install MLX LM
uv tool install mlx-lm

# Start the server
mlx_lm.server --model "majentik/MiniMax-M2.7-RotorQuant-MLX-5bit"

# Call the OpenAI-compatible server with curl
# (mlx_lm.server listens on port 8080 by default, as configured above)
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "majentik/MiniMax-M2.7-RotorQuant-MLX-5bit",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```
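The same endpoint can be called from Python instead of curl. A minimal sketch, assuming the `openai` client package is installed (`pip install openai`) and the server is on its default port 8080:

```python
# Call the local OpenAI-compatible server with the openai Python client.
# The api_key is a dummy value, since the local server requires none.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

response = client.chat.completions.create(
    model="majentik/MiniMax-M2.7-RotorQuant-MLX-5bit",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```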
docs: Tier 2 polish — variant matrix + quant trade-off
README.md (CHANGED):

```diff
@@ -6,15 +6,13 @@ license: other
 license_name: minimax-model-license
 license_link: https://huggingface.co/MiniMaxAI/MiniMax-M2.7/blob/main/LICENSE
 tags:
-- minimax
-- m2.7
-- moe
-- quantized
-- rotorquant
-- kv-cache-quantization
-- mlx
-language:
-- en
+- minimax
+- m2.7
+- moe
+- quantized
+- rotorquant
+- kv-cache-quantization
+- mlx
 ---
 
 # MiniMax-M2.7-RotorQuant-MLX-5bit
@@ -91,3 +89,35 @@ print(response)
 - [majentik/MiniMax-M2.7-TurboQuant-MLX-5bit](https://huggingface.co/majentik/MiniMax-M2.7-TurboQuant-MLX-5bit) -- TurboQuant MLX 5-bit
 - [majentik/MiniMax-M2.7-RotorQuant-MLX-8bit](https://huggingface.co/majentik/MiniMax-M2.7-RotorQuant-MLX-8bit) -- MLX 8-bit
 - [majentik/MiniMax-M2.7-RotorQuant-MLX-4bit](https://huggingface.co/majentik/MiniMax-M2.7-RotorQuant-MLX-4bit) -- MLX 4-bit
+
+## Quant trade-off (MLX lane)
+
+| Bits | Approx size | Use case | Recommendation |
+|---|---|---|---|
+| 2-bit | ~119 GB | Aggressive quantization | Very low-RAM Macs |
+| 3-bit | ~164 GB | Lossy but small | Low-RAM Macs |
+| 4-bit | ~192 GB | Balanced default | Recommended for most Macs |
+| **5-bit** | ~228 GB | Higher fidelity | **Quality-sensitive** |
+| 6-bit | ~274 GB | Approaching FP16 quality | High-fidelity |
+| 8-bit | ~347 GB | Near-lossless reference | Fidelity-critical work |
+
+(Current variant — **5bit** — is bolded.)
+
+## Variants in this family
+
+(Showing 12 sibling variants under `majentik/minimax-m2.7-*`. The current variant — `RotorQuant-MLX-5bit` — is **bolded**.)
+
+| Variant | Runtime | Approx size | Use case |
+|---|---|---|---|
+| [RotorQuant](https://huggingface.co/majentik/minimax-m2.7-rotorquant) | runtime modifier | n/a | KV-cache root (weight-agnostic) |
+| [RotorQuant-MLX-2bit](https://huggingface.co/majentik/minimax-m2.7-rotorquant-mlx-2bit) | mlx-lm | ~885 MB | Apple Silicon, smallest |
+| [RotorQuant-MLX-3bit](https://huggingface.co/majentik/minimax-m2.7-rotorquant-mlx-3bit) | mlx-lm | ~1.2 GB | Apple Silicon, small |
+| [RotorQuant-MLX-4bit](https://huggingface.co/majentik/minimax-m2.7-rotorquant-mlx-4bit) | mlx-lm | ~1.7 GB | Apple Silicon balanced |
+| **RotorQuant-MLX-5bit** | mlx-lm | ~2.1 GB | Apple Silicon, higher fidelity |
+| [RotorQuant-MLX-8bit](https://huggingface.co/majentik/minimax-m2.7-rotorquant-mlx-8bit) | mlx-lm | ~3.2 GB | Apple Silicon reference |
+| [TurboQuant](https://huggingface.co/majentik/minimax-m2.7-turboquant) | runtime modifier | n/a | KV-cache root (weight-agnostic) |
+| [TurboQuant-MLX-2bit](https://huggingface.co/majentik/minimax-m2.7-turboquant-mlx-2bit) | mlx-lm | ~885 MB | Apple Silicon, smallest |
+| [TurboQuant-MLX-3bit](https://huggingface.co/majentik/minimax-m2.7-turboquant-mlx-3bit) | mlx-lm | ~1.2 GB | Apple Silicon, small |
+| [TurboQuant-MLX-4bit](https://huggingface.co/majentik/minimax-m2.7-turboquant-mlx-4bit) | mlx-lm | ~1.7 GB | Apple Silicon balanced |
+| [TurboQuant-MLX-5bit](https://huggingface.co/majentik/minimax-m2.7-turboquant-mlx-5bit) | mlx-lm | ~2.1 GB | Apple Silicon, higher fidelity |
+| [TurboQuant-MLX-8bit](https://huggingface.co/majentik/minimax-m2.7-turboquant-mlx-8bit) | mlx-lm | ~3.2 GB | Apple Silicon reference |
```
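For anyone adapting the quant trade-off table above, the "Approx size" column is essentially parameter count times bits per weight, plus overhead for the pieces (embeddings, per-group scales) that stay at higher precision. A back-of-envelope sketch of that arithmetic; the parameter count and fixed overhead below are placeholders, not published specs for MiniMax-M2.7:

```python
# Rough on-disk size for a weight-quantized checkpoint.
# num_params and fixed_gb are hypothetical placeholders, not model specs.
def approx_size_gb(num_params: float, bits: float, fixed_gb: float = 0.0) -> float:
    """bits/8 bytes per weight, plus a fixed higher-precision remainder."""
    return num_params * bits / 8 / 1e9 + fixed_gb

# Example: a hypothetical 300B-parameter model with ~40 GB kept unquantized.
for bits in (2, 3, 4, 5, 6, 8):
    print(f"{bits}-bit: ~{approx_size_gb(300e9, bits, fixed_gb=40):.0f} GB")
```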