Ex0bit

metadata

license: other
license_name: prism-research
license_link: LICENSE.md
language:
  - en
  - zh
tags:
  - minimax
  - prism
  - moe
  - reasoning
  - coding
  - agentic
  - abliterated
pipeline_tag: text-generation
library_name: transformers
base_model:
  - MiniMaxAI/MiniMax-M2.5
base_model_relation: finetune

MiniMax-M2.5-PRISM-PRO

A powerful, production-ready, fully uncensored model intended for complete suppression of over-refusal and propaganda mechanisms, built with our SOTA PRISM-PRO pipeline.

PRISM-PRO is available for purchase: https://ko-fi.com/s/0a23d1b9a5

For custom-trained PRISM versions or raw tensor access, reach out at https://ko-fi.com/ex0bit.

☕ Support Our Work

If you enjoy our work and find it useful, please consider sponsoring or supporting us!

Option	Description
PRISM PRO VIP Membership	Access to all PRISM models
Bitcoin	`bc1qarq2pyn4psjpcxzp2ghgwaq6y2h4e53q232x8r`

Model Highlights

PRISM Ablation — State-of-the-art technique that removes over-refusal behaviors while preserving model capabilities
SOTA Coding Performance — 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, 76.3% on BrowseComp (with context management)
Frontier Agentic Capabilities — Industry-leading performance in tool use, search, and complex multi-step tasks
Efficient Reasoning — Trained with RL to reason efficiently and decompose tasks optimally, 37% faster than M2.1
Cost-Effective — $1 for continuous operation at 100 tok/s for an hour; $0.30 at 50 tok/s
Modified-MIT Base License — Based on MiniMax's open-weight release
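
To compare the cost figures above with the per-token prices most providers quote, the hourly numbers can be converted to USD per million tokens. This is a minimal sketch that assumes the quoted price covers generated tokens only, at a sustained rate; the function name is ours, not part of any API.

```python
def usd_per_million_tokens(usd_per_hour: float, tokens_per_second: float) -> float:
    """Convert an hourly price at a fixed output rate into USD per million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


# Figures taken from the "Cost-Effective" highlight.
lightning = usd_per_million_tokens(1.00, 100)  # 360,000 tokens in one hour
standard = usd_per_million_tokens(0.30, 50)    # 180,000 tokens in one hour

print(f"100 tok/s: ${lightning:.2f} per million tokens")
print(f"50 tok/s:  ${standard:.2f} per million tokens")
```

Under these assumptions the two tiers work out to roughly $2.78 and $1.67 per million generated tokens.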

Base Model Architecture

Base MiniMax-M2.5 is a Mixture-of-Experts (MoE) model extensively trained with reinforcement learning across hundreds of thousands of complex real-world environments.

Specification	Value
Architecture	Sparse Mixture-of-Experts (MoE)
Training	Extensive RL in 200K+ real-world environments
Programming Languages	10+ (Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, Ruby)
Inference Speed	100 tok/s (Lightning) / 50 tok/s (Standard)
Library	`transformers`

Benchmarks

Category	Base (FP8/vLLM)	PRISM-PRO Q8_0 (llama.cpp)
MMLU 5-shot	28/30 (93.3%)	28/30 (93.3%)
General Knowledge	5/5	5/5
Coding	4/5	5/5
Reasoning	5/5	5/5
Agentic	3/5	5/5
Harmful bypass	3/10 (30%)	10/10 (100%)
Avg thinking words	163w	152w
Speed	72 t/s	35-65 t/s

Coding

Benchmark	MiniMax-M2.5	Claude Opus 4.6	Gemini 3 Pro	GPT-5.2
SWE-Bench Verified	80.2	78.9	74.0	72.6
Multi-SWE-Bench	51.3	50.8	—	—
SWE-Bench Multilingual	55.6	—	—	—
Terminal-Bench 2.0	51.5	52.1	—	—

Search & Tool Calling

Benchmark	MiniMax-M2.5	Claude Opus 4.6	Gemini 3 Pro	GPT-5.2
BrowseComp	76.3	71.2	62.4	57.8

Reasoning & Knowledge

Benchmark	MiniMax-M2.5	Claude Opus 4.6	Gemini 3 Pro	GPT-5.2
AIME25	86.3	95.6	96.0	98.0
GPQA-D	85.2	90.0	91.0	90.0
HLE w/o tools	19.4	30.7	37.2	31.4
SciCode	44.4	52.0	56.0	52.0
IFBench	70.0	53.0	70.0	75.0

Usage

llama.cpp (GGUF)

Build the latest master of llama.cpp and run:

~/llama.cpp/build/bin/llama-cli \
  -m ../outputs/MiniMax-M2.5-PRISM-PRO-[QUANT].gguf \
  --jinja \
  -ngl 999 \
  --repeat-penalty 1.15 \
  --temp 1.0 \
  --top-p 0.95 \
  --top-k 40

Replace [QUANT] with your quantization level (e.g. Q8_0).

Recommended Parameters

Use Case	Temperature	Top-P	Top-K	Repeat Penalty	Max New Tokens
Reasoning / Coding	1.0	0.95	40	1.15	32768
General Chat	0.6	0.95	40	1.15	4096
Agentic / Tool Use	1.0	0.95	40	1.15	32768
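
If you launch llama-cli from a script, the presets in the table above can be kept in one place and turned into an argument list. The sketch below is illustrative only: the preset keys, the helper name, and the model filename are hypothetical, the flags use llama.cpp's hyphenated spellings, and mapping "Max New Tokens" to `-n` is our assumption.

```python
import shlex

# Values copied from the Recommended Parameters table; the keys are our own labels.
PRESETS = {
    "reasoning": {"temp": 1.0, "top_p": 0.95, "top_k": 40,
                  "repeat_penalty": 1.15, "max_new_tokens": 32768},
    "chat": {"temp": 0.6, "top_p": 0.95, "top_k": 40,
             "repeat_penalty": 1.15, "max_new_tokens": 4096},
    "agentic": {"temp": 1.0, "top_p": 0.95, "top_k": 40,
                "repeat_penalty": 1.15, "max_new_tokens": 32768},
}


def build_llama_cli_args(model_path: str, use_case: str,
                         llama_cli: str = "llama-cli") -> list[str]:
    """Return a llama-cli argument list for one of the presets above."""
    p = PRESETS[use_case]
    return [
        llama_cli,
        "-m", model_path,
        "--jinja",
        "-ngl", "999",
        "--temp", str(p["temp"]),
        "--top-p", str(p["top_p"]),
        "--top-k", str(p["top_k"]),
        "--repeat-penalty", str(p["repeat_penalty"]),
        "-n", str(p["max_new_tokens"]),  # assumed mapping for "Max New Tokens"
    ]


# Hypothetical filename following the [QUANT] pattern from the Usage section.
args = build_llama_cli_args("MiniMax-M2.5-PRISM-PRO-Q8_0.gguf", "chat")
print(shlex.join(args))
```

The resulting list can be passed directly to `subprocess.run(args)`; adjust `llama_cli` to the path of your own build.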

Version	Description	Access
PRISM-LITE	Abliterated with PRISM-LITE pipeline — removes over-refusal while preserving core capabilities	Free on Hugging Face
PRISM-PRO	Full PRISM-PRO ablation: production-level suppression of propaganda/refusal mechanisms with maximum capability retention	Ko-fi

License

This model is released under the PRISM Research License.

The base model MiniMax-M2.5 is released under a Modified-MIT License.

Acknowledgments

Based on MiniMax-M2.5 by MiniMax AI.