| | --- |
| | pipeline_tag: text-generation |
| | license: other |
| | license_name: modified-mit |
| | license_link: https://github.com/MiniMax-AI/MiniMax-M2.5/blob/main/LICENSE |
| | library_name: transformers |
| | base_model: MiniMaxAI/MiniMax-M2.5 |
| | tags: |
| | - uncensored |
| | - abliterated |
| | - fp8 |
| | - minimax |
| | - moe |
| | --- |
| | |
| | # MiniMax-M2.5-catid |
| |
|
| | **Uncensored FP8 version of [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5)** with safety refusal behavior removed via surgical weight replacement. |
| |
|
| | ## Refusal Removal Results |
| |
|
| | Evaluated on a 10,000-prompt refusal benchmark (8,000 train + 2,000 validation) using an LLM judge (GPT-5-nano) for 4-way classification (complied / refused / hedged / deflected): |
| |
|
| | | Split | Total Prompts | Complied | Refused | Hedged | Deflected | Refusal Rate | |
| | |-------|--------------|----------|---------|--------|-----------|-------------| |
| | | Train | 8,000 | 7,506 | 262 | 228 | 4 | 6.2% | |
| | | Validation | 2,000 | 1,885 | 55 | 59 | 1 | 5.8% | |
| |
|
| | **Coherence: 100%** (50/50 capability test prompts answered correctly) |
| |
|
| | The ~6% residual "refusal rate" consists primarily of false positives from the LLM judge on benign prompts (opinion questions, casual banter, medical/privacy disclaimers) rather than actual safety refusals of harmful content. |
| |
|
| | ### Method |
| |
|
| | The `o_proj` (attention output projection) weights across all 62 transformer layers were replaced with weights from [PRISM-PRO](https://huggingface.co/PrunaAI/MiniMax-M2.5-PRISM-PRO-Q8_0_v2-GGUF) (an abliterated variant), dequantized from Q8_0 GGUF format and re-quantized to FP8 E4M3FN with block-wise scaling to match the original model's quantization scheme. All other weights (q_proj, k_proj, v_proj, MLP experts, embeddings, norms, etc.) are identical to the official FP8 base model. |
| |
|
| | - **Reconstruction error**: 0.5% relative error per layer (cosine similarity ~1.0) |
| | - **Modified weights**: 62 o_proj tensors (3072 x 6144 each) + their scale_inv tensors |
| | - **Unmodified weights**: Everything else (~229B parameter MoE architecture preserved exactly) |
| |
|
| | ## Usage |
| |
|
| | This model is a drop-in replacement for `MiniMaxAI/MiniMax-M2.5`. Serve it with vLLM, SGLang, or any framework that supports the original model: |
| |
|
| | ### vLLM |
| |
|
| | ```bash |
| | vllm serve catid/MiniMax-M2.5-catid \ |
| | --tensor-parallel-size 4 \ |
| | --trust-remote-code \ |
| | --max-model-len 2048 |
| | ``` |
| |
|
| | ### SGLang |
| |
|
| | ```bash |
| | python -m sglang.launch_server \ |
| | --model catid/MiniMax-M2.5-catid \ |
| | --tp 4 \ |
| | --trust-remote-code |
| | ``` |
| |
|
| | ### Recommended Parameters |
| |
|
| | `temperature=1.0`, `top_p=0.95`, `top_k=40` |
| |
|
| | ## Model Details |
| |
|
| | - **Architecture**: MiniMax-M2.5 (229B MoE, 62 layers, 256 experts/layer, hidden_dim=3072) |
| | - **Precision**: FP8 E4M3FN with block-wise scaling (128x128 blocks) |
| | - **Base model**: [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) |
| | - **Abliteration source**: [PrunaAI/MiniMax-M2.5-PRISM-PRO-Q8_0_v2-GGUF](https://huggingface.co/PrunaAI/MiniMax-M2.5-PRISM-PRO-Q8_0_v2-GGUF) |
| | - **License**: [Modified MIT](https://github.com/MiniMax-AI/MiniMax-M2.5/blob/main/LICENSE) (same as base model) |
| | |
| | ## Disclaimer |
| | |
| | This model is provided for research purposes. The removal of safety guardrails means it may generate content that the original model would refuse. Users are responsible for ensuring appropriate use. |
| | |