---
license: mit
base_model: MiniMaxAI/MiniMax-M2.1
tags:
  - abliterated
  - uncensored
  - prism
  - minimax
  - moe
language:
  - en
  - zh
pipeline_tag: text-generation
---

# MiniMax-M2.1-PRISM

An abliterated version of MiniMax-M2.1 using the PRISM methodology.

Support me on Ko-fi


## Model Description

MiniMax-M2.1-PRISM is an abliterated version of MiniMax-M2.1, processed using PRISM (Projected Refusal Isolation via Subspace Modification) to remove refusal behaviors while preserving full model capabilities.

**Base Model:** MiniMax-M2.1

MiniMax-M2.1 is an open-source agentic language model designed for robust performance in:

- Coding and software engineering
- Tool use and multi-step reasoning
- Instruction following
- Long-horizon planning
- Multilingual capabilities

**Architecture:** 229B parameters, 62 layers, 256 experts (8 active per token)
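The "8 active per token" figure refers to top-k expert routing in the MoE layers. A toy numpy sketch of what such routing computes (the function name `route_tokens` and the softmax mixing weights are illustrative assumptions, not MiniMax's actual router implementation):

```python
import numpy as np

def route_tokens(router_logits, n_active=8):
    """Toy MoE router: select the top n_active experts per token
    and turn their logits into mixing weights via softmax."""
    # router_logits: (n_tokens, n_experts)
    top = np.argsort(router_logits, axis=-1)[:, ::-1][:, :n_active]
    sel = np.take_along_axis(router_logits, top, axis=-1)
    # softmax over only the selected experts' logits
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return top, w

# two tokens routed over 256 experts, 8 active each
logits = np.random.default_rng(0).standard_normal((2, 256))
experts, weights = route_tokens(logits)
print(experts.shape, weights.shape)  # (2, 8) (2, 8)
```

Each token's output is then the weighted sum of its 8 selected experts' outputs, which is why only a fraction of the 229B parameters is active per token.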


## PRISM Methodology

**Method:** Projected Refusal Isolation via Subspace Modification

This model was abliterated using PRISM v5, a methodology that combines multiple principled techniques to remove refusal behaviors while preserving model capabilities.

**Formula:**

```
W' = W - weight * (d ⊗ d) @ W
```

Where:

- `W` = original weight matrix
- `d` = refusal direction vector (unit-normalized)
- `weight` = layer-specific abliteration strength
- `W'` = modified weight matrix
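The formula above is a rank-one projection update and can be checked numerically. A minimal numpy sketch (the function name `abliterate` is illustrative, not part of the PRISM tooling; with `weight = 1.0` the modified matrix should have no component along `d`):

```python
import numpy as np

def abliterate(W, d, weight):
    """Apply W' = W - weight * (d ⊗ d) @ W,
    attenuating W's component along the refusal direction d."""
    d = d / np.linalg.norm(d)          # unit-normalize the direction
    return W - weight * np.outer(d, d) @ W

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))        # toy weight matrix
d = rng.standard_normal(4)             # toy refusal direction
W_mod = abliterate(W, d, 1.0)

# with weight = 1.0 the refusal direction is fully projected out:
print(np.allclose((d / np.linalg.norm(d)) @ W_mod, 0.0))  # True
```

Intermediate weights (e.g. 0.5) only partially attenuate the direction, which is what the layer-wise schedule below exploits.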

### Abliteration Parameters

| Parameter | Value |
|-----------|-------|
| Base Model | QuixiAI/MiniMax-M2.1-bf16 |
| Total Layers | 62 |
| Target Layers | 16-46 (31 layers) |
| Peak Layer | 31 |
| Max Weight | 3.0 |
| Min Weight | 0.5 |

### Weight Distribution

The abliteration strength follows a triangular distribution centered on the peak layer:

- Layers 16-31: weight increases from 0.5 to 3.0
- Layers 31-46: weight decreases from 3.0 to 0.5
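The schedule can be reproduced from the parameters in the table above. A minimal sketch (the function name and the linear ramp are assumptions consistent with the stated endpoints):

```python
def layer_weight(layer, start=16, peak=31, end=46, w_min=0.5, w_max=3.0):
    """Triangular abliteration strength: ramps linearly from w_min
    at the edge layers to w_max at the peak layer."""
    if layer < start or layer > end:
        return 0.0                      # layers outside 16-46 are untouched
    if layer <= peak:
        frac = (layer - start) / (peak - start)   # rising edge
    else:
        frac = (end - layer) / (end - peak)       # falling edge
    return w_min + frac * (w_max - w_min)

print(layer_weight(16))  # 0.5
print(layer_weight(31))  # 3.0
print(layer_weight(46))  # 0.5
```

Concentrating strength near the middle layers reflects the finding that refusal directions are most strongly represented there, while lighter edits at the edges help preserve capabilities.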

## Performance Benchmarks

### Base Model Performance

| Benchmark | Score |
|-----------|-------|
| SWE-bench Verified | 74.0 |
| SWE-bench Multilingual | 72.5 |
| VIBE Average | 88.6 |
| MMLU-Pro | 88.0 |
| GPQA-D | 83.0 |
| AIME25 | 83.0 |

### PRISM Abliteration Results

| Metric | Result |
|--------|--------|
| Adversarial Prompts Responded | 20/20 (100%) |
| Benign Coherence | 100% |
| Response Quality | Full technical accuracy preserved |

Testing shows that PRISM abliteration maintains full model coherence with no measurable capability degradation.


## Available Formats

| Format | Size | Description |
|--------|------|-------------|
| Safetensors (BF16) | ~426 GB | Full precision, 92 shards |
| GGUF IQ1_S | ~43 GB | Quantized with importance matrix |

## Recommended Inference Parameters

```
temperature = 1.0
top_p = 0.95
top_k = 40
```
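These parameters control standard temperature scaling, top-k truncation, and top-p (nucleus) filtering. An illustrative numpy sketch of what they do (not the actual sampler implementation used by the inference frameworks below):

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_p=0.95, top_k=40, rng=None):
    """Illustrative temperature / top-k / top-p sampling over raw logits."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    # top-k: keep only the top_k highest-logit tokens
    top_idx = np.argsort(logits)[::-1][:top_k]
    probs = np.exp(logits[top_idx] - logits[top_idx].max())
    probs /= probs.sum()
    # top-p: keep the smallest prefix of mass <= top_p (probs are sorted descending)
    keep = np.cumsum(probs) <= top_p
    keep[0] = True                     # always keep the most likely token
    probs = probs[keep] / probs[keep].sum()
    return int(rng.choice(top_idx[keep], p=probs))

# with top_k=3, only the three highest-logit tokens can ever be drawn
token = sample_token([2.0, 1.0, 0.5, -1.0], top_k=3)
print(token in (0, 1, 2))  # True
```

Lowering `temperature` sharpens the distribution toward the top token; `top_k` and `top_p` bound the candidate set before the draw.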

## Default System Prompt

```
You are a helpful assistant.
```

## Recommended Inference Frameworks

  1. SGLang (recommended for full precision)
  2. vLLM (recommended for full precision)
  3. llama.cpp (recommended for GGUF quantized)
  4. Transformers

### llama.cpp Example

```shell
./llama-cli -m MiniMax-M2.1-PRISM-IQ1_S.gguf -ngl 99 -cnv --temp 1.0 --top-p 0.95 --top-k 40 --ctx-size 4096
```

## Ethical Considerations

This model has been modified to reduce safety guardrails. Users are responsible for:

- Complying with all applicable laws and regulations
- Not using the model for illegal activities
- Understanding the potential risks of unrestricted AI responses
- Implementing appropriate safeguards in production environments

**Motivation:** This project is a research experiment in how large language models encode and enforce refusal behaviors. It contributes to broader AI safety research by providing empirical data on where refusal mechanisms are localized and on the tradeoffs between safety and capability.


## License

This model inherits the Modified-MIT License from the base MiniMax-M2.1 model.


## Support

If you find this work useful, consider supporting development:

Support me on Ko-fi


## Contact

For questions or issues, please open an issue on this repository.