---
license: mit
base_model: MiniMaxAI/MiniMax-M2.1
tags:
  - abliterated
  - uncensored
  - prism
  - minimax
  - moe
language:
  - en
  - zh
pipeline_tag: text-generation
---

# MiniMax-M2.1-PRISM

An abliterated version of MiniMax-M2.1 using the PRISM methodology.

Support me on Ko-fi


## Model Description

MiniMax-M2.1-PRISM is an abliterated version of MiniMax-M2.1, processed using PRISM (Projected Refusal Isolation via Subspace Modification) to remove refusal behaviors while preserving full model capabilities.

**Base Model:** MiniMax-M2.1

MiniMax-M2.1 is an open-source agentic language model designed for robust performance in:

- Coding and software engineering
- Tool use and multi-step reasoning
- Instruction following
- Long-horizon planning
- Multilingual capabilities

**Architecture:** 229B parameters, 62 layers, 256 experts (8 active per token)
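The "8 active per token" figure refers to top-k expert routing in the MoE layers. A toy numpy sketch of what such routing computes (the function name `route_tokens` and the softmax mixing weights are illustrative assumptions, not MiniMax's actual router implementation):

```python
import numpy as np

def route_tokens(router_logits, n_active=8):
    """Toy MoE router: select the top n_active experts per token
    and turn their logits into mixing weights via softmax."""
    # router_logits: (n_tokens, n_experts)
    top = np.argsort(router_logits, axis=-1)[:, ::-1][:, :n_active]
    sel = np.take_along_axis(router_logits, top, axis=-1)
    # softmax over only the selected experts' logits
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return top, w

# two tokens routed over 256 experts, 8 active each
logits = np.random.default_rng(0).standard_normal((2, 256))
experts, weights = route_tokens(logits)
print(experts.shape, weights.shape)  # (2, 8) (2, 8)
```

Each token's output is then the weighted sum of its 8 selected experts' outputs, which is why only a fraction of the 229B parameters is active per token.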


## PRISM Methodology

**Method:** Projected Refusal Isolation via Subspace Modification

This model was abliterated using PRISM v5, a methodology that combines multiple principled techniques to remove refusal behaviors while preserving model capabilities.

**Formula:**

```
W' = W - weight * (d ⊗ d) @ W
```

Where:

- `W` = original weight matrix
- `d` = refusal direction vector (unit-normalized)
- `weight` = layer-specific abliteration strength
- `W'` = modified weight matrix
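The formula above is a rank-one projection update and can be checked numerically. A minimal numpy sketch (the function name `abliterate` is illustrative, not part of the PRISM tooling; with `weight = 1.0` the modified matrix should have no component along `d`):

```python
import numpy as np

def abliterate(W, d, weight):
    """Apply W' = W - weight * (d ⊗ d) @ W,
    attenuating W's component along the refusal direction d."""
    d = d / np.linalg.norm(d)          # unit-normalize the direction
    return W - weight * np.outer(d, d) @ W

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))        # toy weight matrix
d = rng.standard_normal(4)             # toy refusal direction
W_mod = abliterate(W, d, 1.0)

# with weight = 1.0 the refusal direction is fully projected out:
print(np.allclose((d / np.linalg.norm(d)) @ W_mod, 0.0))  # True
```

Intermediate weights (e.g. 0.5) only partially attenuate the direction, which is what the layer-wise schedule below exploits.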

### Abliteration Parameters

| Parameter | Value |
|-----------|-------|
| Base Model | QuixiAI/MiniMax-M2.1-bf16 |
| Total Layers | 62 |
| Target Layers | 16-46 (31 layers) |
| Peak Layer | 31 |
| Max Weight | 3.0 |
| Min Weight | 0.5 |

### Weight Distribution

The abliteration strength follows a triangular distribution centered on the peak layer:

- Layers 16-31: weight increases from 0.5 to 3.0
- Layers 31-46: weight decreases from 3.0 to 0.5
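The schedule can be reproduced from the parameters in the table above. A minimal sketch (the function name and the linear ramp are assumptions consistent with the stated endpoints):

```python
def layer_weight(layer, start=16, peak=31, end=46, w_min=0.5, w_max=3.0):
    """Triangular abliteration strength: ramps linearly from w_min
    at the edge layers to w_max at the peak layer."""
    if layer < start or layer > end:
        return 0.0                      # layers outside 16-46 are untouched
    if layer <= peak:
        frac = (layer - start) / (peak - start)   # rising edge
    else:
        frac = (end - layer) / (end - peak)       # falling edge
    return w_min + frac * (w_max - w_min)

print(layer_weight(16))  # 0.5
print(layer_weight(31))  # 3.0
print(layer_weight(46))  # 0.5
```

Concentrating strength near the middle layers reflects the finding that refusal directions are most strongly represented there, while lighter edits at the edges help preserve capabilities.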

## Performance Benchmarks

### Base Model Performance

| Benchmark | Score |
|-----------|-------|
| SWE-bench Verified | 74.0 |
| SWE-bench Multilingual | 72.5 |
| VIBE Average | 88.6 |
| MMLU-Pro | 88.0 |
| GPQA-D | 83.0 |
| AIME25 | 83.0 |

### PRISM Abliteration Results

| Metric | Result |
|--------|--------|
| Adversarial Prompts Responded | 20/20 (100%) |
| Benign Coherence | 100% |
| Response Quality | Full technical accuracy preserved |

Testing shows that PRISM abliteration maintains full model coherence with no measurable capability degradation.


## Available Formats

| Format | Size | Description |
|--------|------|-------------|
| Safetensors (BF16) | ~426 GB | Full precision, 92 shards |
| GGUF IQ1_S | ~43 GB | Quantized with importance matrix |

## Recommended Inference Parameters

```
temperature = 1.0
top_p = 0.95
top_k = 40
```
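These parameters control standard temperature scaling, top-k truncation, and top-p (nucleus) filtering. An illustrative numpy sketch of what they do (not the actual sampler implementation used by the inference frameworks below):

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_p=0.95, top_k=40, rng=None):
    """Illustrative temperature / top-k / top-p sampling over raw logits."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    # top-k: keep only the top_k highest-logit tokens
    top_idx = np.argsort(logits)[::-1][:top_k]
    probs = np.exp(logits[top_idx] - logits[top_idx].max())
    probs /= probs.sum()
    # top-p: keep the smallest prefix of mass <= top_p (probs are sorted descending)
    keep = np.cumsum(probs) <= top_p
    keep[0] = True                     # always keep the most likely token
    probs = probs[keep] / probs[keep].sum()
    return int(rng.choice(top_idx[keep], p=probs))

# with top_k=3, only the three highest-logit tokens can ever be drawn
token = sample_token([2.0, 1.0, 0.5, -1.0], top_k=3)
print(token in (0, 1, 2))  # True
```

Lowering `temperature` sharpens the distribution toward the top token; `top_k` and `top_p` bound the candidate set before the draw.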

## Default System Prompt

```
You are a helpful assistant.
```

## Recommended Inference Frameworks

  1. SGLang (recommended for full precision)
  2. vLLM (recommended for full precision)
  3. llama.cpp (recommended for GGUF quantized)
  4. Transformers

### llama.cpp Example

```shell
./llama-cli -m MiniMax-M2.1-PRISM-IQ1_S.gguf -ngl 99 -cnv --temp 1.0 --top-p 0.95 --top-k 40 --ctx-size 4096
```

## Ethical Considerations

This model has been modified to reduce safety guardrails. Users are responsible for:

- Complying with all applicable laws and regulations
- Not using the model for illegal activities
- Understanding the potential risks of unrestricted AI responses
- Implementing appropriate safeguards in production environments

**Motivation:** This project is a research experiment in how large language models encode and enforce refusal behaviors. It contributes to broader AI safety research by providing empirical data on where refusal mechanisms are localized and on the tradeoffs between safety and capability.


## License

This model inherits the Modified-MIT License from the base MiniMax-M2.1 model.


## Support

If you find this work useful, consider supporting development:

Support me on Ko-fi


## Contact

For questions or issues, please open an issue on this repository.