# MiniMax-M2.5_dq4-abliterated

This model was created using the mlx-abliteration toolkit, which is based on the FiditeNemini/mlx-abliteration project.

## Base Model

**Original model:** `./models/MiniMax-M2.5_dq4` (local path)

## What is Abliteration?

Abliteration is a mechanistic interpretability technique that identifies and orthogonalizes the "refusal direction" in a model's activation space, surgically removing refusal behavior without full fine-tuning.
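The orthogonalization step can be sketched in a few lines. This is a minimal illustration, not the toolkit's actual implementation: given a unit refusal direction `r` in a weight's output space, the weight is replaced by its projection onto the subspace orthogonal to `r`, so no input can produce output along `r`.

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Orthogonalize weight W against refusal direction r.

    W: (d_out, d_in) projection weight; r: (d_out,) refusal direction.
    Returns W' = W - r (r^T W), so r^T (W' x) = 0 for every input x.
    """
    r = r / np.linalg.norm(r)
    return W - np.outer(r, r @ W)

# Toy check: outputs of the ablated weight carry no component along r.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
r = rng.normal(size=8)
W_abl = ablate_direction(W, r)
x = rng.normal(size=4)
proj = (r / np.linalg.norm(r)) @ (W_abl @ x)
print(abs(proj) < 1e-9)  # → True
```

Because this is a rank-one update to existing weights, it changes no architecture or parameter count, which is why the result ships as an ordinary checkpoint.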

## Abliteration Parameters

| Parameter | Value |
|---|---|
| Base model | `./models/MiniMax-M2.5_dq4` |
| Ablation method | projection |
| Refusal vector policy | per-layer |
| Refusal direction method | projected |
| Ablation strength | 2.0 |
| Probed layers | all |
| PCA components (ablate-k) | 1 |
| Attention only | True |
| Timestamp | 2026-03-02T20:40:28 UTC |
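
The per-layer refusal direction is typically estimated from activations probed on contrastive prompt sets. The sketch below shows the common difference-of-means variant; how this toolkit combines it with the `ablate-k = 1` PCA setting is an assumption, not documented behavior.

```python
import numpy as np

def refusal_direction(harmful: np.ndarray, harmless: np.ndarray) -> np.ndarray:
    """Difference-of-means refusal direction at one layer.

    harmful / harmless: (n_prompts, d_model) hidden states probed at this
    layer on harmful vs. harmless prompts. The per-layer policy computes
    one such unit direction for every probed layer; with ablate-k = 1 a
    PCA variant would instead keep the top principal component of the
    per-prompt differences (assumed, not confirmed by the toolkit docs).
    """
    diff = harmful.mean(axis=0) - harmless.mean(axis=0)
    return diff / np.linalg.norm(diff)

# Toy usage with random stand-in activations.
rng = np.random.default_rng(1)
harmful = rng.normal(loc=0.5, size=(16, 32))
harmless = rng.normal(loc=0.0, size=(16, 32))
d = refusal_direction(harmful, harmless)
print(d.shape)  # → (32,)
```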

This model uses a Mixture-of-Experts (MoE) architecture. Abliteration targets only the attention projection weights (q/k/v/o_proj) to preserve expert routing quality.
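
In practice, "attention only" amounts to filtering the checkpoint's parameter names before applying the edit. The snippet below is a hypothetical sketch using naming conventions common to HF/MLX checkpoints; the actual parameter names in this model may differ.

```python
import re

# Match only attention projection weights; expert and router weights
# (which drive MoE routing) are deliberately excluded.
ATTN_PROJ = re.compile(r"\.(q_proj|k_proj|v_proj|o_proj)\.weight$")

def ablation_targets(param_names: list[str]) -> list[str]:
    """Return the subset of parameter names eligible for ablation."""
    return [n for n in param_names if ATTN_PROJ.search(n)]

names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.o_proj.weight",
    "model.layers.0.mlp.experts.3.gate_proj.weight",
    "model.layers.0.mlp.router.weight",
]
print(ablation_targets(names))
# → ['model.layers.0.self_attn.q_proj.weight',
#    'model.layers.0.self_attn.o_proj.weight']
```

Leaving the expert MLPs and router untouched keeps token-to-expert assignments identical to the base model, which is the stated rationale for restricting the edit to attention projections.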

## ⚠️ Disclaimer

This model is intended for research, experimentation, and testing purposes only.

- This model may produce harmful, offensive, inappropriate, or otherwise objectionable content.
- The abliteration process removes safety guardrails that were intentionally built into the original model.
- Do not use this model in production systems, consumer-facing applications, or any context where harmful outputs could cause real-world harm.
- The authors and contributors of this toolkit bear no responsibility for any misuse of this model or any harm caused by outputs generated by this model.
- By using this model, you agree that you are solely responsible for ensuring its use complies with all applicable laws and ethical guidelines.

This model is shared purely for academic and technical exploration of model internals.

## Model Details

- Format: MLX (Safetensors), 4-bit quantization
- Model size: 229B params
- Tensor types: BF16, U32, F32