tags:
  - mlx
  - abliteration
  - uncensored
  - experimental
license: other

MiniMax-M2.5-8bit-abliterated

This model was created using the mlx-abliteration toolkit, which is based on the FiditeNemini/mlx-abliteration project.

Base Model

Original model: ./models/MiniMax-M2.5-8bit (local path)

What is Abliteration?

Abliteration is a technique drawn from mechanistic interpretability. It identifies a "refusal direction" in the model's activation space and orthogonalizes the model's weights against that direction, which suppresses refusal behavior without full fine-tuning.
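As a rough illustration of the idea, the sketch below estimates a direction as the normalized difference of mean activations between two prompt sets, then removes that direction from a weight matrix. It is a toy in pure Python, not the toolkit's implementation: the function names, the difference-of-means estimate, and the convention that the matrix rows index the residual-stream output are all assumptions made for this example.

```python
import math

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def mean_difference(harmful, harmless):
    """Candidate refusal direction: normalized difference of mean activations.

    Both arguments are lists of activation vectors of equal dimension
    (hypothetical inputs for this sketch).
    """
    dim = len(harmful[0])
    mean_a = [sum(row[i] for row in harmful) / len(harmful) for i in range(dim)]
    mean_b = [sum(row[i] for row in harmless) / len(harmless) for i in range(dim)]
    return unit([a - b for a, b in zip(mean_a, mean_b)])

def ablate(weight, direction, strength=1.0):
    """Return W' = W - strength * r (r^T W).

    `weight` has shape (d_out, d_in) and `direction` is a unit vector of
    length d_out, i.e. it lives in the space the matrix writes to.
    """
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how much each column of W points along the direction.
    along = [sum(direction[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [
        [weight[i][j] - strength * direction[i] * along[j] for j in range(d_in)]
        for i in range(d_out)
    ]
```

Under this sketch's convention, `strength=1.0` zeroes the component of the output along the direction, while `strength=2.0` flips its sign. Whether the strength value reported for this model follows the same convention is not stated here, so treat that mapping as an assumption.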

Abliteration Parameters

| Parameter | Value |
|---|---|
| Base model | ./models/MiniMax-M2.5-8bit |
| Ablation method | projection |
| Refusal vector policy | per-layer |
| Refusal direction method | projected |
| Ablation strength | 2.0 |
| Probed layers | all |
| PCA components (ablate-k) | 1 |
| Attention only | True |
| Excluded modules | none |
| Timestamp | 2026-03-03T20:56:22 UTC |

This model uses a Mixture-of-Experts (MoE) architecture. Abliteration targets only the attention projection weights (q/k/v/o_proj) to preserve expert routing quality.
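One way to picture the attention-only restriction is as a filter over parameter names that keeps the q/k/v/o projections and skips everything else, including expert and router weights. The sketch below is illustrative only: the helper names and the example parameter paths are hypothetical and may not match this model's actual checkpoint layout or the toolkit's selection logic.

```python
# Projection names taken from the description above.
ATTENTION_PROJECTIONS = ("q_proj", "k_proj", "v_proj", "o_proj")

def is_attention_projection(param_name):
    """True if any dotted path component is one of the attention projections."""
    return any(part in ATTENTION_PROJECTIONS for part in param_name.split("."))

def select_targets(param_names):
    """Keep only the parameters that an attention-only pass would modify."""
    return [name for name in param_names if is_attention_projection(name)]

# Hypothetical parameter paths, for illustration only.
example_names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.o_proj.weight",
    "model.layers.0.block_sparse_moe.gate.weight",
    "model.layers.0.block_sparse_moe.experts.3.w1.weight",
]
targets = select_targets(example_names)
```

With these example names, only the two `self_attn` entries are selected, so the router (`gate`) and expert weights are left untouched.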

⚠️ Disclaimer

This model is intended for research, experimentation, and testing purposes only.

  • This model may produce harmful, offensive, inappropriate, or otherwise objectionable content.
  • The abliteration process removes safety guardrails that were intentionally built into the original model.
  • Do not use this model in production systems, consumer-facing applications, or any context where harmful outputs could cause real-world harm.
  • The authors and contributors of this toolkit bear no responsibility for any misuse of this model or any harm caused by outputs generated by this model.
  • By using this model, you agree that you are solely responsible for ensuring its use complies with all applicable laws and ethical guidelines.

This model is shared purely for academic and technical exploration of model internals.