tags:
  - mlx
  - abliteration
  - uncensored
  - experimental
license: other

MiniMax-M2.5-8bit-abliterated

This model was created using the mlx-abliteration toolkit, which is based on the FiditeNemini/mlx-abliteration project.

Base Model

Original model: ./models/MiniMax-M2.5-8bit (local path)

What is Abliteration?

Abliteration is a mechanistic interpretability technique that identifies the "refusal direction" in a model's activation space and orthogonalizes the model's weights against it, removing refusal behavior without full fine-tuning.
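
To make the idea concrete, here is a minimal NumPy sketch of one common formulation: take the difference of mean activations between two prompt sets as the refusal direction, then project that direction out of a weight matrix that writes into the residual stream. The function names, the synthetic activations, and the difference-of-means estimator are assumptions made for illustration; this is not the toolkit's actual code.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of mean activations (assumed estimator)."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def orthogonalize(weight, direction):
    """Remove the part of the layer output that lies along `direction`.

    weight has shape (d_model, d_in) and the layer computes weight @ x,
    so the projection is applied on the output side.
    """
    return weight - np.outer(direction, direction @ weight)

rng = np.random.default_rng(0)
d_model, d_in = 8, 4

# Synthetic stand-ins for activations collected on two prompt sets.
harmful = rng.normal(size=(16, d_model)) + 1.0
harmless = rng.normal(size=(16, d_model))

r = refusal_direction(harmful, harmless)
W = rng.normal(size=(d_model, d_in))
W_abl = orthogonalize(W, r)

# After orthogonalization the layer can no longer write along r.
print(np.abs(r @ W_abl).max())
```

Weights that read from the residual stream rather than write to it would need the projection on the input side instead; the sketch covers only the output-side case.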

Abliteration Parameters

Parameter	Value
Base model	`./models/MiniMax-M2.5-8bit`
Ablation method	`projection`
Refusal vector policy	`per-layer`
Refusal direction method	`projected`
Ablation strength	`2.0`
Probed layers	`all`
PCA components (ablate-k)	`1`
Attention only	`True`
Excluded modules	`none`
Timestamp	2026-03-03T20:56:22 UTC
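
As a rough guide to what an ablation strength of `2.0` could mean under the `projection` method, the sketch below scales the projected-out component by a strength factor. The update rule `W - strength * r r^T W` and the helper name `ablate` are assumptions, not the toolkit's confirmed formula. Under that rule, a strength of 1.0 zeroes the component along the refusal direction, while 2.0 reverses its sign.

```python
import numpy as np

def ablate(weight, direction, strength=1.0):
    """Subtract `strength` times the component of the output along `direction`.

    Hypothetical update rule: W' = W - strength * r r^T W, with r normalized.
    """
    unit = direction / np.linalg.norm(direction)
    return weight - strength * np.outer(unit, unit @ weight)

rng = np.random.default_rng(1)
W = rng.normal(size=(6, 3))   # synthetic weight, shape (d_model, d_in)
r = rng.normal(size=6)        # synthetic refusal direction
r_unit = r / np.linalg.norm(r)

before = r_unit @ W
after_full = r_unit @ ablate(W, r, strength=1.0)    # component removed
after_double = r_unit @ ablate(W, r, strength=2.0)  # component sign-flipped

print(before)
print(after_full)
print(after_double)
```

With a strength of 2.0 the update is a reflection across the hyperplane orthogonal to the direction, so it preserves the Frobenius norm of the weight matrix while pushing outputs away from the refusal direction instead of merely removing it.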

This model uses a Mixture-of-Experts (MoE) architecture. Abliteration targets only the attention projection weights (q_proj, k_proj, v_proj, o_proj) and leaves the router and expert weights untouched, to preserve expert routing quality.
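
The attention-only restriction can be pictured as a filter over parameter names. The sketch below is illustrative only: the parameter names in `names` are hypothetical examples of a typical MoE layout, not names read from this checkpoint, and `is_attention_projection` is not a function from the toolkit.

```python
ATTENTION_PROJECTIONS = ("q_proj", "k_proj", "v_proj", "o_proj")

def is_attention_projection(param_name):
    """True if any dotted component of the name is an attention projection."""
    return any(part in ATTENTION_PROJECTIONS for part in param_name.split("."))

# Hypothetical parameter names for a single MoE transformer layer.
names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.o_proj.weight",
    "model.layers.0.block_sparse_moe.gate.weight",
    "model.layers.0.block_sparse_moe.experts.3.w1.weight",
]

# Only these weights would be modified; router and expert weights are skipped.
targets = [name for name in names if is_attention_projection(name)]
print(targets)
```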

⚠️ Disclaimer

This model is intended for research, experimentation, and testing purposes only.

This model may produce harmful, offensive, inappropriate, or otherwise objectionable content.
The abliteration process removes safety guardrails that were intentionally built into the original model.
Do not use this model in production systems, consumer-facing applications, or any context where harmful outputs could cause real-world harm.
The authors and contributors of this toolkit bear no responsibility for any misuse of this model or any harm caused by outputs generated by this model.
By using this model, you agree that you are solely responsible for ensuring its use complies with all applicable laws and ethical guidelines.

This model is shared purely for academic and technical exploration of model internals.