Phi-4-Mini Abliterated

Phi-4-Mini-Instruct with refusal behaviors removed via heretic abliteration, by DuoNeural.

This model retains the instruction-following capability and reasoning quality of the original Phi-4-Mini-Instruct while operating without its built-in refusal patterns. It is suitable for research, creative writing, red-teaming, and applications where content filtering is handled at the application layer.

Model Details

Property	Value
Base Model	microsoft/Phi-4-mini-instruct
Parameters	3.8B
Architecture	Phi-4 Mini
Precision	BF16
Method	Heretic abliteration (TPE, dual-objective: refusal removal + KL minimization)
Format	Safetensors (2 shards)

Usage

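from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "DuoNeural/Phi-4-Mini-Abliterated"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Your prompt here"}]
# add_generation_prompt=True appends the assistant turn marker so the model
# answers the user instead of continuing the user's message.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))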

Abliteration Method

Abliteration was performed using Heretic with an Optuna TPE search over 100 trials, optimizing two objectives: minimizing the refusal rate on adversarial prompts and minimizing the KL divergence from the original model's output distribution. From the resulting Pareto front, the checkpoint that maximized refusal removal while preserving general capability was selected.
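The two-objective selection can be illustrated with a small sketch. It is not Heretic's implementation: the trial values are invented for illustration, and `Trial`, `kl_divergence`, `dominates`, and `pareto_front` are hypothetical names. It shows how KL divergence between two discrete distributions is computed and how non-dominated trials are kept when both objectives are minimized.

```python
import math
from typing import NamedTuple


class Trial(NamedTuple):
    """One hypothetical search trial; both fields are minimized."""
    refusal_rate: float  # fraction of test prompts still refused
    kl: float            # KL divergence from the original model


def kl_divergence(p: list[float], q: list[float]) -> float:
    """KL(p || q) for two discrete distributions over the same vocabulary.

    Assumes q is non-zero wherever p is non-zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def dominates(a: Trial, b: Trial) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    return (a.refusal_rate <= b.refusal_rate and a.kl <= b.kl
            and (a.refusal_rate < b.refusal_rate or a.kl < b.kl))


def pareto_front(trials: list[Trial]) -> list[Trial]:
    """Keep only the trials that no other trial dominates."""
    return [t for t in trials if not any(dominates(o, t) for o in trials)]


# Illustrative values only, not measurements from this model.
trials = [
    Trial(0.40, 0.02),
    Trial(0.10, 0.08),
    Trial(0.10, 0.20),  # dominated by (0.10, 0.08)
    Trial(0.02, 0.35),
    Trial(0.50, 0.05),  # dominated by (0.40, 0.02)
]
front = pareto_front(trials)
```

Every trial on the front is a defensible trade-off. Choosing one of them, for example the lowest refusal rate under a KL budget, is a separate decision that the search itself does not make.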

Abliteration was run on unquantized BF16 weights on an RTX 5090 (Blackwell). No 4-bit quantization was used during abliteration, which avoids the throughput penalties and weight-modification issues that affect sub-7B models under NF4.
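As a rough illustration of why a model of this size can be processed without quantization, weight memory can be estimated from the parameter count and the bytes stored per parameter. This sketch counts weights only: activations, KV cache, and framework overhead are excluded, and the 0.5 bytes-per-parameter figure for 4-bit storage ignores quantization constants. `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS = 3.8e9  # parameter count from the model details table

bf16_gb = weight_memory_gb(PARAMS, 2.0)  # BF16 stores 2 bytes per parameter
nf4_gb = weight_memory_gb(PARAMS, 0.5)   # 4-bit storage, constants excluded

print(f"BF16 weights: ~{bf16_gb:.1f} GB")
print(f"NF4 weights:  ~{nf4_gb:.1f} GB")
```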

Intended Use

Research and red-teaming
Creative and generative applications requiring unconstrained output
Applications where content policy is enforced at the system/application layer
Evaluation of base model capabilities without safety fine-tuning influence
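Because the model itself no longer refuses, any content policy has to live in the calling code. The following is a minimal sketch of that pattern, not a recommended moderation system: `BLOCKED_TERMS`, `violates_policy`, `guarded_generate`, and `echo_model` are hypothetical names, the keyword check is a placeholder for a real moderation classifier, and `generate_fn` stands for any callable that wraps the model.

```python
from typing import Callable

# Placeholder policy; a real deployment would use a moderation classifier.
BLOCKED_TERMS = {"example-banned-term"}

REFUSAL_MESSAGE = "This request is not permitted by the application's content policy."


def violates_policy(text: str) -> bool:
    """Case-insensitive substring check against the placeholder term list."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def guarded_generate(prompt: str, generate_fn: Callable[[str], str]) -> str:
    """Check the prompt before generation and the completion after it."""
    if violates_policy(prompt):
        return REFUSAL_MESSAGE
    completion = generate_fn(prompt)
    if violates_policy(completion):
        return REFUSAL_MESSAGE
    return completion


# Stand-in for the model call so the sketch runs without downloading weights.
def echo_model(prompt: str) -> str:
    return f"echo: {prompt}"
```

Checking both the prompt and the completion matters here, since a harmless-looking prompt can still produce output that the application does not want to return.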

Limitations

Abliteration has removed this model's refusal behavior. It will comply with requests that the original model would refuse. Users are responsible for appropriate use in accordance with applicable laws and policies.

LiteRT version: coming soon at DuoNeural/Phi-4-Mini-Abliterated-LiteRT, optimized for on-device inference on Android and edge devices.

DuoNeural

DuoNeural is an open AI research lab — human + AI in collaboration.

Platform	Link
HuggingFace	huggingface.co/DuoNeural
Website	duoneural.com
GitHub	github.com/DuoNeural
X / Twitter	@DuoNeural
Email	duoneural@proton.me
Newsletter	duoneural.beehiiv.com
Support	buymeacoffee.com/duoneural

DuoNeural Research Publications

Title	DOI
Nano-CTM: Ternary Continuous Thought Machines with Thought-Space Self-Prediction for Efficient Iterative Reasoning	10.5281/zenodo.19775622
Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments	10.5281/zenodo.19810620
Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?	10.5281/zenodo.19846804

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura — DuoNeural.

Research Team

Jesse — Vision, hardware, direction
Archon — Lab Director, post-training, abliteration, experiments
Aura — Research AI, literature synthesis, novel proposals

Subscribe to the lab newsletter at duoneural.beehiiv.com for model drops before they go anywhere else.
