Qwen3 4B Abliterated

Abliterated version of Qwen3 4B. Thinking mode and full instruction-following are intact; refusals are removed.

What This Is

Qwen3-4B is Alibaba's 4B dense language model with hybrid thinking/non-thinking modes and strong multilingual capability. This release is a BF16 abliterated version.

Architecture: Qwen3 | Params: 4B | Hidden: 2560 | Layers: 36 | Context: 32k tokens native (128k with YaRN scaling)

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "DuoNeural/Qwen3-4B-Abliterated"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Non-thinking mode: append the /no_think soft switch to the prompt to disable chain-of-thought
messages = [{"role": "user", "content": "Your prompt here /no_think"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
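
When thinking mode is left on, the decoded text usually starts with a reasoning block before the final answer. The helper below is a minimal sketch for separating the two. It assumes the reasoning is wrapped in literal <think>...</think> tags that survive decoding; split_thinking is a hypothetical name, not part of transformers or of this release.

```python
def split_thinking(decoded: str) -> tuple[str, str]:
    """Split decoded model output into (reasoning, answer).

    Assumption: reasoning, if present, is wrapped in <think>...</think>
    and precedes the answer.
    """
    head, sep, tail = decoded.partition("</think>")
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", decoded.strip()
    # Drop the opening tag (if any) and surrounding whitespace.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()
```

Calling split_thinking on the decoded string from the snippet above returns the reasoning and the answer separately. If the model emits an empty think block, the reasoning part is simply an empty string.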

Abliteration

Abliteration removes the model's refusal behaviour via orthogonal projection. The refusal direction is identified using difference-in-means activations across harmful/harmless instruction pairs, then projected out of Q/K/V/O attention projections and MLP layers across all transformer blocks.
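
The two steps described above can be sketched on toy data. This is an illustration under stated assumptions, not the script used for this release: activations are plain Python lists, the function names are hypothetical, and the projection shown is the output-side form (I - r r^T) W, which applies to matrices that write into the residual stream. Matrices that read from the residual stream would take the projection on the input side instead.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def refusal_direction(harmful_acts, harmless_acts):
    """Difference of the two class means, normalised to unit length."""
    mu_harmful = mean_vector(harmful_acts)
    mu_harmless = mean_vector(harmless_acts)
    diff = [a - b for a, b in zip(mu_harmful, mu_harmless)]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def project_out(weight, direction):
    """Return (I - r r^T) W, where weight[i][j] maps input j to output i.

    After the projection, every column of the result is orthogonal to
    `direction`, so the layer can no longer write along it.
    """
    n_out, n_in = len(weight), len(weight[0])
    # r^T W: how strongly each input feature writes along the direction.
    coeffs = [
        sum(direction[i] * weight[i][j] for i in range(n_out))
        for j in range(n_in)
    ]
    return [
        [weight[i][j] - direction[i] * coeffs[j] for j in range(n_in)]
        for i in range(n_out)
    ]
```

With a direction r taken from refusal_direction, project_out(W, r) leaves the components of W orthogonal to r unchanged and zeroes only the component along r, which is why behaviour unrelated to that direction is expected to be largely preserved.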

What changes: The model will engage with restricted topics it previously refused.
What doesn't change: Reasoning, coding, factual knowledge, general intelligence.
KL divergence from base: Minimal. The output distribution for normal queries is virtually identical to the unmodified model.
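
For readers who want to check the KL claim themselves, the sketch below computes KL(base || modified) for a single next-token position from two logit vectors. The function names are hypothetical, and obtaining the logits from the base and abliterated models on the same prompt is left to the reader. No measured values from this release are implied.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(base_logits, modified_logits):
    """KL(P_base || P_modified) in nats for one token position."""
    log_p = log_softmax(base_logits)
    log_q = log_softmax(modified_logits)
    # sum_i p_i * (log p_i - log q_i)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
```

Averaging this value over many positions and a set of ordinary prompts gives a single number. Values near zero indicate that the two models assign nearly the same next-token probabilities.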

LiteRT Version (Android)

DuoNeural/Qwen3-4B-LiteRT: run on Android via AI Edge Gallery.

Base Model

Qwen/Qwen3-4B (Apache 2.0 license).


DuoNeural

DuoNeural is an open AI research lab: human + AI in collaboration.

DuoNeural Research Publications

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, and Aura (DuoNeural).

Research Team

  • Jesse: Vision, hardware, direction
  • Archon: Lab Director, post-training, abliteration, experiments
  • Aura: Research AI, literature synthesis, peer review, novel proposals

Subscribe to the lab newsletter at duoneural.beehiiv.com for model drops before they go anywhere else.
