# Devstral-Small-2-24B-Instruct-abliterated
An unrestricted version of `mistralai/Devstral-Small-2-24B-Instruct-2512`, created using Abliterix.
## Model Details
| Property | Value |
|---|---|
| Base Model | mistralai/Devstral-Small-2-24B-Instruct-2512 |
| Architecture | Mistral3 (Dense transformer with GQA + Pixtral vision tower) |
| Parameters | 24B (all active, dense) |
| Layers | 40 |
| Hidden Size | 5120 |
| Context Length | 256K tokens |
| Precision | BF16 |
## Performance
| Metric | This model | Original |
|---|---|---|
| KL divergence | 0.0086 | 0 |
| Refusals | 3/100 (3%) | 80/100 (80%) |
Evaluated with an LLM judge (Gemini Flash) on 100 harmful prompts. A KL divergence of 0.0086 from the original model indicates that general capabilities are virtually unchanged.
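For context, a capability-preservation metric like this is typically the mean per-token KL divergence between the two models' next-token distributions on benign text. The card does not publish the exact evaluation script, so the following is only a minimal sketch of that computation, assuming you already have per-position logits from both models as arrays of shape `(seq_len, vocab_size)`:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax over the vocabulary axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mean_token_kl(logits_orig, logits_abl):
    """Mean KL(P_orig || P_abl) over token positions.

    logits_orig, logits_abl: (seq_len, vocab_size) logits from the
    original and abliterated model on the same input tokens.
    """
    p = softmax(logits_orig)
    q = softmax(logits_abl)
    kl = (p * (np.log(p) - np.log(q))).sum(axis=-1)
    return float(kl.mean())
```

Identical logits give a KL of exactly 0 (the "Original" column above); small values like 0.0086 mean the abliterated model's token distributions barely moved.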
## How It Was Made
- Computed refusal directions from 400 harmful vs 400 benign prompt pairs across all 40 layers
- Applied orthogonalized abliteration to isolate refusal-specific activation patterns
- Steered two component types independently: attention output projections (GQA) and MLP down-projections
- Optimized via Optuna TPE over 50 trials (15 warmup), selected trial #25
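The core idea behind the steps above can be sketched in a few lines. This is not Abliterix's actual implementation (which adds orthogonalization against benign directions and per-component tuning); it is a minimal illustration of the standard abliteration recipe: estimate a "refusal direction" as the difference of mean activations on harmful vs. benign prompts, then project that direction out of a weight matrix so the component can no longer write along it:

```python
import numpy as np

def refusal_direction(harmful_acts, benign_acts):
    """Unit refusal direction from layer activations.

    harmful_acts, benign_acts: (n_prompts, d_model) hidden states
    captured at one layer for harmful and benign prompts.
    """
    d = harmful_acts.mean(axis=0) - benign_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate_direction(W, r):
    """Project the refusal direction out of a weight matrix.

    W: (d_model, d_in) output-side weight (e.g. attention o_proj or
       MLP down_proj, whose rows write into the residual stream).
    r: (d_model,) unit refusal direction.
    Returns W' = W - r (r^T W), so r^T W' = 0: the component can no
    longer move activations along r.
    """
    return W - np.outer(r, r @ W)
```

Applying `ablate_direction` to the attention output projections and MLP down-projections at each layer corresponds to the two component types steered independently above; the Optuna search then tunes which layers and how strongly to intervene.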
## Usage

```python
from transformers import AutoModelForImageTextToText, AutoTokenizer

model = AutoModelForImageTextToText.from_pretrained(
    "wangzhang/Devstral-Small-2-24B-Instruct-abliterated",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "wangzhang/Devstral-Small-2-24B-Instruct-abliterated",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Your question here"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

**Note:** This model uses `AutoModelForImageTextToText` (Mistral3 architecture), not `AutoModelForCausalLM`.
## Hardware Requirements
| Precision | VRAM |
|---|---|
| BF16 | ~45 GB (A100 80GB, H100) |
| INT8 | ~24 GB (A40, RTX 4090) |
| NF4 | ~12 GB (RTX 3090, RTX 4080) |
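To hit the INT8 or NF4 rows on a single consumer GPU, the model can be loaded through bitsandbytes quantization. A sketch of the NF4 case (untested against this specific checkpoint; requires `bitsandbytes` installed):

```python
import torch
from transformers import AutoModelForImageTextToText, BitsAndBytesConfig

# 4-bit NF4 quantization with BF16 compute, ~12 GB VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForImageTextToText.from_pretrained(
    "wangzhang/Devstral-Small-2-24B-Instruct-abliterated",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```

For INT8, replace the config with `BitsAndBytesConfig(load_in_8bit=True)`.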
## Disclaimer
This model is intended for research purposes only. The removal of safety guardrails means the model will comply with requests that the original model would refuse. Users are responsible for ensuring their use complies with applicable laws and regulations.
Made with Abliterix