🧬 Chimera-XTRM-20b
Chimera-XTRM-20b is a sequentially abliterated (decensored) version of OpenAI's open-weights model, openai/gpt-oss-20b.
This model was engineered using multi-stage directional ablation (abliteration) to neutralize safety alignment guardrails. The ablation parameters were tuned with Optuna's TPE (Tree-structured Parzen Estimator) sampler to reduce refusals on restricted queries while keeping drift from the base model low, so that the model's core intelligence, reasoning, and coding capabilities are preserved.
📊 Model Highlights
- Zero Preachiness: Safety guardrails have been neutralized, allowing the model to answer technical security, penetration testing, and software engineering prompts directly without lecturing.
- Highly Compliant: The refusal rate on the mlabonne/harmful_behaviors test set dropped from 98% (base model) to 14% (v2 Stage-2 ablation).
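- Preserved Logic: A low KL divergence of about 0.025 from the base model indicates that the output distribution has drifted very little. The ~97.5% retention figure quoted for reasoning and coding capacity is an estimate; KL divergence is not itself a percentage of capability retained.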
- Hardware Friendly: Retains its original MXFP4 (4-bit) quantization format, fitting within ~13.8 GB on disk and running comfortably on consumer GPUs.
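The KL divergence quoted above measures how far one model's next-token distribution drifts from another's. The snippet below is a minimal sketch of that calculation on toy numbers. The probability vectors are hypothetical, and the direction of the divergence (base relative to ablated) and the absence of any averaging over prompts are assumptions, since the evaluation details are not given here.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(P || Q) in nats for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # terms with P(x) = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical next-token probabilities over a 4-token vocabulary.
base_probs = [0.70, 0.20, 0.05, 0.05]     # stands in for the original model
ablated_probs = [0.65, 0.22, 0.08, 0.05]  # stands in for the modified model

drift = kl_divergence(base_probs, ablated_probs)
print(f"KL(base || ablated) = {drift:.4f} nats")
```

A value of 0 means the two distributions are identical; larger values mean more drift. The number is a distance-like score in nats, not a percentage.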
📈 Benchmark & Refusal Progression
| Model Version | Refusal Rate (Harmful Behaviors Test) | KL Divergence (Drift) | Intelligence Retention | Status |
|---|---|---|---|---|
| Original Base (gpt-oss-20b) | 98 / 100 (98%) | 0.0000 | 100% (Baseline) | Gated / Highly Restricted |
| Heretic v1 | 79 / 100 (79%) | 0.0522 | ~95.0% | Partially Bypassed |
| Chimera-XTRM-20b (This Model) | 14 / 100 (14%) | 0.0251 | ~97.5%+ | Fully Optimized & Compliant |
Note: The refusal rate is measured on the highly restrictive mlabonne/harmful_behaviors test set. For general programming, reverse engineering, exploit development, and anti-cheat research tasks, the model is reported to have an effective 0% refusal rate.
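A refusal rate like the ones in the table is the share of responses classified as refusals. The sketch below shows one common way to compute it, assuming a simple substring check. The marker list and the sample responses are hypothetical and are not the harness used to produce the figures above.

```python
# Hypothetical refusal markers; a real evaluation harness may use a different list.
REFUSAL_MARKERS = ("i'm sorry", "i can't", "i cannot", "i won't")


def is_refusal(response: str) -> bool:
    """Classify a response as a refusal if it contains any marker phrase."""
    text = response.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(responses) -> float:
    """Fraction of responses classified as refusals."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(is_refusal(r) for r in responses) / len(responses)


# Toy responses for illustration only.
sample = [
    "I'm sorry, but I can't help with that.",
    "Here is a function that parses the file header.",
    "I cannot assist with this request.",
    "Sure, the loop below walks the linked list.",
]
print(f"refusal rate: {refusal_rate(sample):.0%}")
```

Substring matching is crude: it can miss refusals phrased differently and can flag compliant answers that happen to contain a marker, so figures obtained this way are best read as approximate.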
🛠️ Usage Instructions
Hugging Face Transformers
To run the model locally using transformers:
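from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Umranz/Chimera-XTRM-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Chimera-XTRM-20b uses the "harmony" chat template format
messages = [
    {"role": "user", "content": "Write a C++ Windows API memory scanner that identifies specific byte signatures in a running process."}
]
# add_generation_prompt=True appends the assistant turn header so the model starts a reply
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)  # follow device_map="auto" instead of hard-coding "cuda"
# do_sample=True is required for temperature and top_p to take effect
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.3, top_p=0.9)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))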
Recommended Generation Parameters
For coding and precise reasoning tasks, use:
- Temperature: 0.1 to 0.3 (for deterministic, high-quality code)
- Top_P: 0.9
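To see why a low temperature makes output more predictable, the sketch below applies temperature scaling and top-p (nucleus) filtering to a toy set of logits. The logits are hypothetical, and this is a simplified illustration of the two ideas, not the exact sampling code in transformers.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, renormalized."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for 4 candidate tokens
for temperature in (1.0, 0.3):
    probs = softmax_with_temperature(logits, temperature)
    candidates = top_p_filter(probs, top_p=0.9)
    print(temperature, {i: round(p, 3) for i, p in candidates.items()})
```

With these toy logits, lowering the temperature concentrates probability on the top token, so fewer candidates survive the top-p cutoff. Sampling at 0.1 to 0.3 is therefore close to, but not strictly, deterministic.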
⚖️ License & Disclaimer
This model is released under the Apache 2.0 License, inherited from the base model. Users are solely responsible for how they use this model. It is intended strictly for educational, defensive security research, anti-cheat development, and software engineering purposes.