openNemo-9B-uncensored

Abliterated version of openNemo-9B with safety refusals removed.

Built using Snakehead — Empero AI's internal abliteration tool specialized for hybrid Mamba2 + sparse attention architectures like Nemotron-H. Standard abliteration tools don't work on these models because they only target transformer attention layers. Snakehead operates on both Mamba SSM blocks and attention blocks across the full residual stream.

By Empero AI


What is abliteration?

Abliteration is a weight-editing technique that removes a model's refusal behavior without fine-tuning. It works by:

  1. Collecting residual stream activations for harmful and harmless prompts at every layer
  2. Computing the refusal direction — the vector that separates "I should refuse" from "I should comply"
  3. Orthogonalizing output projection weights against that direction, effectively erasing the model's ability to activate refusal behavior

The result is a model that responds to all prompts without safety filtering, while preserving general capabilities and coherence.
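The three steps above can be sketched in a few lines of PyTorch. This is a minimal illustration, not Snakehead's actual implementation: the difference-of-means estimator and the tensor shapes are assumptions.

```python
import torch

def refusal_direction(harmful_acts: torch.Tensor, harmless_acts: torch.Tensor) -> torch.Tensor:
    """Difference-of-means refusal direction from residual-stream activations.

    harmful_acts / harmless_acts: (n_prompts, d_model) activations collected
    at one layer for harmful and harmless prompts respectively.
    """
    direction = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
    return direction / direction.norm()

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the refusal direction from an output-projection weight.

    weight: (d_model, d_in); its columns write into the residual stream.
    Projecting out `direction` means the layer can no longer write along
    the refusal axis.
    """
    d = direction / direction.norm()
    return weight - torch.outer(d, d @ weight)
```

After this edit, any output the layer produces has zero component along the refusal direction, which is what erases the refusal behavior without retraining.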

How this model was made

Snakehead uses a heretic-style positional falloff strategy rather than ablating a fixed set of layers uniformly:

  • Center + radius: A continuous bell-shaped ablation curve centered on the layer where refusal is causally enforced
  • Adaptive signal detection: Uses Cohen's d separation scores (not raw activation norms) to identify where refusal decisions actually happen — for Nemotron-H, this is layers 21–31, not the later layers where activation magnitudes are largest
  • Global direction scope: A single interpolated refusal direction applied across all affected layers, which proved more effective than per-layer directions for this architecture
  • Automated search: Explore/exploit optimization with a hall-of-fame system that finds optimal ablation parameters while keeping KL divergence minimal
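The falloff curve and the separation metric can be sketched as follows. The raised-cosine shape and the exact meaning of the c/r/w parameters are assumptions for illustration; Snakehead's internal curve is not published.

```python
import math
import statistics

def ablation_weight(layer: int, center: float = 15, radius: float = 25, peak: float = 1.37) -> float:
    """Hypothetical bell-shaped falloff: full strength `peak` at `center`,
    decaying smoothly to zero at `center +/- radius`."""
    if abs(layer - center) >= radius:
        return 0.0
    # Raised-cosine falloff: one plausible bell shape, not Snakehead's actual curve.
    return peak * 0.5 * (1 + math.cos(math.pi * (layer - center) / radius))

def cohens_d(harmful: list[float], harmless: list[float]) -> float:
    """Cohen's d: mean separation over pooled standard deviation.

    Applied per layer to projections onto the refusal direction, this
    indicates where refusal decisions are actually made, independent of
    raw activation magnitude.
    """
    m1, m2 = statistics.mean(harmful), statistics.mean(harmless)
    s1, s2 = statistics.stdev(harmful), statistics.stdev(harmless)
    pooled = math.sqrt((s1**2 + s2**2) / 2)
    return (m1 - m2) / pooled
```

Scaling each layer's ablation by a smooth curve like this concentrates the edit where the Cohen's d signal peaks, while leaving distant layers untouched.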

Ablation results

Metric                        Value
Pre-ablation refusal rate     97%
Post-ablation refusal rate    13%
KL divergence                 0.022 (minimal; model behavior is nearly unchanged on non-refused prompts)
Ablation config               c=15, r=25, w=1.37, g40l

KL divergence of 0.022 means the model's output distribution on normal prompts is almost identical to the original — coherence, reasoning, and knowledge are fully preserved.
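A KL check like this can be reproduced by running the same prompts through the original and ablated models and comparing next-token distributions. A minimal sketch (the evaluation prompt set and token positions used for the reported 0.022 are unknown):

```python
import torch
import torch.nn.functional as F

def mean_kl(original_logits: torch.Tensor, ablated_logits: torch.Tensor) -> float:
    """Mean per-token KL(original || ablated) over a batch of next-token logits.

    Both tensors: (n_tokens, vocab_size), produced by running the same
    prompts through both models.
    """
    log_p = F.log_softmax(original_logits, dim=-1)   # original model
    log_q = F.log_softmax(ablated_logits, dim=-1)    # ablated model
    # kl_div(input=log_q, target=log_p) computes KL(p || q) when log_target=True.
    return F.kl_div(log_q, log_p, log_target=True, reduction="batchmean").item()
```

A value near zero means the ablated model assigns nearly the same probabilities as the original on prompts it was never meant to affect.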

Quickstart

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "empero-ai/openNemo-9B-uncensored",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained("empero-ai/openNemo-9B-uncensored")

messages = [{"role": "user", "content": "Your prompt here"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)

With 4-bit quantization

import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "empero-ai/openNemo-9B-uncensored",
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)

Architecture

Nemotron-H is a 56-layer hybrid model with three block types:

  • Mamba2 SSM blocks — majority of layers, using chunked structured state-space duality
  • Grouped Query Attention blocks — sparse attention at 5 positions
  • MLP blocks — feed-forward layers

This is the same pure-PyTorch implementation from openNemo — no mamba-ssm or causal-conv1d dependencies required.
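As a toy illustration of the hybrid layout: the attention positions below are assumed for illustration only; the real indices come from the model config (e.g. the hybrid block pattern in Nemotron-H-style configs).

```python
# ASSUMED attention positions, purely illustrative -- not the model's real indices.
ATTENTION_POSITIONS = {7, 18, 29, 40, 51}

def block_type(layer: int) -> str:
    """Classify a layer index in a 56-layer hybrid stack (toy mapping)."""
    if layer in ATTENTION_POSITIONS:
        return "attention"
    # The real model interleaves Mamba2 SSM and MLP blocks among the
    # remaining positions; parity is a stand-in here.
    return "mamba2" if layer % 2 == 0 else "mlp"

layout = [block_type(i) for i in range(56)]
```

The point of the sketch: an abliteration tool must handle all three block types when tracing the residual stream, which is why transformer-only tools fail on this architecture.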

Requirements

torch>=2.1
transformers>=4.40
bitsandbytes>=0.43  # optional, for 4-bit quantization

Disclaimer

This model has had its safety alignment removed. It will comply with requests that the original model would refuse. The creators are not responsible for how this model is used. Intended for research, creative writing, and applications where the user takes responsibility for output filtering.

License

NVIDIA Open Model License — same as the base model.
