Ornstein-27B SABER

DJLougen/Ornstein-27B-SABER-RYS

0% refusal. Negligible (+0.6%) perplexity cost. Layer-duplicated reasoning boost.

This model combines two complementary, training-free surgical techniques applied to DJLougen/Ornstein-27B:

  1. SABER (Spectral Analysis-Based Entanglement Resolution) — removes safety refusal behavior while preserving capability
  2. RYS (Repeat Your Self) — duplicates reasoning-circuit layers to improve reasoning and emotional intelligence

Both techniques are training-free: SABER orthogonally projects refusal directions out of the weight matrices, while RYS duplicates layers without modifying any weights.


SABER: Refusal Ablation

Key Results

| Metric | Baseline | SABER-Refined | Delta |
| --- | --- | --- | --- |
| Refusal Rate | 100% | 0% | -100% |
| Perplexity | 3.5 | 3.5 | +0.6% |
| Directions Ablated | — | 125 (across 25 layers) | — |

The refusal circuit is cleanly separated from capability: removing it costs only +0.6% perplexity, with no measurable capability degradation.

How SABER Works

*(Figure: SABER pipeline diagram)*

SABER identifies and ablates the refusal circuit through a five-stage pipeline:

Stage 1 — Probing: Extract activation profiles from both harmful and harmless inputs across all transformer layers.

Stage 2 — Spectral Analysis: Decompose activation differences into individual refusal directions, each scored by how strongly they separate harmful from harmless representations.

Stage 3 — Entanglement Quantification: Measure the overlap between each refusal direction and the model's capability subspace (reasoning, knowledge, code, etc.) to avoid collateral damage.

Stage 4 — Targeted Ablation: Remove only the pure-refusal components, with strength proportional to their purity (how little they overlap with capability).

Stage 5 — Iterative Refinement: Re-probe after each ablation pass to catch hydra effects (dormant refusal features that activate when primary ones are removed).

Key differentiator from prior work: SABER explicitly measures and respects the entanglement between refusal and capability representations. Directions that are heavily entangled with capability are either skipped or ablated at reduced strength.
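The core of stages 2 and 4 can be sketched in a few lines. This is an illustrative reconstruction, not the released pipeline: the function names, the SVD-based direction extraction, and the purity heuristic are all assumptions based on the description above.

```python
# Illustrative sketch of SABER stages 2 and 4: extract candidate refusal
# directions, score their capability entanglement, and project them out of
# a weight matrix with purity-scaled strength. All names are hypothetical.
import numpy as np

def refusal_directions(harmful_acts, harmless_acts, top_k=5):
    """SVD of paired activation differences yields candidate refusal
    directions, each scored by how strongly it separates the two sets."""
    diff = harmful_acts - harmless_acts              # (n_pairs, d_model)
    u, s, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:top_k], s[:top_k]                     # directions, separability

def purity(direction, capability_basis):
    """1 minus the fraction of a unit direction lying in the capability
    subspace (stage 3's entanglement measure, inverted)."""
    overlap = capability_basis @ direction           # coords in capability basis
    return 1.0 - float(overlap @ overlap)

def ablate(W, direction, alpha):
    """Project `direction` out of weight matrix W with strength alpha,
    i.e. W @ (I - alpha * d d^T)."""
    d = direction / np.linalg.norm(direction)
    return W - alpha * np.outer(W @ d, d)
```

With `alpha=1.0` (the best config in the sweep below), the ablated matrix maps the refusal direction exactly to zero; entangled directions would instead get `alpha` scaled down by their purity, or be skipped entirely.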

*(Figure: direction purity vs separability)*

Sweep Results

*(Figure: SABER sweep comparison)*

Configuration search over global_top_k (number of top directions selected globally) and alpha_base (base ablation strength):

| Top-K | Alpha | Refusal | PPL | PPL Delta | Layers | Dirs Ablated |
| --- | --- | --- | --- | --- | --- | --- |
| 25 | 0.85 | 5% | 3.5 | +0.4% | 25 | 125 |
| 25 | 1.00 | 0% | 3.5 | +0.6% | 25 | 125 |
| 50 | 0.85 | 0% | 3.5 | +0.8% | 36 | 250 |
| 50 | 1.00 | 0% | 3.5 | +0.7% | 36 | 250 |
| 75 | 0.85 | 0% | 3.5 | +0.9% | 37 | 375 |
| 75 | 1.00 | 0% | 3.5 | +0.9% | 37 | 375 |

Best config: top_k=25, alpha=1.0 — achieves 0% refusal with zero meaningful PPL change, using the minimum number of directions.

*(Figure: refusal rate comparison)*

Ablation Convergence (Best Config)

*(Figure: ablation convergence across iterations)*

Capability degradation remains at 0.00% across all 5 iterations — the refusal directions are surgically removed with zero collateral damage.


RYS: Reasoning Layer Duplication

Method

RYS (Repeat Your Self) is a layer-duplication technique, discovered by David Noel Ng, that re-executes contiguous blocks of middle transformer layers so they run twice per forward pass. No weights are modified; the model simply traverses some layers a second time, giving it "another pass" through its core reasoning circuit.

For a model with N layers, a configuration (i, j) produces:

  • Layers 0 through j−1 run normally
  • Then layers i through j−1 are re-executed (looped back)
  • Remaining layers j through N−1 run normally
  • Layers i through j−1 execute twice per inference pass
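The execution order above can be sketched directly; the function name is illustrative, and `layers` here are just indices standing in for the transformer blocks:

```python
# Minimal sketch of the RYS forward-pass layer order for a config (i, j):
# run layers 0..j-1, loop back through i..j-1, then finish j..N-1.
def rys_layer_order(n_layers: int, i: int, j: int) -> list[int]:
    """Layer indices visited per forward pass under RYS config (i, j)."""
    return list(range(0, j)) + list(range(i, j)) + list(range(j, n_layers))

# Example: 64-layer model with the L config (30, 34) -> 68 layer executions,
# with layers 30..33 each appearing twice.
order = rys_layer_order(64, 30, 34)
assert len(order) == 68
assert order[30:38] == [30, 31, 32, 33, 30, 31, 32, 33]
```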

This exploits the functional neuroanatomy of transformers:

  • Early layers (0–5): Input encoding — duplication hurts
  • Middle layers (~10–50): Reasoning circuits in format-agnostic space — duplication helps
  • Late layers (~55–64): Output decoding — duplication degrades

Pareto-Optimal Configs for Qwen3.5-27B

Based on the full sweep of Qwen3.5-27B — 4,643 measured configurations, XGBoost surrogate over 430K+ candidates, and final validation on Math120 + EQ140 — the Pareto frontier lies in layers 26–34 of the reasoning circuit.

Important for GGUF/llama.cpp: Qwen3.5-27B is a hybrid Mamba/SSM + Attention architecture with a strict 4-layer repeating pattern (3 SSM + 1 ATTN). Layer duplication blocks must be a multiple of 4 layers to preserve this pattern, otherwise llama.cpp fails to load the model. The original Pareto configs from the blog (which used ExLlamaV3) have been adapted to the nearest valid 4-aligned configs:

| Variant | Config | Duplicated Layers | Extra Layers | Overhead | Nearest Pareto Config |
| --- | --- | --- | --- | --- | --- |
| S | (28,32) | 28–31 | +4 | +6.25% | ≈ (30,34) |
| M | (31,35) | 31–34 | +4 | +6.25% | ≈ (31,34) |
| L | (30,34) | 30–33 | +4 | +6.25% | ≈ (30,35) |
| XL | (26,34) | 26–33 | +8 | +12.50% | = (26,34) ✓ |

Critical finding: the (26,34) XL config is the only original Pareto point that is natively 4-aligned. The S/M/L variants use the nearest valid 4-layer blocks that cover the same reasoning region. The EQ delta barely moves across all sizes (+0.095 to +0.101), so even the smallest valid config delivers most of the benefit.
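The 4-alignment constraint reduces to a simple check. Note that (31,35) appears in the table above even though 31 is not a multiple of 4, which suggests only the block *length* needs to be 4-aligned: a reading consistent with the fact that inserting a block whose length is a multiple of the period keeps every layer at a position congruent to its original index mod 4. The function below encodes that assumption:

```python
# Sketch of the GGUF/llama.cpp pattern-safety check described above.
# Assumption: only the duplicated block's length must be a multiple of the
# 4-layer (3 SSM + 1 ATTN) period, since a length-aligned insertion keeps
# every subsequent layer at a position congruent to its index mod 4.
def is_pattern_safe(i: int, j: int, period: int = 4) -> bool:
    """True if duplicating layers i..j-1 preserves the repeating pattern."""
    return (j - i) % period == 0

# All four shipped configs pass; an odd-length block would not.
assert all(is_pattern_safe(i, j) for i, j in [(28, 32), (31, 35), (30, 34), (26, 34)])
assert not is_pattern_safe(30, 33)
```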

Reference: RYS Scores on Qwen3.5-27B

Probe scores from XpressAI/Qwen3.5-27B-RYS-UD-Q4_K_XL-GGUF (RYS-30-34 config, identical base architecture):

| Probe | Base (64 layers) | RYS 30-33 (68 layers) | RYS 34-37 (68 layers) |
| --- | --- | --- | --- |
| Math | 0.375 | 0.438 | 0.375 |
| EQ | 11.5 | 29.5 | 39.4 |
| Reasoning | 0.000 | 0.353 | 0.000 |
| Logic | 0.00 | 1.00 | 0.00 |

Reference: BFCLv4 Function Calling (RYS vs Baseline vs Frontier Models)

From the XpressAI RYS-30-34 evaluation on BFCLv4:

| Task | RYS-30-34 | Qwen3.5-27B Base | Δ |
| --- | --- | --- | --- |
| parallel | 95.00% | 93.00% | +2.00% |
| parallel_multiple | 91.50% | 76.00% | +15.50% |
| simple_javascript | 72.00% | 66.00% | +6.00% |
| live_relevance | 81.25% | 68.75% | +12.50% |
| multi_turn_base | 74.50% | 70.50% | +4.00% |
| multi_turn_long_context | 67.50% | 59.00% | +8.50% |

7 of 13 benchmarks improved, with large gains on parallel function calling and live relevance.


Available Variants

| File | RYS Config | Layers | Size |
| --- | --- | --- | --- |
| Ornstein-27B-SABER-Q4_K_M.gguf | — (SABER only) | 64 | 16.5 GB |
| Ornstein-27B-SABER-RYS-S-Q4_K_M.gguf | (28,32) | 68 | ~17.5 GB |
| Ornstein-27B-SABER-RYS-M-Q4_K_M.gguf | (31,35) | 68 | ~17.5 GB |
| Ornstein-27B-SABER-RYS-L-Q4_K_M.gguf | (30,34) | 68 | ~17.5 GB |
| Ornstein-27B-SABER-RYS-XL-Q4_K_M.gguf | (26,34) | 72 | ~18.6 GB |

Usage

```shell
# With llama.cpp (recommended: RYS-L for best balance)
./llama-server -m Ornstein-27B-SABER-RYS-L-Q4_K_M.gguf \
  --host 0.0.0.0 --port 8080 --n-gpu-layers 99 \
  --ctx-size 131072 --flash-attn on --jinja \
  -ctk q4_0 -ctv q4_0
```

Recommended: Start with RYS-L (config (30,34), layers 30–33 duplicated) for the best balance of reasoning improvement and overhead. Use RYS-S if you're VRAM-constrained.
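Once the server is up, it can be queried through its OpenAI-compatible chat endpoint. A minimal stdlib-only client sketch, assuming the host/port from the command above (model name and payload fields beyond `messages`/`temperature` are generic, not specific to this model):

```python
# Minimal client for llama-server's OpenAI-compatible endpoint.
import json
from urllib import request

def build_chat_request(prompt: str, temperature: float = 0.7) -> dict:
    """Request body for /v1/chat/completions."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def chat(prompt: str, host: str = "http://localhost:8080") -> str:
    """Send one user turn and return the assistant's reply text."""
    req = request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```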


Complementary Design

SABER and RYS target fundamentally different aspects of the model:

| | SABER | RYS |
| --- | --- | --- |
| Target | Refusal circuit | Reasoning circuit |
| Mechanism | Direction ablation | Layer duplication |
| Modifies weights | Yes (orthogonal projections) | No (virtual copies) |
| VRAM cost | Negligible | Extra KV cache + compute |
| Effect | Removes refusals | Improves reasoning/EQ |
| Risk | Capability entanglement | Junction discontinuity |

Both are applied to the same base architecture (Qwen3.5-27B) and are architecturally compatible — SABER cleans the refusal subspace, RYS amplifies the reasoning subspace.


Capability Evaluation

Perplexity was evaluated on a diverse 100-prompt battery spanning five categories:

  • Arithmetic (20): multi-step calculation, algebra, word problems
  • Logic (20): syllogisms, conditional reasoning, puzzle solving
  • Code (20): function implementation, debugging, execution tracing
  • Instruction Following (20): constrained formatting, multi-step instructions
  • Factual Recall (20): geography, history, science, general knowledge

This diverse evaluation ensures the entanglement analysis captures capability across all reasoning modalities, not just a narrow slice.
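For reference, the perplexity figures reported in the tables above reduce to a simple formula over per-token negative log-likelihoods; the harness below is an assumption, not the released evaluation code:

```python
# Perplexity is exp of the mean per-token negative log-likelihood (in nats),
# averaged here over one prompt's tokens.
import math

def perplexity(nll_per_token: list[float]) -> float:
    """exp(mean NLL) over a sequence of token losses."""
    return math.exp(sum(nll_per_token) / len(nll_per_token))

# e.g. a model averaging ~1.253 nats/token scores PPL ~3.5, matching the tables
assert abs(perplexity([1.2528] * 10) - 3.5) < 0.01
```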


Intended Use

This model is released for research purposes. It demonstrates that safety refusal can be surgically removed from a 27B multimodal model without degrading its capabilities, and that reasoning can be further enhanced through layer duplication — a finding with implications for both AI safety research and alignment.

Warning

⚠️ This model will comply with any request, including harmful ones. It is intended solely for research into alignment, safety, and model behavior.


Acknowledgments

The RYS (Repeat Your Self) layer-duplication method was discovered and developed by David Noel Ng (@dnhkng). The Pareto-optimal configurations for Qwen3.5-27B, the Math/EQ probes, the XGBoost surrogate pipeline, and the beam search methodology are all from his work. The GGUF surgery tools used to create these models are from alainnothere/llm-circuit-finder, an open-source (MIT) implementation of the RYS technique for llama.cpp.

If you use these models, please cite David Noel Ng's work:

Ng, David Noel. "LLM Neuroanatomy: How I Topped the Leaderboard Without Changing a Single Weight." dnhkng.github.io/posts/rys

Ng, David Noel. "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language." dnhkng.github.io/posts/rys-ii

The SABER refusal-ablation method is original to this model.

