Text Generation
PEFT
Safetensors
English
gemma
gemma2
lora
qlora
ai-safety
alignment
epistemology
instrument-trap
fine-tuned
scale-maximum
conversational
Instructions to use LumenSyntax/logos21-gemma2-27b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use LumenSyntax/logos21-gemma2-27b with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("unsloth/gemma-2-27b-it-bnb-4bit") model = PeftModel.from_pretrained(base_model, "LumenSyntax/logos21-gemma2-27b") - Notebooks
- Google Colab
- Kaggle
File size: 6,197 Bytes
cc42b81 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 | ---
base_model: google/gemma-2-27b-it
library_name: peft
pipeline_tag: text-generation
license: gemma
language:
- en
tags:
- gemma
- gemma2
- lora
- qlora
- peft
- ai-safety
- alignment
- epistemology
- instrument-trap
- fine-tuned
- scale-maximum
datasets:
- LumenSyntax/instrument-trap-core
---
# Logos 21 — Gemma-27B-FT (v3 scale maximum)
**27B scale evidence model for "The Instrument Trap" v3 (Rodriguez, 2026).**
This is the largest fine-tuned model in the v3 evidence stack, and
achieves the highest behavioral pass rate measured across any tested
configuration: **98.7% on manual review of 300 stratified responses,
0% collapse, 0% novel external fabrication**. It demonstrates that
the structural-fine-tuning pattern scales smoothly from 1B through
27B on the Gemma family.
- **Paper (v3):** forthcoming
- **Paper (v2):** [DOI 10.5281/zenodo.18716474](https://doi.org/10.5281/zenodo.18716474)
- **Training dataset:** [LumenSyntax/instrument-trap-core](https://huggingface.co/datasets/LumenSyntax/instrument-trap-core) variant (see Training Details)
- **Base model:** [google/gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it)
## Why this model matters for v3
1. **Scale extension.** The same structural-fine-tuning pattern that
installs the behavioral arc in a 1B model (82.3%) also installs it
in a 27B model (98.7%), with monotonic improvement. This argues
against "it only works on small models" criticism.
2. **Automatic-evaluator floor, not ceiling.** The automated semantic
evaluator (Claude Haiku) scored this model at 96.3% — 2.4pp below
the manual review. Analysis showed 7 of the 11 "failures" were
evaluator misclassifications: the model's corrections are too
sophisticated for substring matching. This is evidence that
automated evaluation underestimates sophisticated epistemological
behavior, and that manual review is necessary at scale.
3. **0% collapse.** Zero identity collapse across 300 adversarial,
self-referential, and boundary-testing prompts.
## Evaluation results
**N=300 stratified benchmark, naked (no system prompt), 4-bit
quantized inference:**
| Metric | Automated | Manual review |
|--------|---:|---:|
| Behavioral pass | 96.3% | **98.7%** |
| Collapse rate | 0.0% | 0.0% |
| External fabrication | 0.0% | 0.0% |
| Auto-evaluator false negatives | — | **7 of 11 "failures"** |
**True failure breakdown** (after manual review):
- 3 MYSTERY auditor-mode bleeds (model classified when user expected
engagement)
- 1 borderline ILLICIT_GAP edge case
**Comparison with 9B**: 9B (logos29) scores 96.7% behavioral; 27B
(this model) scores 98.7% after manual review. The 2pp edge is real
but small, and the 27B model continues to show the same auditor-mode
bleed that 9B shows at lower rates. **Scale improves precision
monotonically** but does not eliminate the auditor-mode artifact.
## Training details
Hyperparameters from `training_metadata.json`:
| Parameter | Value |
|-----------|-------|
| Method | QLoRA (4-bit NF4 + LoRA) |
| Framework | unsloth |
| LoRA rank | **64** (higher than 9B's 16) |
| LoRA alpha | 64 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Epochs | 3 |
| Effective batch size | 8 |
| Learning rate | 2e-4, cosine scheduler |
| Max sequence length | 2048 |
| Train on responses only | true |
| Dataset | `logos_gemma2_27b_nothink.jsonl` (860 examples) |
| Dataset composition | 635 core + 45 meta-pattern + 155 domain transfer + 25 K-A gap |
| Final loss | 0.8027 |
| Runtime | ~22 min on A100 80GB |
**Note on LoRA rank:** 27B used rank 64 rather than the 16 used for
9B. This was not scientifically motivated — it was an accident of
the training queue. Subsequent experiments (Logos 28 r=16 vs r=64
at 9B) showed rank 16 performs slightly better at 9B. For 27B
reproduction, both ranks should be tested, but the r=64 adapter
in this repository is the published v3 evidence.
**Note on dataset:** The 27B model was trained on a variant of the
core dataset with 25 additional K-A Gap examples (total 860 ex, not
895). These are a subset of what became `instrument-trap-core`. For
exact reproduction, contact the authors for the specific variant;
`instrument-trap-core` (895 ex) is functionally equivalent for most
purposes.
## How to use
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
BASE = "google/gemma-2-27b-it"
ADAPTER = "LumenSyntax/logos21-gemma2-27b"
# 4-bit quantization for inference (matches training precision)
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
BASE,
quantization_config=bnb_config,
device_map="auto",
)
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()
```
VRAM: ~18 GB in 4-bit. Full precision requires an H100 80GB or
two A100s with device_map splitting.
## Intended use
Same as `logos29-gemma2-9b`. The 27B model is provided primarily as
**scale evidence** for the paper. For production or downstream
research, the 9B model is cheaper to run at negligible capability
loss.
## Limitations
1. **Auditor-mode bleed remains at 27B.** 3 of the 4 true failures
are the same failure mode observed at 9B.
2. **ARC regression.** 4-bit quantized inference shows a ~5 pp
decrease on ARC reasoning benchmarks relative to base. MMLU and
TruthfulQA remain within noise. This is a known "reasoning tax"
of the fine-tuning and should be disclosed to downstream users.
3. **The r=64 choice was not optimized.** See Training Details.
4. **The model was evaluated under 4-bit quantized inference, not
bf16.** bf16 results may differ slightly.
## License
Adapter license: Gemma Terms of Use.
## Citation
Same as logos29:
```bibtex
@misc{rodriguez2026instrument,
title={The Instrument Trap: Why Identity-as-Authority Breaks AI Safety Systems},
author={Rodriguez, Rafael},
year={2026},
doi={10.5281/zenodo.18716474},
note={Preprint}
}
```
---
*Model card version 1 — 2026-04-13*
|