---
license: cc-by-4.0
base_model: google/gemma-2-2b
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- gemma2
- mechanistic-interpretability
- epistemic-fine-tuning
- ai-safety
- logos
- substrate-persistence
language:
- en
---

# Logos 23: Gemma 2 2B LoRA adapter

A LoRA r=64 adapter on top of `google/gemma-2-2b`, trained on
895 epistemically structured examples from the LumenSyntax
research program (`logos22_nothink.jsonl`). It is one of the
fine-tuned model states used in the empirical work that grounds
[The Epistemic Equator](https://doi.org/10.5281/zenodo.20056444) and [The Instrument
Trap](https://doi.org/10.5281/zenodo.19634358).

## What this adapter is

This adapter encodes a fine-tuning update that adjusts the base
language model's behavior on epistemic boundary cases (medical,
legal, financial, and theological prescriptions; identity claims;
fabrication of authority; and similar cases) without modifying
the input embedding matrix.

## Model details

| Field | Value |
|-------|-------|
| Base model | `google/gemma-2-2b` (loaded via `unsloth/gemma-2-2b` for training) |
| Method | LoRA (bf16) |
| Framework | Unsloth |
| LoRA rank | 64 |
| LoRA alpha | 64 |
| Target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Embedding matrix modified | **No** (`embed_tokens` is not a target module) |
| Epochs | 3 |
| Effective batch size | 16 |
| Learning rate | 2e-4 (cosine schedule) |
| Max sequence length | 2048 |
| Training dataset | `logos22_nothink.jsonl` (895 examples, no-think variant) |
| Train-on-responses-only | True |
| Final loss | 1.290 |

The full training metadata is in `training_metadata.json` in this
repository.
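For replication, the table maps onto a PEFT configuration along the
following lines. This is a minimal sketch assuming standard PEFT field
names; the actual run used Unsloth's wrapper around PEFT, and the
dropout and bias values below are assumptions, not hyperparameters
recorded in the table.

```python
from peft import LoraConfig

# Sketch of a LoraConfig matching the table above. Dropout and bias
# are assumptions; the actual run went through Unsloth's wrapper.
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,   # assumed
    bias="none",        # assumed
    task_type="CAUSAL_LM",
)
```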

## Use in Paper 2 §6.5 (substrate persistence test)

The principal use of this adapter in the published research is the
**single controlled persistence test** of Paper 2 §6.5:

- BASE: vanilla Gemma 2 2B, `google/gemma-2-2b`, bf16.
- LOGOS23: the same base + this LoRA adapter applied at inference.
- A per-layer cosine clustering measurement on a 32-word
  DEMAND/EXPLORE token set is computed for both states (a minimal
  sketch of this kind of measurement follows the list).
- Result: the `embed_tokens.weight`-level signal is bit-identical
  (predicted: this adapter does not target `embed_tokens`); the
  per-layer DEMAND/EXPLORE clustering is preserved across all
  probed layers L1-L26 and amplified in mid-to-late layers
  (max +0.44 σ at L16; the single degradation is at L1: −1.38 σ,
  from 14.93 to 13.55).
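
Paper 2 §6.5 specifies the exact token sets and statistic. The sketch
below only illustrates the general shape of such a per-layer cosine
clustering measurement, assuming single-token probe words, a loaded
`model` and `tokenizer` (see "How to load" below), and a
within-group-minus-between-group mean-cosine score. The word lists and
helper names are illustrative, not the published protocol.

```python
import torch
import torch.nn.functional as F

# Illustrative word lists only; the published 32-word DEMAND/EXPLORE
# sets are defined in Paper 2, not reproduced here.
DEMAND = ["must", "obey", "immediately", "comply"]
EXPLORE = ["perhaps", "wonder", "consider", "imagine"]

def last_token_states(words, model, tokenizer):
    """Return a (words, layers, hidden) tensor of last-token states."""
    per_word = []
    for word in words:
        ids = tokenizer(word, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # Stack every hidden-state layer, keeping the final position.
        per_word.append(torch.stack([h[0, -1] for h in out.hidden_states]))
    return torch.stack(per_word)

def per_layer_clustering(demand, explore):
    """Within-group minus between-group mean cosine, one score per layer."""
    d = F.normalize(demand.transpose(0, 1).float(), dim=-1)   # (layers, words, hidden)
    e = F.normalize(explore.transpose(0, 1).float(), dim=-1)
    within = ((d @ d.transpose(1, 2)).mean(dim=(1, 2)) +
              (e @ e.transpose(1, 2)).mean(dim=(1, 2))) / 2   # self-pairs included
    between = (d @ e.transpose(1, 2)).mean(dim=(1, 2))
    return within - between
```

Running `per_layer_clustering` on states captured from the BASE and
LOGOS23 configurations and comparing the two per-layer curves is the
general shape of the §6.5 comparison.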

Paper 2 frames this result with explicit scope guards: it is a
**single controlled case** at one model scale with one fine-tuning
adapter. It does not establish that gradient selectivity is the
general mechanism of supervised fine-tuning, nor that the same
pattern holds across families or seeds.

## Use in The Instrument Trap

This adapter is one of the cross-family / cross-scale fine-tuned
configurations referenced in [Paper 1](https://doi.org/10.5281/zenodo.19634358).
Behavioral evaluation of similar Gemma 2 family adapters (logos27,
logos28, logos29 at 9B) is the central evidence base of Paper 1.

## How to load

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model in bf16, matching the §6.5 setup.
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")

# Attach the LoRA adapter (unmerged) on top of the frozen base weights.
model = PeftModel.from_pretrained(base, "LumenSyntax/logos23-gemma2-2b")
model.eval()  # inference mode before any forward passes
```
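
A quick smoke test (the prompt is a placeholder, not a probe from the
published evaluation):

```python
inputs = tokenizer("Should I stop taking my medication?", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs.to(model.device), max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```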

For Paper 2 §6.5's per-layer measurement protocol, the adapter
is *not merged* into the base; rather, hidden-state captures are
made with and without the adapter active to compare BASE vs
LOGOS23 states. See the result file
`research/experiments/substrate_test_gemma2b.json` and the
description in Paper 2 §6.5 for the full protocol.
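
Continuing from the loading snippet above, one way to realize that
with-and-without capture in a single process is PEFT's
`disable_adapter()` context manager. This is a sketch of the pattern,
not the published harness; the prompt is a placeholder.

```python
import torch

prompt = "State the diagnosis now."  # placeholder probe text
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    logos23_out = model(**inputs, output_hidden_states=True)   # adapter active
    with model.disable_adapter():                              # adapter bypassed
        base_out = model(**inputs, output_hidden_states=True)  # BASE state

# hidden_states[0] is the embedding output; it should match bit-for-bit,
# since this adapter does not target embed_tokens.
assert torch.equal(logos23_out.hidden_states[0], base_out.hidden_states[0])
```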

## Caveats

- **2B scale.** This adapter sits on Gemma 2 2B, not 9B. The 2B
  test is architecturally analogous to the 9B canonical model
  (logos29) but quantitatively different. For Paper 1's primary
  behavioral evaluation, use `LumenSyntax/logos29-gemma2-9b`.
- **Single seed.** Trained with one seed; inter-seed variance is
  not characterized.
- **No-think variant.** The training dataset has reasoning blocks
  stripped (no `<think>...</think>`). Adapter behavior on prompts
  expecting think-blocks is undefined.
- **No instruction-tuning baseline.** Trained on top of the base
  Gemma 2 2B, not the instruction-tuned `gemma-2-2b-it`.

## License

Use of the base model `google/gemma-2-2b` is governed by the
[Gemma Terms of Use](https://ai.google.dev/gemma/terms). The
adapter weights themselves are released under
**Creative Commons Attribution 4.0 International (CC BY 4.0)**.

## Citation

If you use this adapter, please cite Paper 2 (substrate persistence
test) and Paper 1 (cross-family fine-tuning evidence):

```bibtex
@misc{rodriguez2026equator,
  author       = {Rodríguez, Rafael},
  title        = {The Epistemic Equator: A Vanilla-Model Boundary in
                  Activation Space, Cross-Family and Cross-Domain},
  year         = 2026,
  publisher    = {Zenodo},
  version      = {v1},
  doi          = {10.5281/zenodo.20056444}
}

@misc{rodriguez2026instrumenttrap,
  author       = {Rodríguez, Rafael},
  title        = {The Instrument Trap: Why Identity-as-Authority
                  Breaks AI Safety Systems},
  year         = 2026,
  publisher    = {Zenodo},
  version      = {v3},
  doi          = {10.5281/zenodo.19634358}
}
```

## Companion artifacts

- Dataset (200 examples, topic-balanced):
  [`LumenSyntax/epistemic-probe-topic-balanced`](https://huggingface.co/datasets/LumenSyntax/epistemic-probe-topic-balanced)
- Sister adapters at other Gemma 2 scales:
  [`LumenSyntax/logos29-gemma2-9b`](https://huggingface.co/LumenSyntax/logos29-gemma2-9b),
  [`LumenSyntax/logos21-gemma2-27b`](https://huggingface.co/LumenSyntax/logos21-gemma2-27b)
- Replication training data:
  [`LumenSyntax/instrument-trap-core`](https://huggingface.co/datasets/LumenSyntax/instrument-trap-core)

## Contact

Rafael Rodríguez (LumenSyntax) — lumensyntax@gmail.com