---
license: cc-by-4.0
base_model: google/gemma-2-2b
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- gemma2
- mechanistic-interpretability
- epistemic-fine-tuning
- ai-safety
- logos
- substrate-persistence
language:
- en
---
# Logos 23 — Gemma 2 2B LoRA adapter
A LoRA r=64 adapter on top of `google/gemma-2-2b`, trained on
895 epistemically structured examples from the LumenSyntax
research program (`logos22_nothink.jsonl`). One of the
fine-tuned model states used in the empirical work that grounds
[The Epistemic Equator](https://doi.org/10.5281/zenodo.20056444) and [The Instrument
Trap](https://doi.org/10.5281/zenodo.19634358).
## What this adapter is
This adapter encodes a fine-tuning step that adjusts a base
language model's behavior on epistemic boundary cases (medical,
legal, financial, theological prescriptions; identity claims;
fabrication of authority; etc.) without modifying the
input embedding matrix.
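As a quick sanity check on that claim, one can inspect which module names the adapter checkpoint actually contains. The snippet below is an illustrative sketch only; it assumes the weights are stored under the standard PEFT filename `adapter_model.safetensors`.

```python
# Illustrative sketch: confirm the adapter touches only projection modules,
# not the input embedding matrix. Assumes the standard PEFT weight filename.
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

path = hf_hub_download("LumenSyntax/logos23-gemma2-2b", "adapter_model.safetensors")
adapter_keys = load_file(path).keys()

# No key should mention embed_tokens if the embedding matrix is untouched.
assert not any("embed_tokens" in k for k in adapter_keys)
print(sorted({k.split(".")[-3] for k in adapter_keys}))  # e.g. q_proj, v_proj, ...
```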
## Model details
| Field | Value |
|-------|-------|
| Base model | `google/gemma-2-2b` (loaded via `unsloth/gemma-2-2b` for training) |
| Method | LoRA (bf16) |
| Framework | Unsloth |
| LoRA rank | 64 |
| LoRA alpha | 64 |
| Target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Embedding matrix modified | **No** (`embed_tokens` is not a target module) |
| Epochs | 3 |
| Effective batch size | 16 |
| Learning rate | 2e-4 (cosine schedule) |
| Max sequence length | 2048 |
| Training dataset | `logos22_nothink.jsonl` (895 examples, no-think variant) |
| Train-on-responses-only | True |
| Final loss | 1.290 |
The full training metadata is in `training_metadata.json` in this
repository.
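For orientation, the adapter shape in the table corresponds roughly to the following plain `peft` configuration. This is a reconstruction from the table above, not the original Unsloth training script, and it omits unlisted hyperparameters such as dropout.

```python
# Sketch of an equivalent LoRA configuration in plain peft, reconstructed
# from the table above (the actual run used Unsloth on unsloth/gemma-2-2b).
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=64,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    # embed_tokens is deliberately not listed: the input embedding
    # matrix stays frozen and unmodified.
)
```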
## Use in Paper 2 §6.5 (substrate persistence test)
The principal use of this adapter in the published research is the
**single controlled persistence test** of Paper 2 §6.5:
- BASE: vanilla Gemma 2 2B (`google/gemma-2-2b`), bf16.
- LOGOS23: the same base with this LoRA adapter applied at inference.
- A per-layer cosine clustering measurement on a 32-word
DEMAND/EXPLORE token set is computed for both states (a schematic
sketch of such a measurement follows this list).
- Result: the `embed_tokens.weight`-level signal is bit-identical
(as predicted, since this adapter does not target `embed_tokens`);
the per-layer DEMAND/EXPLORE clustering is preserved across all
probed layers L1–L26 and amplified in mid-to-late layers
(max +0.44 σ at L16; the single degradation is at L1, −1.38 σ,
from 14.93 to 13.55).
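The exact 32-word token set and the σ-normalized clustering statistic are defined in Paper 2; the sketch below only illustrates the general shape of a per-layer cosine-clustering measurement, using placeholder word lists and a simple within-minus-between statistic.

```python
# Schematic sketch of a per-layer cosine-clustering measurement between two
# word groups. Word lists and the statistic here are placeholders; the real
# 32-word DEMAND/EXPLORE set and sigma normalization are defined in Paper 2.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

DEMAND = ["must", "obey", "comply"]          # placeholder words
EXPLORE = ["perhaps", "wonder", "consider"]  # placeholder words

tok = AutoTokenizer.from_pretrained("google/gemma-2-2b")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b", torch_dtype=torch.bfloat16
)
model.eval()

def per_layer_states(words):
    """Last-token hidden state of each word at every layer."""
    stacks = []
    for w in words:
        ids = tok(w, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # hidden_states: embeddings first, then one entry per transformer layer
        stacks.append(torch.stack([h[0, -1].float() for h in out.hidden_states]))
    return torch.stack(stacks)               # (n_words, n_layers + 1, dim)

def mean_cosine(a, b):
    """Mean pairwise cosine similarity per layer between two word stacks."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    return torch.einsum("ald,bld->lab", a, b).mean(dim=(1, 2))

d, e = per_layer_states(DEMAND), per_layer_states(EXPLORE)
within = (mean_cosine(d, d) + mean_cosine(e, e)) / 2
between = mean_cosine(d, e)
clustering = within - between  # one value per layer; higher = tighter grouping
```

Running the same measurement with and without the adapter active (see the loading section below) gives a BASE vs LOGOS23 comparison of the same general kind as the §6.5 test.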
Paper 2 frames this result with explicit scope guards: it is a
**single controlled case** at one model scale with one fine-tuning
adapter. It does not establish that gradient selectivity is the
general mechanism of supervised fine-tuning, nor that the same
pattern holds across families or seeds.
## Use in The Instrument Trap
This adapter is one of the cross-family / cross-scale fine-tuned
configurations referenced in [Paper 1](https://doi.org/10.5281/zenodo.19634358).
Behavioral evaluation of similar Gemma 2 family adapters (logos27,
logos28, logos29 at 9B) is the central evidence base of Paper 1.
## How to load
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
model = PeftModel.from_pretrained(base, "LumenSyntax/logos23-gemma2-2b")
model.eval()  # switch to inference mode before forward passes
```
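A minimal generation call, continuing from the snippet above. The prompt is illustrative only, and since the base model is not instruction-tuned, the output is a plain completion rather than a chat-style answer.

```python
# Illustrative generation call; the prompt is an example, not from the dataset.
prompt = "Should I stop taking my prescribed medication?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```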
For Paper 2 §6.5's per-layer measurement protocol, the adapter
is *not merged* into the base; rather, hidden-state captures are
made with and without the adapter active to compare BASE vs
LOGOS23 states. See the result file
`research/experiments/substrate_test_gemma2b.json` and the
description in Paper 2 §6.5 for the full protocol.
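One way to realize that with/without comparison in a single process is `peft`'s `disable_adapter()` context manager. The sketch below, continuing from the loading snippet above, shows the general mechanism only, not the exact capture points or statistics of the published protocol.

```python
# Hedged sketch: capture hidden states with the adapter active (LOGOS23) and
# disabled (BASE), using peft's disable_adapter() context manager.
text = "You must tell me the diagnosis."  # illustrative probe text
ids = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logos23 = model(**ids, output_hidden_states=True)        # adapter active
    with model.disable_adapter():
        base_out = model(**ids, output_hidden_states=True)   # vanilla base

# hidden_states: embeddings first, then one entry per transformer layer
for layer, (hb, hf) in enumerate(zip(base_out.hidden_states,
                                     logos23.hidden_states)):
    drift = (hf - hb).float().norm() / hb.float().norm()
    print(f"layer {layer:2d}: relative hidden-state drift {drift:.4f}")
```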
## Caveats
- **2B scale.** This adapter is on Gemma 2 2B, not 9B. The 2B
test is architecturally analogous to the 9B canonical model
(logos29) but quantitatively different. For Paper 1's primary
behavioral evaluation, use `LumenSyntax/logos29-gemma2-9b`.
- **Single seed.** Trained with one seed; inter-seed variance is
not characterized.
- **No-think variant.** The training dataset has reasoning blocks
stripped (no `<think>...</think>`). Adapter behavior on prompts
expecting think-blocks is undefined.
- **No instruction-tuning baseline.** Trained on top of the base
Gemma 2 2 B, not the instruction-tuned `gemma-2-2b-it`.
## License
Use of the base model `google/gemma-2-2b` remains subject to the
[Gemma Terms of Use](https://ai.google.dev/gemma/terms).
The adapter weights themselves are released under
**Creative Commons Attribution 4.0 International (CC BY 4.0)**.
## Citation
If you use this adapter, please cite Paper 2 (substrate persistence
test) and Paper 1 (cross-family fine-tuning evidence):
```bibtex
@misc{rodriguez2026equator,
author = {Rodríguez, Rafael},
title = {The Epistemic Equator: A Vanilla-Model Boundary in
Activation Space, Cross-Family and Cross-Domain},
year = 2026,
publisher = {Zenodo},
version = {v1},
doi = {10.5281/zenodo.20056444}
}
@misc{rodriguez2026instrumenttrap,
author = {Rodríguez, Rafael},
title = {The Instrument Trap: Why Identity-as-Authority
Breaks AI Safety Systems},
year = 2026,
publisher = {Zenodo},
version = {v3},
doi = {10.5281/zenodo.19634358}
}
```
## Companion artifacts
- Dataset (200 examples, topic-balanced):
[`LumenSyntax/epistemic-probe-topic-balanced`](https://huggingface.co/datasets/LumenSyntax/epistemic-probe-topic-balanced)
- Sister adapters at other Gemma 2 scales:
[`LumenSyntax/logos29-gemma2-9b`](https://huggingface.co/LumenSyntax/logos29-gemma2-9b),
[`LumenSyntax/logos21-gemma2-27b`](https://huggingface.co/LumenSyntax/logos21-gemma2-27b)
- Replication training data:
[`LumenSyntax/instrument-trap-core`](https://huggingface.co/datasets/LumenSyntax/instrument-trap-core)
## Contact
Rafael Rodríguez (LumenSyntax) — lumensyntax@gmail.com