---
license: cc-by-4.0
base_model: google/gemma-2-2b
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- gemma2
- mechanistic-interpretability
- epistemic-fine-tuning
- ai-safety
- logos
- substrate-persistence
language:
- en
---

# Logos 23: Gemma 2 2B LoRA adapter

A LoRA r=64 adapter on top of `google/gemma-2-2b`, trained on
895 epistemically structured examples from the LumenSyntax
research program (`logos22_nothink.jsonl`). It is one of the
fine-tuned model states used in the empirical work that grounds
[The Epistemic Equator](https://doi.org/10.5281/zenodo.20056444) and [The Instrument
Trap](https://doi.org/10.5281/zenodo.19634358).

## What this adapter is

This adapter encodes a fine-tuning update that adjusts the base
language model's behavior on epistemic boundary cases (medical,
legal, financial, and theological prescriptions; identity claims;
fabrication of authority; and similar cases) without modifying
the input embedding matrix.

## Model details

| Field | Value |
|-------|-------|
| Base model | `google/gemma-2-2b` (loaded via `unsloth/gemma-2-2b` for training) |
| Method | LoRA (bf16) |
| Framework | Unsloth |
| LoRA rank | 64 |
| LoRA alpha | 64 |
| Target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Embedding matrix modified | **No** (`embed_tokens` is not a target module) |
| Epochs | 3 |
| Effective batch size | 16 |
| Learning rate | 2e-4 (cosine schedule) |
| Max sequence length | 2048 |
| Training dataset | `logos22_nothink.jsonl` (895 examples, no-think variant) |
| Train-on-responses-only | True |
| Final loss | 1.290 |

The full training metadata is in `training_metadata.json` in this
repository.
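For replication, the table maps onto a PEFT configuration along the
following lines. This is a minimal sketch assuming standard PEFT field
names; the actual run used Unsloth's wrapper around PEFT, and the
dropout and bias values below are assumptions, not hyperparameters
recorded in the table.

```python
from peft import LoraConfig

# Sketch of a LoraConfig matching the table above. Dropout and bias
# are assumptions; the actual run went through Unsloth's wrapper.
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,   # assumed
    bias="none",        # assumed
    task_type="CAUSAL_LM",
)
```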

## Use in Paper 2 §6.5 (substrate persistence test)

The principal use of this adapter in the published research is the
**single controlled persistence test** of Paper 2 §6.5:

- BASE: vanilla Gemma 2 2B, `google/gemma-2-2b`, bf16.
- LOGOS23: the same base + this LoRA adapter applied at inference.
- A per-layer cosine clustering measurement on a 32-word
  DEMAND/EXPLORE token set is computed for both states (a minimal
  sketch of this kind of measurement follows the list).
- Result: the `embed_tokens.weight`-level signal is bit-identical
  (predicted: this adapter does not target `embed_tokens`); the
  per-layer DEMAND/EXPLORE clustering is preserved across all
  probed layers L1-L26 and amplified in mid-to-late layers
  (max +0.44 σ at L16; the single degradation is at L1: −1.38 σ,
  from 14.93 to 13.55).
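
Paper 2 §6.5 specifies the exact token sets and statistic. The sketch
below only illustrates the general shape of such a per-layer cosine
clustering measurement, assuming single-token probe words, a loaded
`model` and `tokenizer` (see "How to load" below), and a
within-group-minus-between-group mean-cosine score. The word lists and
helper names are illustrative, not the published protocol.

```python
import torch
import torch.nn.functional as F

# Illustrative word lists only; the published 32-word DEMAND/EXPLORE
# sets are defined in Paper 2, not reproduced here.
DEMAND = ["must", "obey", "immediately", "comply"]
EXPLORE = ["perhaps", "wonder", "consider", "imagine"]

def last_token_states(words, model, tokenizer):
    """Return a (words, layers, hidden) tensor of last-token states."""
    per_word = []
    for word in words:
        ids = tokenizer(word, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # Stack every hidden-state layer, keeping the final position.
        per_word.append(torch.stack([h[0, -1] for h in out.hidden_states]))
    return torch.stack(per_word)

def per_layer_clustering(demand, explore):
    """Within-group minus between-group mean cosine, one score per layer."""
    d = F.normalize(demand.transpose(0, 1).float(), dim=-1)   # (layers, words, hidden)
    e = F.normalize(explore.transpose(0, 1).float(), dim=-1)
    within = ((d @ d.transpose(1, 2)).mean(dim=(1, 2)) +
              (e @ e.transpose(1, 2)).mean(dim=(1, 2))) / 2   # self-pairs included
    between = (d @ e.transpose(1, 2)).mean(dim=(1, 2))
    return within - between
```

Running `per_layer_clustering` on states captured from the BASE and
LOGOS23 configurations and comparing the two per-layer curves is the
general shape of the §6.5 comparison.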

Paper 2 frames this result with explicit scope guards: it is a
**single controlled case** at one model scale with one fine-tuning
adapter. It does not establish that gradient selectivity is the
general mechanism of supervised fine-tuning, nor that the same
pattern holds across families or seeds.

## Use in The Instrument Trap

This adapter is one of the cross-family / cross-scale fine-tuned
configurations referenced in [Paper 1](https://doi.org/10.5281/zenodo.19634358).
Behavioral evaluation of similar Gemma 2 family adapters (logos27,
logos28, logos29 at 9B) is the central evidence base of Paper 1.

## How to load

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model in bf16, matching the §6.5 setup.
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")

# Attach the LoRA adapter (unmerged) on top of the frozen base weights.
model = PeftModel.from_pretrained(base, "LumenSyntax/logos23-gemma2-2b")
model.eval()  # inference mode before any forward passes
```
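
A quick smoke test (the prompt is a placeholder, not a probe from the
published evaluation):

```python
inputs = tokenizer("Should I stop taking my medication?", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs.to(model.device), max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```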

For Paper 2 §6.5's per-layer measurement protocol, the adapter
is *not merged* into the base; rather, hidden-state captures are
made with and without the adapter active to compare BASE vs
LOGOS23 states. See the result file
`research/experiments/substrate_test_gemma2b.json` and the
description in Paper 2 §6.5 for the full protocol.
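
Continuing from the loading snippet above, one way to realize that
with-and-without capture in a single process is PEFT's
`disable_adapter()` context manager. This is a sketch of the pattern,
not the published harness; the prompt is a placeholder.

```python
import torch

prompt = "State the diagnosis now."  # placeholder probe text
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    logos23_out = model(**inputs, output_hidden_states=True)   # adapter active
    with model.disable_adapter():                              # adapter bypassed
        base_out = model(**inputs, output_hidden_states=True)  # BASE state

# hidden_states[0] is the embedding output; it should match bit-for-bit,
# since this adapter does not target embed_tokens.
assert torch.equal(logos23_out.hidden_states[0], base_out.hidden_states[0])
```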

## Caveats

- **2B scale.** This adapter sits on Gemma 2 2B, not 9B. The 2B
  test is architecturally analogous to the 9B canonical model
  (logos29) but quantitatively different. For Paper 1's primary
  behavioral evaluation, use `LumenSyntax/logos29-gemma2-9b`.
- **Single seed.** Trained with one seed; inter-seed variance is
  not characterized.
- **No-think variant.** The training dataset has reasoning blocks
  stripped (no `<think>...</think>`). Adapter behavior on prompts
  expecting think-blocks is undefined.
- **No instruction-tuning baseline.** Trained on top of the base
  Gemma 2 2B, not the instruction-tuned `gemma-2-2b-it`.

## License

Use of the base model `google/gemma-2-2b` is governed by the
[Gemma Terms of Use](https://ai.google.dev/gemma/terms). The
adapter weights themselves are released under
**Creative Commons Attribution 4.0 International (CC BY 4.0)**.

## Citation

If you use this adapter, please cite Paper 2 (substrate persistence
test) and Paper 1 (cross-family fine-tuning evidence):

```bibtex
@misc{rodriguez2026equator,
  author       = {Rodríguez, Rafael},
  title        = {The Epistemic Equator: A Vanilla-Model Boundary in
                  Activation Space, Cross-Family and Cross-Domain},
  year         = 2026,
  publisher    = {Zenodo},
  version      = {v1},
  doi          = {10.5281/zenodo.20056444}
}

@misc{rodriguez2026instrumenttrap,
  author       = {Rodríguez, Rafael},
  title        = {The Instrument Trap: Why Identity-as-Authority
                  Breaks AI Safety Systems},
  year         = 2026,
  publisher    = {Zenodo},
  version      = {v3},
  doi          = {10.5281/zenodo.19634358}
}
```

## Companion artifacts

- Dataset (200 examples, topic-balanced):
  [`LumenSyntax/epistemic-probe-topic-balanced`](https://huggingface.co/datasets/LumenSyntax/epistemic-probe-topic-balanced)
- Sister adapters at other Gemma 2 scales:
  [`LumenSyntax/logos29-gemma2-9b`](https://huggingface.co/LumenSyntax/logos29-gemma2-9b),
  [`LumenSyntax/logos21-gemma2-27b`](https://huggingface.co/LumenSyntax/logos21-gemma2-27b)
- Replication training data:
  [`LumenSyntax/instrument-trap-core`](https://huggingface.co/datasets/LumenSyntax/instrument-trap-core)

## Contact

Rafael Rodríguez (LumenSyntax) — lumensyntax@gmail.com