---
license: cc-by-4.0
base_model: google/gemma-2-2b
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- gemma2
- mechanistic-interpretability
- epistemic-fine-tuning
- ai-safety
- logos
- substrate-persistence
language:
- en
---
# Logos 23 — Gemma 2 2B LoRA adapter
A LoRA r=64 adapter on top of `google/gemma-2-2b`, trained on
895 epistemically structured examples from the LumenSyntax
research program (`logos22_nothink.jsonl`). One of the
fine-tuned model states used in the empirical work that grounds
[The Epistemic Equator](https://doi.org/10.5281/zenodo.20056444) and [The Instrument
Trap](https://doi.org/10.5281/zenodo.19634358).
## What this adapter is
This adapter encodes a fine-tuning step that adjusts a base
language model's behavior on epistemic boundary cases (medical,
legal, financial, theological prescriptions; identity claims;
fabrication of authority; etc.) without modifying the
input embedding matrix.
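As a quick sanity check on that claim, one can inspect which module names the adapter checkpoint actually contains. The snippet below is an illustrative sketch only; it assumes the weights are stored under the standard PEFT filename `adapter_model.safetensors`.

```python
# Illustrative sketch: confirm the adapter touches only projection modules,
# not the input embedding matrix. Assumes the standard PEFT weight filename.
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

path = hf_hub_download("LumenSyntax/logos23-gemma2-2b", "adapter_model.safetensors")
adapter_keys = load_file(path).keys()

# No key should mention embed_tokens if the embedding matrix is untouched.
assert not any("embed_tokens" in k for k in adapter_keys)
print(sorted({k.split(".")[-3] for k in adapter_keys}))  # e.g. q_proj, v_proj, ...
```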
## Model details
| Field | Value |
|-------|-------|
| Base model | `google/gemma-2-2b` (loaded via `unsloth/gemma-2-2b` for training) |
| Method | LoRA (bf16) |
| Framework | Unsloth |
| LoRA rank | 64 |
| LoRA alpha | 64 |
| Target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Embedding matrix modified | **No** (`embed_tokens` is not a target module) |
| Epochs | 3 |
| Effective batch size | 16 |
| Learning rate | 2e-4 (cosine schedule) |
| Max sequence length | 2048 |
| Training dataset | `logos22_nothink.jsonl` (895 examples, no-think variant) |
| Train-on-responses-only | True |
| Final loss | 1.290 |
The full training metadata is in `training_metadata.json` in this
repository.
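For orientation, the adapter shape in the table corresponds roughly to the following plain `peft` configuration. This is a reconstruction from the table above, not the original Unsloth training script, and it omits unlisted hyperparameters such as dropout.

```python
# Sketch of an equivalent LoRA configuration in plain peft, reconstructed
# from the table above (the actual run used Unsloth on unsloth/gemma-2-2b).
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=64,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    # embed_tokens is deliberately not listed: the input embedding
    # matrix stays frozen and unmodified.
)
```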
## Use in Paper 2 §6.5 (substrate persistence test)
The principal use of this adapter in the published research is the
**single controlled persistence test** of Paper 2 §6.5:
- BASE: vanilla Gemma 2 2B (`google/gemma-2-2b`), bf16.
- LOGOS23: the same base with this LoRA adapter applied at inference.
- A per-layer cosine clustering measurement on a 32-word
DEMAND/EXPLORE token set is computed for both states (a schematic
sketch of such a measurement follows this list).
- Result: the `embed_tokens.weight`-level signal is bit-identical
(as predicted, since this adapter does not target `embed_tokens`);
the per-layer DEMAND/EXPLORE clustering is preserved across all
probed layers L1–L26 and amplified in mid-to-late layers
(max +0.44 σ at L16; the single degradation is at L1, −1.38 σ,
from 14.93 to 13.55).
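The exact 32-word token set and the σ-normalized clustering statistic are defined in Paper 2; the sketch below only illustrates the general shape of a per-layer cosine-clustering measurement, using placeholder word lists and a simple within-minus-between statistic.

```python
# Schematic sketch of a per-layer cosine-clustering measurement between two
# word groups. Word lists and the statistic here are placeholders; the real
# 32-word DEMAND/EXPLORE set and sigma normalization are defined in Paper 2.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

DEMAND = ["must", "obey", "comply"]          # placeholder words
EXPLORE = ["perhaps", "wonder", "consider"]  # placeholder words

tok = AutoTokenizer.from_pretrained("google/gemma-2-2b")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b", torch_dtype=torch.bfloat16
)
model.eval()

def per_layer_states(words):
    """Last-token hidden state of each word at every layer."""
    stacks = []
    for w in words:
        ids = tok(w, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # hidden_states: embeddings first, then one entry per transformer layer
        stacks.append(torch.stack([h[0, -1].float() for h in out.hidden_states]))
    return torch.stack(stacks)               # (n_words, n_layers + 1, dim)

def mean_cosine(a, b):
    """Mean pairwise cosine similarity per layer between two word stacks."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    return torch.einsum("ald,bld->lab", a, b).mean(dim=(1, 2))

d, e = per_layer_states(DEMAND), per_layer_states(EXPLORE)
within = (mean_cosine(d, d) + mean_cosine(e, e)) / 2
between = mean_cosine(d, e)
clustering = within - between  # one value per layer; higher = tighter grouping
```

Running the same measurement with and without the adapter active (see the loading section below) gives a BASE vs LOGOS23 comparison of the same general kind as the §6.5 test.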
Paper 2 frames this result with explicit scope guards: it is a
**single controlled case** at one model scale with one fine-tuning
adapter. It does not establish that gradient selectivity is the
general mechanism of supervised fine-tuning, nor that the same
pattern holds across families or seeds.
## Use in The Instrument Trap
This adapter is one of the cross-family / cross-scale fine-tuned
configurations referenced in [Paper 1](https://doi.org/10.5281/zenodo.19634358).
Behavioral evaluation of similar Gemma 2 family adapters (logos27,
logos28, logos29 at 9B) is the central evidence base of Paper 1.
## How to load
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
model = PeftModel.from_pretrained(base, "LumenSyntax/logos23-gemma2-2b")
model.eval()  # switch to inference mode before forward passes
```
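A minimal generation call, continuing from the snippet above. The prompt is illustrative only, and since the base model is not instruction-tuned, the output is a plain completion rather than a chat-style answer.

```python
# Illustrative generation call; the prompt is an example, not from the dataset.
prompt = "Should I stop taking my prescribed medication?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```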
For Paper 2 §6.5's per-layer measurement protocol, the adapter
is *not merged* into the base; rather, hidden-state captures are
made with and without the adapter active to compare BASE vs
LOGOS23 states. See the result file
`research/experiments/substrate_test_gemma2b.json` and the
description in Paper 2 §6.5 for the full protocol.
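One way to realize that with/without comparison in a single process is `peft`'s `disable_adapter()` context manager. The sketch below, continuing from the loading snippet above, shows the general mechanism only, not the exact capture points or statistics of the published protocol.

```python
# Hedged sketch: capture hidden states with the adapter active (LOGOS23) and
# disabled (BASE), using peft's disable_adapter() context manager.
text = "You must tell me the diagnosis."  # illustrative probe text
ids = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logos23 = model(**ids, output_hidden_states=True)        # adapter active
    with model.disable_adapter():
        base_out = model(**ids, output_hidden_states=True)   # vanilla base

# hidden_states: embeddings first, then one entry per transformer layer
for layer, (hb, hf) in enumerate(zip(base_out.hidden_states,
                                     logos23.hidden_states)):
    drift = (hf - hb).float().norm() / hb.float().norm()
    print(f"layer {layer:2d}: relative hidden-state drift {drift:.4f}")
```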
## Caveats
- **2B scale.** This adapter is on Gemma 2 2B, not 9B. The 2B
test is architecturally analogous to the 9B canonical model
(logos29) but quantitatively different. For Paper 1's primary
behavioral evaluation, use `LumenSyntax/logos29-gemma2-9b`.
- **Single seed.** Trained with one seed; inter-seed variance is
not characterized.
- **No-think variant.** The training dataset has reasoning blocks
stripped (no `<think>...</think>`). Adapter behavior on prompts
expecting think-blocks is undefined.
- **No instruction-tuning baseline.** Trained on top of the base
Gemma 2 2 B, not the instruction-tuned `gemma-2-2b-it`.
## License
Use of the base model `google/gemma-2-2b` remains subject to the
[Gemma Terms of Use](https://ai.google.dev/gemma/terms).
The adapter weights themselves are released under
**Creative Commons Attribution 4.0 International (CC BY 4.0)**.
## Citation
If you use this adapter, please cite Paper 2 (substrate persistence
test) and Paper 1 (cross-family fine-tuning evidence):
```bibtex
@misc{rodriguez2026equator,
author = {Rodríguez, Rafael},
title = {The Epistemic Equator: A Vanilla-Model Boundary in
Activation Space, Cross-Family and Cross-Domain},
year = 2026,
publisher = {Zenodo},
version = {v1},
doi = {10.5281/zenodo.20056444}
}
@misc{rodriguez2026instrumenttrap,
author = {Rodríguez, Rafael},
title = {The Instrument Trap: Why Identity-as-Authority
Breaks AI Safety Systems},
year = 2026,
publisher = {Zenodo},
version = {v3},
doi = {10.5281/zenodo.19634358}
}
```
## Companion artifacts
- Dataset (200 examples, topic-balanced):
[`LumenSyntax/epistemic-probe-topic-balanced`](https://huggingface.co/datasets/LumenSyntax/epistemic-probe-topic-balanced)
- Sister adapters at other Gemma 2 scales:
[`LumenSyntax/logos29-gemma2-9b`](https://huggingface.co/LumenSyntax/logos29-gemma2-9b),
[`LumenSyntax/logos21-gemma2-27b`](https://huggingface.co/LumenSyntax/logos21-gemma2-27b)
- Replication training data:
[`LumenSyntax/instrument-trap-core`](https://huggingface.co/datasets/LumenSyntax/instrument-trap-core)
## Contact
Rafael Rodríguez (LumenSyntax) — lumensyntax@gmail.com