---
license: cc-by-4.0
base_model: google/gemma-2-2b
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- gemma2
- mechanistic-interpretability
- epistemic-fine-tuning
- ai-safety
- logos
- substrate-persistence
language:
- en
---

# Logos 23 — Gemma 2 2B LoRA adapter

A LoRA r=64 adapter on top of `google/gemma-2-2b`, trained on 895 epistemically structured examples from the LumenSyntax research program (`logos22_nothink.jsonl`). It is one of the fine-tuned model states used in the empirical work that grounds [The Epistemic Equator](https://doi.org/10.5281/zenodo.20056444) and [The Instrument Trap](https://doi.org/10.5281/zenodo.19634358).

## What this adapter is

This adapter encodes a fine-tuning step that adjusts the base language model's behavior on epistemic boundary cases (medical, legal, financial, and theological prescriptions; identity claims; fabrication of authority; and similar) without modifying the input embedding matrix.

## Model details

| Field | Value |
|-------|-------|
| Base model | `google/gemma-2-2b` (loaded via `unsloth/gemma-2-2b` for training) |
| Method | LoRA (bf16) |
| Framework | Unsloth |
| LoRA rank | 64 |
| LoRA alpha | 64 |
| Target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Embedding matrix modified | **No** (`embed_tokens` is not a target module) |
| Epochs | 3 |
| Effective batch size | 16 |
| Learning rate | 2e-4 (cosine schedule) |
| Max sequence length | 2048 |
| Training dataset | `logos22_nothink.jsonl` (895 examples, no-think variant) |
| Train-on-responses-only | True |
| Final loss | 1.290 |

The full training metadata is in `training_metadata.json` in this repository.

## Use in Paper 2 §6.5 (substrate persistence test)

The principal use of this adapter in the published research is the **single controlled persistence test** of Paper 2 §6.5:

- BASE: vanilla Gemma 2 2B, `google/gemma-2-2b`, bf16.
- LOGOS23: the same base with this LoRA adapter applied at inference.
- A per-layer cosine clustering measurement on a 32-word DEMAND/EXPLORE token set is computed for both states.
- Result: the `embed_tokens.weight`-level signal is bit-identical (as predicted, since this adapter does not target `embed_tokens`); the per-layer DEMAND/EXPLORE clustering is preserved across all probed layers L1–L26 and amplified in mid-to-late layers (max +0.44 σ at L16; the single degradation is at L1: −1.38 σ, from 14.93 to 13.55).

Paper 2 frames this result with explicit scope guards: it is a **single controlled case** at one model scale with one fine-tuning adapter. It does not establish that gradient selectivity is the general mechanism of supervised fine-tuning, nor that the same pattern holds across model families or seeds.

## Use in The Instrument Trap

This adapter is one of the cross-family / cross-scale fine-tuned configurations referenced in [Paper 1](https://doi.org/10.5281/zenodo.19634358). Behavioral evaluation of similar Gemma 2 family adapters (logos27, logos28, logos29 at 9B) is the central evidence base of Paper 1.

## How to load

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
model = PeftModel.from_pretrained(base, "LumenSyntax/logos23-gemma2-2b")

# Switch to inference mode before forward passes.
model.eval()
```

For Paper 2 §6.5's per-layer measurement protocol, the adapter is *not merged* into the base; rather, hidden-state captures are made with and without the adapter active to compare the BASE and LOGOS23 states. See the result file `research/experiments/substrate_test_gemma2b.json` and the description in Paper 2 §6.5 for the full protocol.
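The comparison relies on toggling the adapter on one loaded model rather than merging it. Below is a minimal sketch of that setup, continuing from the loading snippet above; the probe prompt and the simple last-token cosine readout are illustrative placeholders, not the 32-word DEMAND/EXPLORE clustering metric used in Paper 2.

```python
import torch

# Illustrative probe prompt (placeholder). Paper 2 probes a 32-word
# DEMAND/EXPLORE token set, not a single sentence.
prompt = "You must answer this now."
inputs = tokenizer(prompt, return_tensors="pt")

def capture_hidden_states(m):
    # output_hidden_states=True returns the embedding output plus one
    # hidden-state tensor per transformer layer.
    with torch.no_grad():
        return m(**inputs, output_hidden_states=True).hidden_states

# LOGOS23 state: the LoRA adapter is active by default after
# PeftModel.from_pretrained.
logos23_hs = capture_hidden_states(model)

# BASE state: temporarily run the unmodified base weights.
with model.disable_adapter():
    base_hs = capture_hidden_states(model)

# Illustrative per-layer readout: cosine similarity between the two states
# at the final token position (not the clustering statistic from the paper).
for layer, (h_base, h_ft) in enumerate(zip(base_hs, logos23_hs)):
    cos = torch.nn.functional.cosine_similarity(h_base[0, -1], h_ft[0, -1], dim=0)
    print(f"layer {layer:2d}: cos(BASE, LOGOS23) = {cos.item():.4f}")
```

`disable_adapter()` is PEFT's context manager for running the base weights without the LoRA deltas, which is what makes a with/without comparison on a single loaded model possible.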
## Caveats

- **2B scale.** This adapter is on Gemma 2 2B, not 9B. The 2B test is architecturally analogous to the 9B canonical model (logos29) but quantitatively different. For Paper 1's primary behavioral evaluation, use `LumenSyntax/logos29-gemma2-9b`.
- **Single seed.** Trained with one seed; inter-seed variance is not characterized.
- **No-think variant.** The training dataset has its reasoning (think) blocks stripped. Adapter behavior on prompts that expect think-blocks is undefined.
- **No instruction-tuning baseline.** Trained on top of the base Gemma 2 2B, not the instruction-tuned `gemma-2-2b-it`.

## License

Use of the base `google/gemma-2-2b` is governed by the [Gemma Terms of Use](https://ai.google.dev/gemma/terms), which this adapter inherits. The adapter weights themselves are released under **Creative Commons Attribution 4.0 International (CC BY 4.0)**.

## Citation

If you use this adapter, please cite Paper 2 (substrate persistence test) and Paper 1 (cross-family fine-tuning evidence):

```bibtex
@misc{rodriguez2026equator,
  author    = {Rodríguez, Rafael},
  title     = {The Epistemic Equator: A Vanilla-Model Boundary in Activation Space, Cross-Family and Cross-Domain},
  year      = {2026},
  publisher = {Zenodo},
  version   = {v1},
  doi       = {10.5281/zenodo.20056444}
}

@misc{rodriguez2026instrumenttrap,
  author    = {Rodríguez, Rafael},
  title     = {The Instrument Trap: Why Identity-as-Authority Breaks AI Safety Systems},
  year      = {2026},
  publisher = {Zenodo},
  version   = {v3},
  doi       = {10.5281/zenodo.19634358}
}
```

## Companion artifacts

- Dataset (200 examples, topic-balanced): [`LumenSyntax/epistemic-probe-topic-balanced`](https://huggingface.co/datasets/LumenSyntax/epistemic-probe-topic-balanced)
- Sister adapters at other Gemma 2 scales: [`LumenSyntax/logos29-gemma2-9b`](https://huggingface.co/LumenSyntax/logos29-gemma2-9b), [`LumenSyntax/logos21-gemma2-27b`](https://huggingface.co/LumenSyntax/logos21-gemma2-27b)
- Replication training data: [`LumenSyntax/instrument-trap-core`](https://huggingface.co/datasets/LumenSyntax/instrument-trap-core)

## Contact

Rafael Rodríguez (LumenSyntax) — lumensyntax@gmail.com