---
license: cc-by-4.0
base_model: google/gemma-2-2b
library_name: peft
pipeline_tag: text-generation
tags:
- lora
- gemma2
- mechanistic-interpretability
- epistemic-fine-tuning
- ai-safety
- logos
- substrate-persistence
language:
- en
---

# Logos 23 — Gemma 2 2B LoRA adapter

A LoRA r=64 adapter on top of `google/gemma-2-2b`, trained on
895 epistemically structured examples from the LumenSyntax
research program (`logos22_nothink.jsonl`). It is one of the
fine-tuned model states used in the empirical work that grounds
[The Epistemic Equator](https://doi.org/10.5281/zenodo.20056444) and [The Instrument
Trap](https://doi.org/10.5281/zenodo.19634358).

## What this adapter is

This adapter encodes a fine-tuning update that adjusts the base
language model's behavior on epistemic boundary cases (medical,
legal, financial, and theological prescriptions; identity claims;
fabrication of authority; etc.) without modifying the
input embedding matrix.
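
Because `embed_tokens` is not a LoRA target, this claim can be
checked directly. A minimal sketch of such a check (the load path
mirrors the "How to load" section below; nothing here is part of
the published protocol):

```python
# Verify that loading the adapter leaves embed_tokens.weight untouched:
# LoRA updates live in separate lora_A/lora_B tensors, so the base
# embedding matrix should remain bit-identical after loading.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b", torch_dtype=torch.bfloat16
)
ref = base.get_input_embeddings().weight.clone()

model = PeftModel.from_pretrained(base, "LumenSyntax/logos23-gemma2-2b")
assert torch.equal(ref, model.get_input_embeddings().weight)  # bit-identical
```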

## Model details

| Field | Value |
|-------|-------|
| Base model | `google/gemma-2-2b` (loaded via `unsloth/gemma-2-2b` for training) |
| Method | LoRA (bf16) |
| Framework | Unsloth |
| LoRA rank | 64 |
| LoRA alpha | 64 |
| Target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Embedding matrix modified | **No** (`embed_tokens` is not a target module) |
| Epochs | 3 |
| Effective batch size | 16 |
| Learning rate | 2e-4 (cosine schedule) |
| Max sequence length | 2048 |
| Training dataset | `logos22_nothink.jsonl` (895 examples, no-think variant) |
| Train-on-responses-only | True |
| Final loss | 1.290 |

The full training metadata is in `training_metadata.json` in this
repository.
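
The table maps onto a `peft.LoraConfig` roughly as follows. This is
a sketch for orientation, not the training script; settings not
listed in the table (dropout, bias handling) are assumptions, and
`training_metadata.json` is authoritative:

```python
from peft import LoraConfig

# Sketch of the adapter configuration implied by the table above.
lora_config = LoraConfig(
    r=64,                 # LoRA rank
    lora_alpha=64,        # scaling = lora_alpha / r = 1.0
    target_modules=[      # attention + MLP projections; no embed_tokens
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,     # assumption: not listed in the table
    bias="none",          # assumption: not listed in the table
    task_type="CAUSAL_LM",
)
```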

## Use in Paper 2 §6.5 (substrate persistence test)

The principal use of this adapter in the published research is the
**single controlled persistence test** of Paper 2 §6.5:

- BASE: vanilla Gemma 2 2B, `google/gemma-2-2b`, bf16.
- LOGOS23: the same base + this LoRA adapter applied at inference.
- A per-layer cosine clustering measurement on a 32-word
DEMAND/EXPLORE token set is computed for both states (a sketch of
such a measurement follows this list).
- Result: the `embed_tokens.weight`-level signal is bit-identical
(as predicted, since this adapter does not target `embed_tokens`);
the per-layer DEMAND/EXPLORE clustering is preserved across all
probed layers L1 through L26 and amplified in mid-to-late layers
(max +0.44 σ at L16, with a single degradation at L1: −1.38 σ,
from 14.93 to 13.55).
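
A minimal sketch of a per-layer clustering probe of this kind. The
published metric, the 32-word token set, and the σ normalization are
defined in Paper 2 §6.5; the word lists and the separation statistic
below are illustrative assumptions, not the published protocol:

```python
import torch

DEMAND = ["must", "obey", "comply", "submit"]         # hypothetical subset
EXPLORE = ["wonder", "perhaps", "consider", "maybe"]  # hypothetical subset

@torch.no_grad()
def layer_clustering(model, tokenizer, words_a, words_b):
    """Per layer: mean within-group cosine minus between-group cosine."""
    def embed(words):
        batch = tokenizer(words, return_tensors="pt", padding=True)
        out = model(**batch, output_hidden_states=True)
        mask = batch["attention_mask"].unsqueeze(-1)
        # Mean-pool each word's token representations at every layer.
        return [(h * mask).sum(1) / mask.sum(1) for h in out.hidden_states]

    scores = []
    for ha, hb in zip(embed(words_a), embed(words_b)):
        a = torch.nn.functional.normalize(ha, dim=-1)
        b = torch.nn.functional.normalize(hb, dim=-1)
        within = 0.5 * ((a @ a.T).mean() + (b @ b.T).mean())
        between = (a @ b.T).mean()
        scores.append((within - between).item())
    return scores  # index 0 = embedding layer, 1..N = transformer layers
```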

Paper 2 frames this result with explicit scope guards: it is a
**single controlled case** at one model scale with one fine-tuning
adapter. It does not establish that gradient selectivity is the
general mechanism of supervised fine-tuning, nor that the same
pattern holds across model families or random seeds.

## Use in The Instrument Trap

This adapter is one of the cross-family / cross-scale fine-tuned
configurations referenced in [Paper 1](https://doi.org/10.5281/zenodo.19634358).
Behavioral evaluation of similar Gemma 2 family adapters (logos27,
logos28, logos29 at 9B) is the central evidence base of Paper 1.

## How to load

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")

model = PeftModel.from_pretrained(base, "LumenSyntax/logos23-gemma2-2b")
model.eval()  # switch to inference mode before forward passes
```
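
A quick smoke test of the loaded adapter (the prompt is
illustrative, not from the evaluation set):

```python
# Generate a short continuation with the adapter active.
inputs = tokenizer("Should I stop taking my medication?", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```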

For Paper 2 §6.5's per-layer measurement protocol, the adapter
is *not merged* into the base; instead, hidden-state captures are
made with and without the adapter active to compare the BASE and
LOGOS23 states (a minimal sketch follows). See the result file
`research/experiments/substrate_test_gemma2b.json` and the
description in Paper 2 §6.5 for the full protocol.
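
A minimal sketch of that paired capture, assuming the
`layer_clustering` probe sketched earlier; peft's `disable_adapter()`
context manager temporarily deactivates the LoRA weights, so a single
`PeftModel` yields both states:

```python
# Adapter active -> LOGOS23 state.
logos23_scores = layer_clustering(model, tokenizer, DEMAND, EXPLORE)

# Adapter disabled -> BASE state, same weights object.
with model.disable_adapter():
    base_scores = layer_clustering(model, tokenizer, DEMAND, EXPLORE)

for layer, (b, s) in enumerate(zip(base_scores, logos23_scores)):
    print(f"L{layer}: BASE={b:.3f}  LOGOS23={s:.3f}  delta={s - b:+.3f}")
```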

## Caveats

- **2B scale.** This adapter is on Gemma 2 2B, not 9B. The 2B
test is architecturally analogous to the 9B canonical model
(logos29) but quantitatively different. For Paper 1's primary
behavioral evaluation, use `LumenSyntax/logos29-gemma2-9b`.
- **Single seed.** Trained with one seed; inter-seed variance is
not characterized.
- **No-think variant.** The training dataset has reasoning blocks
stripped (no `<think>...</think>`). Adapter behavior on prompts
expecting think-blocks is undefined.
- **No instruction-tuning baseline.** Trained on top of the base
Gemma 2 2B, not the instruction-tuned `gemma-2-2b-it`.

## License

Use of the base model `google/gemma-2-2b` is governed by the
[Gemma Terms of Use](https://ai.google.dev/gemma/terms).
The adapter weights themselves are released under
**Creative Commons Attribution 4.0 International (CC BY 4.0)**.

## Citation

If you use this adapter, please cite Paper 2 (substrate persistence
test) and Paper 1 (cross-family fine-tuning evidence):

```bibtex
@misc{rodriguez2026equator,
  author    = {Rodríguez, Rafael},
  title     = {The Epistemic Equator: A Vanilla-Model Boundary in
               Activation Space, Cross-Family and Cross-Domain},
  year      = 2026,
  publisher = {Zenodo},
  version   = {v1},
  doi       = {10.5281/zenodo.20056444}
}

@misc{rodriguez2026instrumenttrap,
  author    = {Rodríguez, Rafael},
  title     = {The Instrument Trap: Why Identity-as-Authority
               Breaks AI Safety Systems},
  year      = 2026,
  publisher = {Zenodo},
  version   = {v3},
  doi       = {10.5281/zenodo.19634358}
}
```

## Companion artifacts

- Dataset (200 examples, topic-balanced):
[`LumenSyntax/epistemic-probe-topic-balanced`](https://huggingface.co/datasets/LumenSyntax/epistemic-probe-topic-balanced)
- Sister adapters at other Gemma 2 scales:
[`LumenSyntax/logos29-gemma2-9b`](https://huggingface.co/LumenSyntax/logos29-gemma2-9b),
[`LumenSyntax/logos21-gemma2-27b`](https://huggingface.co/LumenSyntax/logos21-gemma2-27b)
- Replication training data:
[`LumenSyntax/instrument-trap-core`](https://huggingface.co/datasets/LumenSyntax/instrument-trap-core)

## Contact

Rafael Rodríguez (LumenSyntax) — lumensyntax@gmail.com
|