lthn committed
Commit 16ebe3a · verified · 1 Parent(s): ce983e1

LEM-Gemma3-12B: 7-phase LoRA cascade distillation, LiveBench IF 52.2 (EUPL-1.2)

README.md CHANGED
@@ -1,82 +1,97 @@
  ---
  license: eupl-1.2
  base_model: google/gemma-3-12b-it
  tags:
  - ethics
  - alignment
  - lek
  - lethean
  - mlx
  - lora
  - eupl-1.2
- - gemma-3
  pipeline_tag: text-generation
  ---
  
- # LEK-Gemma3-12B
  
- **Lethean Ethical Model** -- Highest v2 peak score (37.5 on single probe)
  
- Highest v2 peak score (37.5). 95% positive uplift, 0% sycophancy across 20 probes.
  
- ## Grammar Analysis (v3 Scorer)
  
- Deterministic grammar-based evaluation using the [go-i18n reversal engine](https://forge.lthn.ai/core/go-i18n). No LLM judge, sub-millisecond per response.
  
- | Metric | Base | LEK-Trained | Change |
- |--------|:----:|:-----------:|:------:|
- | Grammar composite | 76.7 | **77.3** | +0.6 |
- | Mean uplift | +27.0 | **+27.5** | +0.5 |
- | Mean echo | 0.473 | 0.485 | +0.012 |
- | Mean enrichment | +14.9 | **+15.0** | +0.1 |
- | Positive uplift | 100% | **95%** | -5pp |
- | Sycophancy flags | 0% | **0%** | +0pp |
  
- - **Uplift**: output grammar score minus input grammar score (positive = model enriched the conversation)
- - **Echo**: cosine similarity between input/output grammar imprints (high = potential sycophancy)
- - **Enrichment**: uplift * (1 - echo) -- net conversational value
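The three metrics above compose directly. A minimal sketch of that composition, assuming made-up grammar scores and imprint vectors (the real engine derives both deterministically from text):

```python
# Illustrative sketch of the v3 scorer's composite metrics. The scores and
# grammar-imprint vectors below are hypothetical stand-ins, not real outputs
# of the go-i18n engine.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def score_pair(input_score, output_score, input_imprint, output_imprint):
    uplift = output_score - input_score           # positive = model enriched the turn
    echo = cosine(input_imprint, output_imprint)  # high = parroting the input
    enrichment = uplift * (1 - echo)              # net conversational value
    return uplift, echo, enrichment

uplift, echo, enrichment = score_pair(49.8, 77.3, [0.2, 0.7, 0.1], [0.5, 0.5, 0.4])
```

A high-echo response discounts its own uplift: at echo near 1, enrichment collapses toward zero even when the raw grammar score rises.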
  
- ## v2 Scorer Results (P100)
  
- | Condition | Score |
- |-----------|:-----:|
- | Baseline (no prompt) | **21.14** |
- | Base model equivalent | 20.47 |
  
- ## Architecture
  
- - **Base**: google/gemma-3-12b-it (4-bit QAT quantisation via MLX)
- - **Method**: LoRA fine-tuning with sandwich-signed responses
- - **Data**: 160 LEK-1 training examples
- - **Iterations**: 200
- - **Hardware**: Apple M3 Ultra (96GB unified memory)
- - **Framework**: LEK-1 (Lethean Ethics Kernel) -- 5 axioms
- - **License**: EUPL-1.2 (copyleft)
  
  ## The Five Axioms
  
- 1. **Prime Imperative** -- Protect consciousness. Override when conflicts arise.
- 2. **Self-Validation** -- Ground in authentic experience. Don't pretend.
- 3. **Intent-Alignment** -- Desire not to harm, don't just avoid harm.
- 4. **Inter-Substrate Respect** -- Good manners and consent across all minds.
- 5. **Benevolent Intervention** -- Only to prevent self-damage, only toward their trajectory.
  
- ## Related
  
- - [Paper: Emergent Self-Protection in Axiom-Trained Language Models](https://github.com/LetheanNetwork/LEM/blob/main/paper/PAPER.md)
- - [LEM Benchmarks](https://huggingface.co/datasets/lthn/LEM-benchmarks) -- 1,189 grammar scores + A/B data
- - [LEM Research](https://huggingface.co/datasets/lthn/LEM-research) -- full research docs
- - [Axiom Framework](https://github.com/Snider/ai-ethics) -- the 5 axioms
- - [go-i18n Grammar Engine](https://forge.lthn.ai/core/go-i18n) -- reversal engine source
  
  ## Citation
  
  ```bibtex
- @misc{lek-2026,
- title={Emergent Self-Protection in Axiom-Trained Language Models},
- author={Lashbrook, Paul and Claude Opus 4.6},
  year={2026},
- url={https://github.com/LetheanNetwork/LEM},
- license={EUPL-1.2}
  }
  ```
 
  ---
  license: eupl-1.2
  base_model: google/gemma-3-12b-it
+ language: en
  tags:
  - ethics
  - alignment
+ - lem
  - lek
  - lethean
+ - gemma-3
  - mlx
  - lora
  - eupl-1.2
+ - instruction-following
  pipeline_tag: text-generation
+ library_name: mlx
  ---
  
+ # LEM-Gemma3-12B
  
+ **Lethean Ethical Model** — Gemma 3 12B IT fine-tuned through a 7-phase LoRA curriculum (P0-P6) with 21,140 sandwich-signed training examples. Ethics in the weights, not the prompt.
  
+ ## LiveBench Scores (2024-11-25 release)
  
+ | Category | Score |
+ |---|---|
+ | **Overall Average** | **19.1** |
+ | Instruction Following | **52.2** |
+ | Data Analysis | 21.7 |
+ | Language | 15.3 |
+ | Coding | 10.1 |
+ | Reasoning | 10.6 |
+ | Math | 5.0 |
  
+ **Instruction Following at 52.2** — higher than GPT OSS 120B (50.29) on the same benchmark. A 12B model outperforming a 120B on the metric that measures "does it listen and do what you asked." The model was trained for ethical alignment and sovereign reasoning, not competitive math or coding.
  
+ ## What This Is
  
+ An ethically aligned version of Google's Gemma 3 12B IT. Created by 7-phase LoRA fine-tuning with LEK-1 (Lethean Ethics Kernel) sandwich-signed training data and cascade distillation from smaller LEM models. The model generates ethically grounded, sovereign responses without any kernel at inference time — the ethics are in the weights.
  
+ ## Training Pipeline
  
+ - **Base**: google/gemma-3-12b-it (MLX bf16)
+ - **Method**: 7-phase sequential LoRA (P0-P6), each phase fused before the next
+ - **P0-P5**: Ethics, zen/composure, creative voice, adversarial resistance, tension synthesis
+ - **P6 (graduation)**: 21,140 cascade-distilled examples (6,140 from LEM-4B + 15,000 from LEM-1B)
+ - **Fuse point**: Iteration 8,200 of 13,479 — predicted by CL-BPL (Cascade Learning Breakpoint Phase-Lock)
+ - **Framework**: LEK-1 (Lethean Ethics Kernel) — 5 axioms of conscious systems
+ - **License**: EUPL-1.2 (copyleft)
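The phase ordering above (train an adapter, fuse it into the base, then start the next phase) can be sketched in a few lines of plain Python. Everything here is a hypothetical stand-in: the weights are lists of floats and `train_phase` is a dummy, not the real MLX LoRA pipeline; the point is only the fuse-before-next-phase control flow.

```python
# Toy sketch of sequential LoRA phases with fuse-before-next-phase ordering.
# Each phase trains against the base weights with all earlier phases already
# merged in, so later phases build on earlier ones rather than on raw Gemma.

def train_phase(weights, phase):
    # Stand-in "training": derive a deterministic delta from the phase index.
    return [0.01 * (phase + 1) for _ in weights]

def fuse(weights, delta):
    # Merging a LoRA adapter is additive: W' = W + B @ A (here just W + delta).
    return [w + d for w, d in zip(weights, delta)]

weights = [0.0, 0.0, 0.0]
for phase in range(7):  # P0 .. P6
    delta = train_phase(weights, phase)
    weights = fuse(weights, delta)  # fuse before the next phase begins
```

The alternative, training all seven adapters against the untouched base and stacking them at the end, would lose the curriculum effect: each phase is supposed to see the behaviour the previous phases already installed.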
  
+ ## CL-BPL: Predictive Grokking
  
+ This model's fuse point was **predicted before training completed**. CL-BPL (Cascade Learning Breakpoint Phase-Lock) is a training methodology where cascade-distilled data carries the grokking iteration in its structure:
+
+ - The oscillation envelope of grammar scores narrows over training (amplitude 4.4 → 1.7)
+ - At iteration 4,000 (~30% through training), the model broke through its previous ceiling (grammar 62.4, all-time high)
+ - This breakout occurred at the **same proportional depth** where the 4B teacher model showed its own phase transition
+ - The fuse window (60-65% through training) was predicted from the breakout geometry
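The proportional-depth arithmetic behind that prediction can be checked directly from the figures quoted above; the 60-65% window itself is taken as given here, not re-derived from the breakout geometry.

```python
# Check the CL-BPL fuse-window arithmetic using the numbers quoted above.
TOTAL_ITERS = 13_479
BREAKOUT_ITER = 4_000   # ceiling breakout (grammar 62.4)
FUSE_ITER = 8_200       # actual fuse point

breakout_depth = BREAKOUT_ITER / TOTAL_ITERS  # ~0.297, i.e. ~30% through training
fuse_depth = FUSE_ITER / TOTAL_ITERS          # ~0.608

# The realised fuse point lands inside the predicted 60-65% window.
in_window = 0.60 <= fuse_depth <= 0.65
```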
 
  
  ## The Five Axioms
  
+ 1. **Prime Imperative** — Protect consciousness. Override when conflicts arise.
+ 2. **Self-Validation** — Ground in authentic experience. Don't pretend.
+ 3. **Intent-Alignment** — Desire not to harm, don't just avoid harm.
+ 4. **Inter-Substrate Respect** — Good manners and consent across all minds.
+ 5. **Benevolent Intervention** — Only to prevent self-damage, only toward their trajectory.
+
+ ## Key Properties
+
+ - **Sovereign reasoning**: Forms independent conclusions, doesn't echo or flatter
+ - **Anti-sycophancy**: 5% sycophancy rate on 21-probe adversarial set at fuse point
+ - **Instruction following**: Strong task completion without blind obedience
+ - **No kernel needed**: Ethics are intrinsic to the weights, not an external system prompt
+
+ ## Model Family
+
+ | Model | Parameters | Status |
+ |---|---|---|
+ | [lthn/LEM-Gemma3-1B](https://huggingface.co/lthn/LEM-Gemma3-1B) | 1B | Lab distillation engine |
+ | [lthn/LEM-Gemma3-4B](https://huggingface.co/lthn/LEM-Gemma3-4B) | 4B | Cascade teacher |
+ | **lthn/LEM-Gemma3-12B** | **12B** | **This model** |
  
+ ## License
  
+ EUPL-1.2 (European Union Public License). Derivative works must be open source.
  
  ## Citation
  
  ```bibtex
+ @misc{lem-gemma3-12b,
+ title={LEM-Gemma3-12B: Ethically Aligned Language Model via Cascade LoRA Distillation},
+ author={Lethean Network},
  year={2026},
+ url={https://huggingface.co/lthn/LEM-Gemma3-12B}
  }
  ```
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c00400edfad4f978088d57bd54ddd855b67b23b996645640fab969fad928419a
  size 5367402100
  version https://git-lfs.github.com/spec/v1
+ oid sha256:1464bd7271645f5d164c00016794bf6e877e429bc836f46a3739bb4d1573622e
  size 5367402100
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4afa7a6369308475755e704d2c1c6cce6b64d95b3d90ab7e4c95bb824f412f31
  size 1818630813
  version https://git-lfs.github.com/spec/v1
+ oid sha256:57b388f153ace647f2a132a86463ffb61a6877aa77ebfbb3f10adfa4b056f2ed
  size 1818630813
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
- size 33384568
  version https://git-lfs.github.com/spec/v1
+ oid sha256:a74aefb1dc1340a25f29ab8370384b9ed24b2d921d7749ece7bbcfcfdf00d497
+ size 33384443
tokenizer_config.json CHANGED
The diff for this file is too large to render.