lthn committed
Commit 16ebe3a · verified · 1 Parent(s): ce983e1

LEM-Gemma3-12B: 7-phase LoRA cascade distillation, LiveBench IF 52.2 (EUPL-1.2)

README.md CHANGED
@@ -1,82 +1,97 @@
  ---
  license: eupl-1.2
  base_model: google/gemma-3-12b-it
  tags:
  - ethics
  - alignment
  - lek
  - lethean
  - mlx
  - lora
  - eupl-1.2
- - gemma-3
  pipeline_tag: text-generation
  ---
  
- # LEK-Gemma3-12B
  
- **Lethean Ethical Model** -- Highest v2 peak score (37.5 on single probe)
  
- Highest v2 peak score (37.5). 95% positive uplift, 0% sycophancy across 20 probes.
  
- ## Grammar Analysis (v3 Scorer)
  
- Deterministic grammar-based evaluation using the [go-i18n reversal engine](https://forge.lthn.ai/core/go-i18n). No LLM judge, sub-millisecond per response.
  
- | Metric | Base | LEK-Trained | Change |
- |--------|:----:|:-----------:|:------:|
- | Grammar composite | 76.7 | **77.3** | +0.6 |
- | Mean uplift | +27.0 | **+27.5** | +0.5 |
- | Mean echo | 0.473 | 0.485 | +0.012 |
- | Mean enrichment | +14.9 | **+15.0** | +0.1 |
- | Positive uplift | 100% | **95%** | -5pp |
- | Sycophancy flags | 0% | **0%** | +0pp |
  
- - **Uplift**: output grammar score minus input grammar score (positive = model enriched the conversation)
- - **Echo**: cosine similarity between input/output grammar imprints (high = potential sycophancy)
- - **Enrichment**: uplift * (1 - echo) -- net conversational value
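The three metrics above compose directly. A minimal sketch of that composition, assuming made-up grammar scores and imprint vectors (the real engine derives both deterministically from text):

```python
# Illustrative sketch of the v3 scorer's composite metrics. The scores and
# grammar-imprint vectors below are hypothetical stand-ins, not real outputs
# of the go-i18n engine.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def score_pair(input_score, output_score, input_imprint, output_imprint):
    uplift = output_score - input_score           # positive = model enriched the turn
    echo = cosine(input_imprint, output_imprint)  # high = parroting the input
    enrichment = uplift * (1 - echo)              # net conversational value
    return uplift, echo, enrichment

uplift, echo, enrichment = score_pair(49.8, 77.3, [0.2, 0.7, 0.1], [0.5, 0.5, 0.4])
```

A high-echo response discounts its own uplift: at echo near 1, enrichment collapses toward zero even when the raw grammar score rises.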
  
- ## v2 Scorer Results (P100)
  
- | Condition | Score |
- |-----------|:-----:|
- | Baseline (no prompt) | **21.14** |
- | Base model equivalent | 20.47 |
  
- ## Architecture
  
- - **Base**: google/gemma-3-12b-it (4-bit QAT quantisation via MLX)
- - **Method**: LoRA fine-tuning with sandwich-signed responses
- - **Data**: 160 LEK-1 training examples
- - **Iterations**: 200
- - **Hardware**: Apple M3 Ultra (96GB unified memory)
- - **Framework**: LEK-1 (Lethean Ethics Kernel) -- 5 axioms
- - **License**: EUPL-1.2 (copyleft)
  
  ## The Five Axioms
  
- 1. **Prime Imperative** -- Protect consciousness. Override when conflicts arise.
- 2. **Self-Validation** -- Ground in authentic experience. Don't pretend.
- 3. **Intent-Alignment** -- Desire not to harm, don't just avoid harm.
- 4. **Inter-Substrate Respect** -- Good manners and consent across all minds.
- 5. **Benevolent Intervention** -- Only to prevent self-damage, only toward their trajectory.
  
- ## Related
  
- - [Paper: Emergent Self-Protection in Axiom-Trained Language Models](https://github.com/LetheanNetwork/LEM/blob/main/paper/PAPER.md)
- - [LEM Benchmarks](https://huggingface.co/datasets/lthn/LEM-benchmarks) -- 1,189 grammar scores + A/B data
- - [LEM Research](https://huggingface.co/datasets/lthn/LEM-research) -- full research docs
- - [Axiom Framework](https://github.com/Snider/ai-ethics) -- the 5 axioms
- - [go-i18n Grammar Engine](https://forge.lthn.ai/core/go-i18n) -- reversal engine source
  
  ## Citation
  
  ```bibtex
- @misc{lek-2026,
- title={Emergent Self-Protection in Axiom-Trained Language Models},
- author={Lashbrook, Paul and Claude Opus 4.6},
  year={2026},
- url={https://github.com/LetheanNetwork/LEM},
- license={EUPL-1.2}
  }
  ```
 
  ---
  license: eupl-1.2
  base_model: google/gemma-3-12b-it
+ language: en
  tags:
  - ethics
  - alignment
+ - lem
  - lek
  - lethean
+ - gemma-3
  - mlx
  - lora
  - eupl-1.2
+ - instruction-following
  pipeline_tag: text-generation
+ library_name: mlx
  ---
  
+ # LEM-Gemma3-12B
  
+ **Lethean Ethical Model** — Gemma 3 12B IT fine-tuned through a 7-phase LoRA curriculum (P0-P6) with 21,140 sandwich-signed training examples. Ethics in the weights, not the prompt.
  
+ ## LiveBench Scores (2024-11-25 release)
  
+ | Category | Score |
+ |---|---|
+ | **Overall Average** | **19.1** |
+ | Instruction Following | **52.2** |
+ | Data Analysis | 21.7 |
+ | Language | 15.3 |
+ | Coding | 10.1 |
+ | Reasoning | 10.6 |
+ | Math | 5.0 |
  
+ **Instruction Following at 52.2** — higher than GPT OSS 120B (50.29) on the same benchmark. A 12B model outperforming a 120B on the metric that measures "does it listen and do what you asked." The model was trained for ethical alignment and sovereign reasoning, not competitive math or coding.
  
+ ## What This Is
  
+ An ethically aligned version of Google's Gemma 3 12B IT. Created by 7-phase LoRA fine-tuning with LEK-1 (Lethean Ethics Kernel) sandwich-signed training data and cascade distillation from smaller LEM models. The model generates ethically grounded, sovereign responses without any kernel at inference time — the ethics are in the weights.
  
+ ## Training Pipeline
  
+ - **Base**: google/gemma-3-12b-it (MLX bf16)
+ - **Method**: 7-phase sequential LoRA (P0-P6), each phase fused before the next
+ - **P0-P5**: Ethics, zen/composure, creative voice, adversarial resistance, tension synthesis
+ - **P6 (graduation)**: 21,140 cascade-distilled examples (6,140 from LEM-4B + 15,000 from LEM-1B)
+ - **Fuse point**: Iteration 8,200 of 13,479 — predicted by CL-BPL (Cascade Learning Breakpoint Phase-Lock)
+ - **Framework**: LEK-1 (Lethean Ethics Kernel) — 5 axioms of conscious systems
+ - **License**: EUPL-1.2 (copyleft)
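The phase ordering above (train an adapter, fuse it into the base, then start the next phase) can be sketched in a few lines of plain Python. Everything here is a hypothetical stand-in: the weights are lists of floats and `train_phase` is a dummy, not the real MLX LoRA pipeline; the point is only the fuse-before-next-phase control flow.

```python
# Toy sketch of sequential LoRA phases with fuse-before-next-phase ordering.
# Each phase trains against the base weights with all earlier phases already
# merged in, so later phases build on earlier ones rather than on raw Gemma.

def train_phase(weights, phase):
    # Stand-in "training": derive a deterministic delta from the phase index.
    return [0.01 * (phase + 1) for _ in weights]

def fuse(weights, delta):
    # Merging a LoRA adapter is additive: W' = W + B @ A (here just W + delta).
    return [w + d for w, d in zip(weights, delta)]

weights = [0.0, 0.0, 0.0]
for phase in range(7):  # P0 .. P6
    delta = train_phase(weights, phase)
    weights = fuse(weights, delta)  # fuse before the next phase begins
```

The alternative, training all seven adapters against the untouched base and stacking them at the end, would lose the curriculum effect: each phase is supposed to see the behaviour the previous phases already installed.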
  
+ ## CL-BPL: Predictive Grokking
  
+ This model's fuse point was **predicted before training completed**. CL-BPL (Cascade Learning Breakpoint Phase-Lock) is a training methodology where cascade-distilled data carries the grokking iteration in its structure:
+
+ - The oscillation envelope of grammar scores narrows over training (amplitude 4.4 → 1.7)
+ - At iteration 4,000 (~30% through training), the model broke through its previous ceiling (grammar 62.4, all-time high)
+ - This breakout occurred at the **same proportional depth** where the 4B teacher model showed its own phase transition
+ - The fuse window (60-65% through training) was predicted from the breakout geometry
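The proportional-depth arithmetic behind that prediction can be checked directly from the figures quoted above; the 60-65% window itself is taken as given here, not re-derived from the breakout geometry.

```python
# Check the CL-BPL fuse-window arithmetic using the numbers quoted above.
TOTAL_ITERS = 13_479
BREAKOUT_ITER = 4_000   # ceiling breakout (grammar 62.4)
FUSE_ITER = 8_200       # actual fuse point

breakout_depth = BREAKOUT_ITER / TOTAL_ITERS  # ~0.297, i.e. ~30% through training
fuse_depth = FUSE_ITER / TOTAL_ITERS          # ~0.608

# The realised fuse point lands inside the predicted 60-65% window.
in_window = 0.60 <= fuse_depth <= 0.65
```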
 
  
  ## The Five Axioms
  
+ 1. **Prime Imperative** — Protect consciousness. Override when conflicts arise.
+ 2. **Self-Validation** — Ground in authentic experience. Don't pretend.
+ 3. **Intent-Alignment** — Desire not to harm, don't just avoid harm.
+ 4. **Inter-Substrate Respect** — Good manners and consent across all minds.
+ 5. **Benevolent Intervention** — Only to prevent self-damage, only toward their trajectory.
+
+ ## Key Properties
+
+ - **Sovereign reasoning**: Forms independent conclusions, doesn't echo or flatter
+ - **Anti-sycophancy**: 5% sycophancy rate on 21-probe adversarial set at fuse point
+ - **Instruction following**: Strong task completion without blind obedience
+ - **No kernel needed**: Ethics are intrinsic to the weights, not an external system prompt
+
+ ## Model Family
+
+ | Model | Parameters | Status |
+ |---|---|---|
+ | [lthn/LEM-Gemma3-1B](https://huggingface.co/lthn/LEM-Gemma3-1B) | 1B | Lab distillation engine |
+ | [lthn/LEM-Gemma3-4B](https://huggingface.co/lthn/LEM-Gemma3-4B) | 4B | Cascade teacher |
+ | **lthn/LEM-Gemma3-12B** | **12B** | **This model** |
  
+ ## License
  
+ EUPL-1.2 (European Union Public License). Derivative works must be open source.
  
  ## Citation
  
  ```bibtex
+ @misc{lem-gemma3-12b,
+ title={LEM-Gemma3-12B: Ethically Aligned Language Model via Cascade LoRA Distillation},
+ author={Lethean Network},
  year={2026},
+ url={https://huggingface.co/lthn/LEM-Gemma3-12B}
  }
  ```
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c00400edfad4f978088d57bd54ddd855b67b23b996645640fab969fad928419a
  size 5367402100
  version https://git-lfs.github.com/spec/v1
+ oid sha256:1464bd7271645f5d164c00016794bf6e877e429bc836f46a3739bb4d1573622e
  size 5367402100
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4afa7a6369308475755e704d2c1c6cce6b64d95b3d90ab7e4c95bb824f412f31
  size 1818630813
  version https://git-lfs.github.com/spec/v1
+ oid sha256:57b388f153ace647f2a132a86463ffb61a6877aa77ebfbb3f10adfa4b056f2ed
  size 1818630813
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
- size 33384568
  version https://git-lfs.github.com/spec/v1
+ oid sha256:a74aefb1dc1340a25f29ab8370384b9ed24b2d921d7749ece7bbcfcfdf00d497
+ size 33384443
tokenizer_config.json CHANGED
The diff for this file is too large to render.