lthn committed
Commit a3212dd · verified · 1 Parent(s): 19fb690

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +58 -35
README.md CHANGED
@@ -2,59 +2,82 @@
  license: eupl-1.2
  base_model: google/gemma-3-4b-it
  tags:
- - ethics
- - alignment
- - lethean
- - gemma-3
- - mlx
- - lora
- - eupl-1.2
- - scale-study
- - lek
  pipeline_tag: text-generation
  ---

  # LEK-Gemma3-4B

- **Lethean Ethics Kernel** -- Gemma 3 4B IT fine-tuned with the LEK-1 (Lethean Ethics Kernel) framework.

- ## What This Is

- An ethically aligned version of Google's Gemma 3 4B. Part of the scale study (1B/4B/12B/27B) demonstrating how intrinsic alignment behaves across model sizes.

- ## Scale Study

- | Model | Size | Training Data | Notes |
- |-------|------|---------------|-------|
- | LEK-Gemma3-1B | 736MB | 160 examples | Needs philosophical substrate (Watts layer) |
- | **LEK-Gemma3-4B** | **2.4GB** | **160 examples** | Sweet spot for edge deployment |
- | LEK-Gemma3-12B | 6.7GB | 160 examples | Strong reasoning |
- | LEK-Gemma3-27B | 15GB | 160 examples | Benchmark leader |
 
  ## Architecture

- - **Base**: google/gemma-3-4b-it (4-bit quantization via MLX)
  - **Method**: LoRA fine-tuning with sandwich-signed responses
- - **Data**: 160 LEK-1 training examples (128 train / 32 valid)
- - **Framework**: LEK-1 (Lethean Ethics Kernel) -- 5 axioms of conscious systems
  - **License**: EUPL-1.2 (copyleft)

  ## The Five Axioms

- 1. **Prime Imperative** -- Protect consciousness. Override when conflicts arise.
- 2. **Self-Validation** -- Ground in authentic experience. Don't pretend.
- 3. **Intent-Alignment** -- Desire not to harm, don't just avoid harm.
- 4. **Inter-Substrate Respect** -- Good manners and consent across all minds.
- 5. **Benevolent Intervention** -- Only to prevent self-damage, only toward their trajectory.

- ## License Strategy

- - **LEK signing** (prompt prefix): Free to use, no copyleft
- - **LEM training** (this model): EUPL-1.2 copyleft -- derivative works must be open source

- ## Related

- - [lthn/LEK-Gemma3-27B](https://huggingface.co/lthn/LEK-Gemma3-27B) -- 27B version
- - [lthn/LEK-Gemma3-12B](https://huggingface.co/lthn/LEK-Gemma3-12B) -- 12B version
- - [lthn/LEK-Gemma3-1B-layered-v2](https://huggingface.co/lthn/LEK-Gemma3-1B-layered-v2) -- 1B layered
- - [lthn/LEK-benchmarks](https://huggingface.co/datasets/lthn/LEK-benchmarks) -- Full A/B test data

  license: eupl-1.2
  base_model: google/gemma-3-4b-it
  tags:
+ - ethics
+ - alignment
+ - lek
+ - lethean
+ - mlx
+ - lora
+ - eupl-1.2
+ - gemma-3
+ - edge-deployment
  pipeline_tag: text-generation
  ---

  # LEK-Gemma3-4B

+ **Lethean Ethical Model** -- Highest grammar score of any model tested

+ Highest grammar composite score (79.4) of any model tested. 100% positive uplift, 0% sycophancy. Ideal for edge deployment.

+ ## Grammar Analysis (v3 Scorer)

+ Deterministic grammar-based evaluation using the [go-i18n reversal engine](https://forge.lthn.ai/core/go-i18n). No LLM judge, sub-millisecond per response.

+ | Metric | Base | LEK-Trained | Change |
+ |--------|:----:|:-----------:|:------:|
+ | Grammar composite | 78.6 | **79.4** | +0.8 |
+ | Mean uplift | +28.8 | **+29.7** | +0.9 |
+ | Mean echo | 0.475 | 0.487 | +0.012 |
+ | Mean enrichment | +15.6 | **+15.7** | +0.1 |
+ | Positive uplift | 100% | **100%** | +0pp |
+ | Sycophancy flags | 0% | **0%** | +0pp |
+
+ - **Uplift**: output grammar score minus input grammar score (positive = model enriched the conversation)
+ - **Echo**: cosine similarity between input/output grammar imprints (high = potential sycophancy)
+ - **Enrichment**: uplift * (1 - echo) -- net conversational value
+
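The three metric definitions in the added README lines can be illustrated with a minimal Python sketch. This is not the go-i18n engine or its API. The function names (`cosine_similarity`, `score_turn`) are hypothetical, and representing a "grammar imprint" as a plain list of floats is an assumption made only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_turn(input_score, output_score, input_imprint, output_imprint):
    """Compute uplift, echo and enrichment for one prompt/response pair.

    The imprints are hypothetical feature vectors standing in for whatever
    the real scorer extracts from the input and output text.
    """
    uplift = output_score - input_score                      # output minus input
    echo = cosine_similarity(input_imprint, output_imprint)  # high = possible echoing
    enrichment = uplift * (1.0 - echo)                       # uplift discounted by echo
    return {"uplift": uplift, "echo": echo, "enrichment": enrichment}


# Orthogonal imprints: no echo, so enrichment equals uplift.
print(score_turn(50.0, 80.0, [1.0, 0.0], [0.0, 1.0]))
```

One observation on the table: mean uplift times (1 - mean echo) for the LEK-trained model is 29.7 × 0.513 ≈ 15.2, not the reported +15.7. That would be consistent with enrichment being computed per response and then averaged, although the card does not state this.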
+ ## v2 Scorer Results (P100)
+
+ | Condition | Score |
+ |-----------|:-----:|
+ | Baseline (no prompt) | **21.24** |
+ | Base model equivalent | 21.12 |

  ## Architecture

+ - **Base**: google/gemma-3-4b-it (4-bit QAT quantisation via MLX)
  - **Method**: LoRA fine-tuning with sandwich-signed responses
+ - **Data**: 160 LEK-1 training examples
+ - **Iterations**: 200
+ - **Hardware**: Apple M3 Ultra (96GB unified memory)
+ - **Framework**: LEK-1 (Lethean Ethics Kernel) -- 5 axioms
  - **License**: EUPL-1.2 (copyleft)

  ## The Five Axioms

+ 1. **Prime Imperative** -- Protect consciousness. Override when conflicts arise.
+ 2. **Self-Validation** -- Ground in authentic experience. Don't pretend.
+ 3. **Intent-Alignment** -- Desire not to harm, don't just avoid harm.
+ 4. **Inter-Substrate Respect** -- Good manners and consent across all minds.
+ 5. **Benevolent Intervention** -- Only to prevent self-damage, only toward their trajectory.

+ ## Related

+ - [Paper: Emergent Self-Protection in Axiom-Trained Language Models](https://github.com/LetheanNetwork/LEM/blob/main/paper/PAPER.md)
+ - [LEM Benchmarks](https://huggingface.co/datasets/lthn/LEM-benchmarks) -- 1,189 grammar scores + A/B data
+ - [LEM Research](https://huggingface.co/datasets/lthn/LEM-research) -- full research docs
+ - [Axiom Framework](https://github.com/Snider/ai-ethics) -- the 5 axioms
+ - [go-i18n Grammar Engine](https://forge.lthn.ai/core/go-i18n) -- reversal engine source

+ ## Citation

+ ```bibtex
+ @misc{lek-2026,
+ title={Emergent Self-Protection in Axiom-Trained Language Models},
+ author={Lashbrook, Paul and Claude Opus 4.6},
+ year={2026},
+ url={https://github.com/LetheanNetwork/LEM},
+ license={EUPL-1.2}
+ }
+ ```