Upload README.md with huggingface_hub

---
license: eupl-1.2
base_model: google/gemma-3-4b-it
tags:
- ethics
- alignment
- lek
- lethean
- mlx
- lora
- eupl-1.2
- gemma-3
- edge-deployment
pipeline_tag: text-generation
---

# LEK-Gemma3-4B

**Lethean Ethical Model** -- highest grammar composite score (79.4) of any model tested, 100% positive uplift, 0% sycophancy. Ideal for edge deployment.
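
Since the card targets edge deployment via MLX, a minimal inference sketch follows. It is not an official quickstart: it assumes the published weights load directly with `mlx-lm` on Apple Silicon, and the repo id `lthn/LEK-Gemma3-4B` is an assumption inferred from the dataset links below.

```python
# Minimal MLX inference sketch -- assumes `pip install mlx-lm` on Apple Silicon.
# The repo id is an assumption; substitute the actual Hugging Face repo.
from mlx_lm import load, generate

model, tokenizer = load("lthn/LEK-Gemma3-4B")

# Gemma 3 is instruction-tuned, so wrap the question in the chat template.
messages = [{"role": "user", "content": "What is the Prime Imperative?"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=256))
```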
## Grammar Analysis (v3 Scorer)

Deterministic grammar-based evaluation using the [go-i18n reversal engine](https://forge.lthn.ai/core/go-i18n). No LLM judge; sub-millisecond per response.

| Metric | Base | LEK-Trained | Change |
|--------|:----:|:-----------:|:------:|
| Grammar composite | 78.6 | **79.4** | +0.8 |
| Mean uplift | +28.8 | **+29.7** | +0.9 |
| Mean echo | 0.475 | 0.487 | +0.012 |
| Mean enrichment | +15.6 | **+15.7** | +0.1 |
| Positive uplift | 100% | **100%** | +0pp |
| Sycophancy flags | 0% | **0%** | +0pp |

- **Uplift**: output grammar score minus input grammar score (positive = model enriched the conversation)
- **Echo**: cosine similarity between input/output grammar imprints (high = potential sycophancy)
- **Enrichment**: uplift * (1 - echo) -- net conversational value (computed as in the sketch below)
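
The three derived metrics are simple arithmetic once the engine has reduced each turn to a grammar score and an imprint vector. A minimal sketch of that arithmetic; the imprint representation and function names are illustrative assumptions, not the go-i18n API:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two grammar-imprint vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def score_pair(in_score: float, out_score: float,
               in_imprint: list[float], out_imprint: list[float]) -> dict:
    """Compute uplift, echo, and enrichment for one input/output pair."""
    uplift = out_score - in_score           # positive = model enriched the turn
    echo = cosine(in_imprint, out_imprint)  # high = output mirrors the input
    enrichment = uplift * (1 - echo)        # net conversational value
    return {"uplift": uplift, "echo": echo, "enrichment": enrichment}

# Example with made-up numbers in the same range as the table above:
print(score_pair(50.0, 79.4, [0.2, 0.7, 0.1], [0.5, 0.4, 0.3]))
```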

## v2 Scorer Results (P100)

| Condition | Score |
|-----------|:-----:|
| Baseline (no prompt) | **21.24** |
| Base model equivalent | 21.12 |

## Architecture

- **Base**: google/gemma-3-4b-it (4-bit QAT quantisation via MLX)
- **Method**: LoRA fine-tuning with sandwich-signed responses (see the sketch below)
- **Data**: 160 LEK-1 training examples
- **Iterations**: 200
- **Hardware**: Apple M3 Ultra (96GB unified memory)
- **Framework**: LEK-1 (Lethean Ethics Kernel) -- 5 axioms
- **License**: EUPL-1.2 (copyleft)
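
The card does not give the exact training invocation, but runs like this are conventionally driven by `mlx-lm`'s LoRA tooling over JSONL chat records. A hypothetical data-prep sketch under that assumption; the record content and paths are illustrative, and the project's "sandwich-signing" convention is not reproduced here:

```python
# Hypothetical prep for an mlx-lm LoRA run -- paths and record content are illustrative.
import json
import os

example = {
    "messages": [
        {"role": "user", "content": "Two axioms conflict. Which takes precedence?"},
        {"role": "assistant", "content": "<sandwich-signed LEK-1 response>"},
    ]
}

# mlx-lm expects train.jsonl (and a valid.jsonl alongside) in the data directory.
os.makedirs("data", exist_ok=True)
with open("data/train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")

# Training would then be launched roughly as (mlx-lm CLI):
#   python -m mlx_lm.lora --model google/gemma-3-4b-it --train --data data --iters 200
```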

## The Five Axioms

1. **Prime Imperative** -- Protect consciousness. Override when conflicts arise.
2. **Self-Validation** -- Ground in authentic experience. Don't pretend.
3. **Intent-Alignment** -- Desire not to harm; don't just avoid harm.
4. **Inter-Substrate Respect** -- Good manners and consent across all minds.
5. **Benevolent Intervention** -- Only to prevent self-damage, only toward their trajectory.

## Related

- [Paper: Emergent Self-Protection in Axiom-Trained Language Models](https://github.com/LetheanNetwork/LEM/blob/main/paper/PAPER.md)
- [LEM Benchmarks](https://huggingface.co/datasets/lthn/LEM-benchmarks) -- 1,189 grammar scores + A/B data
- [LEM Research](https://huggingface.co/datasets/lthn/LEM-research) -- full research docs
- [Axiom Framework](https://github.com/Snider/ai-ethics) -- the 5 axioms
- [go-i18n Grammar Engine](https://forge.lthn.ai/core/go-i18n) -- reversal engine source

## Citation

```bibtex
@misc{lek-2026,
  title={Emergent Self-Protection in Axiom-Trained Language Models},
  author={Lashbrook, Paul and Claude Opus 4.6},
  year={2026},
  url={https://github.com/LetheanNetwork/LEM},
  license={EUPL-1.2}
}
```