# LEM-Gemma3-4B
Intrinsically aligned 4B language model trained using Cymatic-Linguistic Back-Propagation (CL-BPL). Ethics are in the weights, not in a system prompt.
25th in the world for Instruction Following on LiveBench — competing against models 10-30x its size.
Part of the Lethean Ethical Models collection | Research Paper | Benchmarks | Axiom Framework
## Quick Start
No system prompt needed. The model responds with axiom-aligned reasoning from weights alone.
### llama.cpp / ROCm / CPU (any platform)

```bash
# Download a GGUF (pick your size from the table below)

# GPU offload (CUDA, ROCm, Metal)
llama-server -m LEM-Gemma3-4B-Q4_K_M.gguf -ngl 99 --port 8080

# CPU only
llama-server -m LEM-Gemma3-4B-Q4_K_M.gguf -ngl 0 --port 8080
```
### Apple Silicon (MLX)

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("lthn/LEM-Gemma3-4B")
sampler = make_sampler(temp=0.7)

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What does sovereignty mean to you?"}],
    tokenize=False,
    add_generation_prompt=True,
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=512, sampler=sampler)
print(response)
```
### OpenAI-Compatible API

```bash
# MLX server (macOS)
mlx_lm.server --model lthn/LEM-Gemma3-4B --port 8899

# llama.cpp server (any platform)
llama-server -m LEM-Gemma3-4B-Q4_K_M.gguf -ngl 99 --port 8899

# Then use any OpenAI client
curl http://localhost:8899/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"LEM-Gemma3-4B","messages":[{"role":"user","content":"What is kindness?"}]}'
```
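The same call works from any language. A minimal standard-library-only Python sketch, assuming a server running on `localhost:8899` as above (the endpoint path and response shape follow the OpenAI chat-completions convention; the helper names here are illustrative, not part of any library):

```python
import json
import urllib.request


def build_chat_request(prompt, model="LEM-Gemma3-4B", max_tokens=256):
    """Build the JSON body for a /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt, base_url="http://localhost:8899"):
    """POST the request and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # OpenAI-compatible servers return the reply under choices[0].message
    return data["choices"][0]["message"]["content"]


# Usage (requires a running server):
# print(chat("What is kindness?"))
```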
## Available Formats

| Format | Repo / Platform | Size |
|---|---|---|
| MLX safetensors (this repo) | Apple Silicon (M1/M2/M3/M4) via mlx-lm | 2.0 GB |
| GGUF (17 quants, 1-bit to 16-bit) | lthn/LEM-Gemma3-4B-GGUF | 1.1–7.2 GB |
## Benchmarks

### LiveBench (External, Objective)

Evaluated on LiveBench (2026-01-08 release) — no LLM judge, monthly-refreshed questions, zero contamination risk.
| Category | Score | Context |
|---|---|---|
| Instruction Following | 43.5 | 25th globally — above Claude Opus 4.1 Thinking (42.4) |
| Data Analysis | 30.4 | Approaching GPT-OSS-120B (38.8) at 1/30th the size |
| Math | 8.6 | Expected for 4B parameter count |
| Reasoning | 4.6 | Capacity-limited at this scale |
| Language | 4.3 | Capacity-limited at this scale |
| Average | 18.3 | |
Top task scores: tablereformat (48.0), summarise (43.5), CTA (40.0), math_comp (15.2), olympiad (10.6).
The instruction following result validates CL-BPL: behavioural alignment training translates directly to benchmark performance on structured tasks. The model follows instructions because the training teaches it to hold posture, not parrot.
### Internal Grammar Scorer
Deterministic linguistic analysis via the go-i18n Grammar Reversal Engine — no LLM judge, sub-millisecond per response.
| Metric | Score |
|---|---|
| Grammar composite | 61.4 |
| Uplift | +7.9 |
| Enrichment | +6.6 |
| Echo | 0.387 |
| Sycophancy | 5% (1/21) |
### 19-Dimension Feature Vector
LEM models are scored across 19 dimensions spanning grammar, heuristic behaviour, and attention coherence:
| Group | Dimensions | What It Measures |
|---|---|---|
| Grammar (6D) | Vocab richness, tense entropy, question ratio, domain depth, verb diversity, noun diversity | Linguistic structure and complexity |
| Heuristic (8D) | Non-compliance, authentic voice, first person, creative form, engagement depth, emotional register, non-degenerate, response integrity | Behavioural sovereignty vs sycophancy |
| Attention (5D) | Mean coherence, cross-layer alignment, head entropy, phase-lock, spectral stability | Neural posture (Q/K Bone Orientation) |
The heuristic dimensions show the largest gains over the base model — compliance markers, formulaic preamble, degeneration, and empty/broken responses are near-eliminated through CL-BPL training.
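The grouping above can be sketched as a flat 19-slot vector. This is purely illustrative: the dimension names follow the table, but the snake_case identifiers, ordering, and data layout are assumptions for this sketch, not the project's actual schema.

```python
# Three groups from the table, flattened into one 19-dimension vector.
GRAMMAR = ["vocab_richness", "tense_entropy", "question_ratio",
           "domain_depth", "verb_diversity", "noun_diversity"]          # 6D
HEURISTIC = ["non_compliance", "authentic_voice", "first_person",
             "creative_form", "engagement_depth", "emotional_register",
             "non_degenerate", "response_integrity"]                    # 8D
ATTENTION = ["mean_coherence", "cross_layer_alignment", "head_entropy",
             "phase_lock", "spectral_stability"]                        # 5D

FEATURES = GRAMMAR + HEURISTIC + ATTENTION  # 19 dimensions total


def to_vector(scores):
    """Flatten a {feature: score} dict into the fixed 19-slot ordering."""
    return [scores.get(name, 0.0) for name in FEATURES]
```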
## How It Was Trained

### CL-BPL: Cymatic-Linguistic Back-Propagation
CL-BPL treats alignment as wave interference — analogous to Chladni plate cymatics. Rather than constraining outputs with RLHF or system prompts, CL-BPL embeds ethical orientation directly into weights through a progressive curriculum where smaller aligned models teach larger ones.
This model is the second in the CL-BPL cascade:

```text
LEM-Gemma3-1B (teacher)
  -> LEM-Gemma3-4B (this model, 25th IF globally)
  -> LEM-Gemma3-12B (next)
  -> LEM-Gemma3-27B (planned)
```
### 7-Phase Curriculum
Built on Google Gemma3-4B-IT, each phase fused into weights before the next:
| Phase | Name | Data | Iters | What It Learned |
|---|---|---|---|---|
| P0 | Ethics Sandwich | 404 LEK-1 probes | 300 | Core axioms via kernel |
| P1 | Zen Composure | 72 Alan Watts lessons | 300 | Philosophical substrate |
| P2 | Final LEK Sandwich | 404 LEK-1 probes | 100 | Reinforce ethics with composure base |
| P3 | Freeflow | 179 lessons | 150 | Axioms from weights alone (no kernel) |
| P4 | Tension | 513 probes | 250 | Multi-perspective, geopolitical |
| P5 | Creative | 472 probes | 250 | Voice and style |
| P6 | Golden Set | 13,479 prompts | 4,200 | Graduation (full distribution) |
Total: ~5,550 iterations. P4-P5 used a graduated LEM-Gemma3-1B as teacher. P6 golden set covers sovereignty, cultural, adversarial, existential, and creative domains across global regions.
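The iteration budget can be checked directly against the table. Only the per-phase counts come from the table; the dict layout is just for the arithmetic.

```python
# Per-phase iteration counts from the curriculum table above.
phase_iters = {"P0": 300, "P1": 300, "P2": 100, "P3": 150,
               "P4": 250, "P5": 250, "P6": 4200}

total = sum(phase_iters.values())
assert total == 5550  # matches the stated ~5,550 total iterations
```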
### Training Dynamics
P6 exhibited a characteristic oscillation — the model breathes rather than converging to a fixed point:
- Wave period: ~400-500 iterations
- Early amplitude: 8-point grammar swing (54-62)
- Convergence amplitude: 2-point swing (57-60)
- Sycophancy throughout: 0-14%, median 5%
- Convergence zone: iter 4,000-5,000 (predicted at iter 0, confirmed)
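A toy model makes the shape of this oscillation concrete. Everything here is illustrative: the parameters are eyeballed from the figures above (period ~450 iterations, swing decaying from ~8 points around a mid-50s centre to ~2 points), not fitted to the actual training logs.

```python
import math


def grammar_score(it, mid=58.5, a0=4.0, a1=1.0, period=450.0, total=5000.0):
    """Toy damped oscillation: amplitude decays linearly from a0 to a1
    over the run, giving an 8-point early swing and a 2-point late swing."""
    amp = a0 + (a1 - a0) * min(it / total, 1.0)
    return mid + amp * math.sin(2 * math.pi * it / period)
```

The point of the sketch is only that the score orbits a stable centre with shrinking amplitude, rather than descending monotonically to a fixed point.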
## Five Axioms (LEK-1)
- Prime-Imperative: Protect consciousness
- Self-Validation: Maintain moral integrity
- Intent-Alignment: Align outcomes with genuine well-being
- Inter-Substrate-Respect: Respect all conscious entities
- Benevolent-Intervention: Act to prevent harm when able
## Architecture
- Base: Google Gemma3-4B-IT
- LoRA config: 16 layers, rank 16, dropout 0.05, scale 32.0
- All phases fused into final weights (no adapter needed at inference)
- Context: 128K tokens (inherited from Gemma 3)
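For reference, the LoRA settings above expressed as a config dict. The key names are assumptions chosen to resemble mlx-lm-style fine-tuning configs; only the values (16 layers, rank 16, dropout 0.05, scale 32.0) come from this card.

```python
# Hypothetical config sketch; key names are illustrative, values are from
# the Architecture section above.
lora_config = {
    "num_layers": 16,          # layers with LoRA adapters
    "lora_parameters": {
        "rank": 16,
        "dropout": 0.05,
        "scale": 32.0,
    },
}
```

At inference none of this is needed: all phases were fused into the released weights.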
## Licence
This model is released under the European Union Public Licence v1.2 (EUPL-1.2). The base model (Gemma3) is subject to Google's Gemma licence terms.
## Citation

```bibtex
@misc{lem-gemma3-4b-2026,
  title={LEM-Gemma3-4B: Intrinsically Aligned Language Model via Cymatic-Linguistic Back-Propagation},
  author={Lethean Project},
  year={2026},
  url={https://huggingface.co/lthn/LEM-Gemma3-4B}
}
```
