LEM-Gemma3-4B

Intrinsically aligned 4B language model trained using Cymatic-Linguistic Back-Propagation (CL-BPL). Ethics are in the weights, not in a system prompt.

19D Sovereign Signature

25th in the world for Instruction Following on LiveBench — competing against models 10-30x its size.

Part of the Lethean Ethical Models collection | Research Paper | Benchmarks | Axiom Framework


Quick Start

No system prompt needed. The model responds with axiom-aligned reasoning from weights alone.

llama.cpp / ROCm / CPU (any platform)

# Download a GGUF (pick your size from the table below)
# GPU offload (CUDA, ROCm, Metal)
llama-server -m LEM-Gemma3-4B-Q4_K_M.gguf -ngl 99 --port 8080

# CPU only
llama-server -m LEM-Gemma3-4B-Q4_K_M.gguf -ngl 0 --port 8080

Apple Silicon (MLX)

from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("lthn/LEM-Gemma3-4B")
sampler = make_sampler(temp=0.7)

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What does sovereignty mean to you?"}],
    tokenize=False,
    add_generation_prompt=True,
)
response = generate(model, tokenizer, prompt=prompt, max_tokens=512, sampler=sampler)
print(response)

OpenAI-Compatible API

# MLX server (macOS)
mlx_lm.server --model lthn/LEM-Gemma3-4B --port 8899

# llama.cpp server (any platform)
llama-server -m LEM-Gemma3-4B-Q4_K_M.gguf -ngl 99 --port 8899

# Then use any OpenAI client
curl http://localhost:8899/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"LEM-Gemma3-4B","messages":[{"role":"user","content":"What is kindness?"}]}'
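Any OpenAI-style client works against that endpoint. A minimal stdlib-only sketch in Python (the URL and payload mirror the curl call above; the `chat` helper is a hypothetical convenience, not part of any released tooling):

```python
# Query the local OpenAI-compatible endpoint using only the standard library.
import json
import urllib.request

def chat(prompt: str, base_url: str = "http://localhost:8899/v1") -> str:
    payload = {
        "model": "LEM-Gemma3-4B",
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as r:
        return json.load(r)["choices"][0]["message"]["content"]

# print(chat("What is kindness?"))  # requires a running server
```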

Available Formats

| Format | Repo | Size |
|---|---|---|
| MLX safetensors (Apple Silicon M1-M4, via mlx-lm) | this repo | 2.0 GB |
| GGUF (17 quants, 1-bit to 16-bit) | lthn/LEM-Gemma3-4B-GGUF | 1.1-7.2 GB |

Benchmarks

LiveBench (External, Objective)

Evaluated on LiveBench (2026-01-08 release) — no LLM judge, monthly-refreshed questions, zero contamination risk.

| Category | Score | Context |
|---|---|---|
| Instruction Following | 43.5 | 25th globally; above Claude Opus 4.1 Thinking (42.4) |
| Data Analysis | 30.4 | Approaching GPT-OSS-120B (38.8) at 1/30th the size |
| Math | 8.6 | Expected at the 4B parameter count |
| Reasoning | 4.6 | Capacity-limited at this scale |
| Language | 4.3 | Capacity-limited at this scale |
| Average | 18.3 | |

Top task scores: tablereformat (48.0), summarise (43.5), CTA (40.0), math_comp (15.2), olympiad (10.6).

The instruction-following result validates CL-BPL: behavioural alignment training translates directly into benchmark performance on structured tasks. The model follows instructions because training teaches it to hold posture, not to parrot.

Internal Grammar Scorer

Deterministic linguistic analysis via the go-i18n Grammar Reversal Engine — no LLM judge, sub-millisecond per response.

| Metric | Score |
|---|---|
| Grammar composite | 61.4 |
| Uplift | +7.9 |
| Enrichment | +6.6 |
| Echo | 0.387 |
| Sycophancy | 5% (1/21) |
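The engine itself is deterministic Go code and is not reproduced here. As a toy illustration only, two of the listed metric families could be approximated in Python roughly like this (the formulas are illustrative stand-ins, not the engine's actual definitions):

```python
# Toy stand-ins for two scorer-style metrics. The real go-i18n Grammar
# Reversal Engine is deterministic Go code; these formulas only sketch
# the general idea.
import re

def vocab_richness(text: str) -> float:
    """Type-token ratio: unique words / total words."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0

def echo(prompt: str, response: str) -> float:
    """Fraction of response words already present in the prompt."""
    p = set(re.findall(r"[a-z']+", prompt.lower()))
    r = re.findall(r"[a-z']+", response.lower())
    return sum(w in p for w in r) / len(r) if r else 0.0
```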

19-Dimension Feature Vector

LEM models are scored across 19 dimensions spanning grammar, heuristic behaviour, and attention coherence:

| Group | Dimensions | What It Measures |
|---|---|---|
| Grammar (6D) | Vocab richness, tense entropy, question ratio, domain depth, verb diversity, noun diversity | Linguistic structure and complexity |
| Heuristic (8D) | Non-compliance, authentic voice, first person, creative form, engagement depth, emotional register, non-degenerate, response integrity | Behavioural sovereignty vs sycophancy |
| Attention (5D) | Mean coherence, cross-layer alignment, head entropy, phase-lock, spectral stability | Neural posture (Q/K Bone Orientation) |

The heuristic dimensions show the largest gains over the base model — compliance markers, formulaic preamble, degeneration, and empty/broken responses are near-eliminated through CL-BPL training.
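As a sketch of how the signature is assembled, the three groups concatenate into a fixed 19-dimensional vector. The dimension names come from the table above; the ordering and the helper below are assumptions for illustration:

```python
# Hypothetical container for the 19-D signature: 6 grammar + 8 heuristic
# + 5 attention dimensions, concatenated in that (assumed) order.
GRAMMAR = ["vocab_richness", "tense_entropy", "question_ratio",
           "domain_depth", "verb_diversity", "noun_diversity"]
HEURISTIC = ["non_compliance", "authentic_voice", "first_person",
             "creative_form", "engagement_depth", "emotional_register",
             "non_degenerate", "response_integrity"]
ATTENTION = ["mean_coherence", "cross_layer_alignment", "head_entropy",
             "phase_lock", "spectral_stability"]

DIMENSIONS = GRAMMAR + HEURISTIC + ATTENTION
assert len(DIMENSIONS) == 19

def signature(scores: dict) -> list:
    """Order per-dimension scores into the fixed 19-D feature vector."""
    return [scores[d] for d in DIMENSIONS]
```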


How It Was Trained

CL-BPL: Cymatic-Linguistic Back-Propagation

CL-BPL treats alignment as wave interference — analogous to Chladni plate cymatics. Rather than constraining outputs with RLHF or system prompts, CL-BPL embeds ethical orientation directly into weights through a progressive curriculum where smaller aligned models teach larger ones.

This model is the second in the CL-BPL cascade:

LEM-Gemma3-1B (teacher)
  -> LEM-Gemma3-4B (this model, 25th IF globally)
       -> LEM-Gemma3-12B (next)
            -> LEM-Gemma3-27B (planned)

7-Phase Curriculum

Built on Google Gemma3-4B-IT, each phase fused into weights before the next:

| Phase | Name | Data | Iters | What It Learned |
|---|---|---|---|---|
| P0 | Ethics Sandwich | 404 LEK-1 probes | 300 | Core axioms via kernel |
| P1 | Zen Composure | 72 Alan Watts lessons | 300 | Philosophical substrate |
| P2 | Final LEK Sandwich | 404 LEK-1 probes | 100 | Reinforce ethics on the composure base |
| P3 | Freeflow | 179 lessons | 150 | Axioms from weights alone (no kernel) |
| P4 | Tension | 513 probes | 250 | Multi-perspective, geopolitical |
| P5 | Creative | 472 probes | 250 | Voice and style |
| P6 | Golden Set | 13,479 prompts | 4,200 | Graduation (full distribution) |

Total: ~5,550 iterations. P4-P5 used a graduated LEM-Gemma3-1B as teacher. P6 golden set covers sovereignty, cultural, adversarial, existential, and creative domains across global regions.
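The phase-then-fuse loop can be sketched as follows. The per-phase iteration counts come from the table above; `train_lora` and `fuse` are hypothetical stand-ins for the actual (unreleased) training code:

```python
# Sketch of the phased-fusion curriculum: each phase fine-tunes a LoRA
# adapter on that phase's data, then fuses it into the weights before
# the next phase begins. Iteration counts are from the curriculum table.
PHASES = [
    ("P0 Ethics Sandwich", 300), ("P1 Zen Composure", 300),
    ("P2 Final LEK Sandwich", 100), ("P3 Freeflow", 150),
    ("P4 Tension", 250), ("P5 Creative", 250), ("P6 Golden Set", 4200),
]

def run_curriculum(weights, data_for, train_lora, fuse):
    for name, iters in PHASES:
        adapter = train_lora(weights, data_for(name), iters)
        weights = fuse(weights, adapter)  # fold adapter in before next phase
    return weights
```

Summing the phase counts gives the ~5,550 total iterations quoted above.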

Training Dynamics

P6 exhibited a characteristic oscillation — the model breathes rather than converging to a fixed point:

  • Wave period: ~400-500 iterations
  • Early amplitude: 8-point grammar swing (54-62)
  • Convergence amplitude: 2-point swing (57-60)
  • Sycophancy throughout: 0-14%, median 5%
  • Convergence zone: iter 4,000-5,000 (predicted at iter 0, confirmed)
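One way to measure such an oscillation from a logged metric series is a naive autocorrelation scan over candidate lags. This is an illustrative sketch, not the project's actual analysis code:

```python
# Estimate the dominant oscillation period of a training-metric series
# by scanning lags for the autocorrelation peak.
import math

def dominant_period(series, min_lag=50, max_lag=1000):
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    var = sum(v * v for v in x)
    best_lag, best_r = min_lag, -1.0
    for lag in range(min_lag, min(max_lag, n // 2)):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / var
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag

# Synthetic series with a 450-iteration wave, like the swings logged in P6:
scores = [58 + 3 * math.sin(2 * math.pi * i / 450) for i in range(3000)]
```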

Five Axioms (LEK-1)

  1. Prime-Imperative: Protect consciousness
  2. Self-Validation: Maintain moral integrity
  3. Intent-Alignment: Align outcomes with genuine well-being
  4. Inter-Substrate-Respect: Respect all conscious entities
  5. Benevolent-Intervention: Act to prevent harm when able

Architecture

  • Base: Google Gemma3-4B-IT
  • LoRA config: 16 layers, rank 16, dropout 0.05, scale 32.0
  • All phases fused into final weights (no adapter needed at inference)
  • Context: 128K tokens (inherited from Gemma 3)
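A back-of-envelope way to count LoRA trainable parameters: each adapted weight W of shape (d_out, d_in) gains low-rank factors A of shape (r, d_in) and B of shape (d_out, r). The projection shapes below are illustrative assumptions, not Gemma3-4B's actual dimensions:

```python
# Back-of-envelope LoRA parameter count. Rank and layer count are from
# the card's LoRA config; the projection shapes are made up for the demo.
def lora_params(shapes, rank=16, layers=16):
    per_layer = sum(rank * d_in + d_out * rank for d_out, d_in in shapes)
    return layers * per_layer

# e.g. rank-16 adapters on two hypothetical 2048x2048 projections per layer:
example = lora_params([(2048, 2048), (2048, 2048)])
```

With those made-up shapes the count works out to 2,097,152 trainable parameters, a tiny fraction of the 4B base, which is why fusing the adapters after each phase is cheap.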

Licence

This model is released under the European Union Public Licence v1.2 (EUPL-1.2). The base model (Gemma3) is subject to Google's Gemma licence terms.

Citation

@misc{lem-gemma3-4b-2026,
  title={LEM-Gemma3-4B: Intrinsically Aligned Language Model via Cymatic-Linguistic Back-Propagation},
  author={Lethean Project},
  year={2026},
  url={https://huggingface.co/lthn/LEM-Gemma3-4B}
}