Snider Virgil committed on
Commit d988e07 · 1 Parent(s): 0070029

feat: merge LEK into lemer weights

- Merge LEK LoRA adapter into Gemma 4 E2B base via PEFT, preserve
KV-shared layer weights from original Google safetensors so
mlx_vlm can load the full 2011-tensor multimodal checkpoint
- Regenerate all 6 GGUF variants (bf16, q8_0, q6_k, q5_k_m, q4_k_m,
q3_k_m) from the merged safetensors via convert_hf_to_gguf.py
- Regenerate MLX Q4 multimodal safetensors; bundle in this repo
as model.safetensors alongside GGUFs
- README: update Roadmap to reflect LEK shipped (not planned);
fix MLX Server example to use mlx_vlm.server (multimodal-aware)
- Base fork of unmodified Google weights remains at
LetheanNetwork/lemer for users who want the raw bf16
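The LoRA merge in the first bullet can be illustrated with a toy weight fold. This is a minimal NumPy sketch; the real merge used PEFT on the Gemma 4 E2B weights, and the shapes, rank, and alpha here are all illustrative:

```python
# Toy illustration of folding a LoRA delta into a base weight matrix.
# Shapes and scaling are illustrative; the actual merge was done via PEFT.
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 16, 4, 8                  # hidden size, LoRA rank, LoRA alpha (assumed)
W = rng.standard_normal((d, d))          # base weight
A = rng.standard_normal((r, d)) * 0.01   # LoRA down-projection
B = rng.standard_normal((d, r)) * 0.01   # LoRA up-projection

# Merge: fold the scaled low-rank delta into the base weight once,
# so inference needs a single standalone model with no adapter runtime.
W_merged = W + (alpha / r) * (B @ A)

# The merged weight reproduces base-plus-adapter output exactly.
x = rng.standard_normal(d)
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

Because the delta is folded in once, the merged checkpoint loads like any ordinary safetensors file, which is why the KV-shared layers had to be carried over from the original Google weights before export.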

Co-Authored-By: Virgil <virgil@lethean.io>

README.md CHANGED
@@ -21,7 +21,11 @@ tags:
21
  - on-device
22
  - conversational
23
  ---
24
-
 
 
 
 
25
  # Lemer — Gemma 4 E2B
26
 
27
  The smallest member of the [Lemma model family](https://huggingface.co/collections/lthn/lemma) by [Lethean](https://lthn.ai). An EUPL-1.2 fork of [Gemma 4 E2B](https://huggingface.co/google/gemma-4-E2B-it), prepared as the base for the Lethean Ethical Model (LEM) adapter. GGUF and MLX builds with full multimodal support — text, image, and audio — distributed from a single repo. Use GGUF with Ollama, llama.cpp, GPT4All, or LM Studio. Use MLX safetensors with `mlx-lm` and `mlx-vlm` for native Apple Silicon inference.
@@ -221,8 +225,10 @@ print(output.text)
221
  <details>
222
  <summary>MLX Server</summary>
223
 
 
 
224
  ```bash
225
- mlx_lm.server --model lthn/lemer
226
  ```
227
 
228
  ```bash
@@ -378,16 +384,16 @@ The model `id` should match what `mlx_lm.server` reports at `/v1/models`.
378
 
379
  ## Roadmap
380
 
381
- This release of `lemer` is the **base distribution** — Gemma 4 E2B with full multimodal support, EUPL-1.2 licensing, and Lethean's packaging for the Apple Silicon ecosystem. The Lethean Ethical Model (LEM) layer is delivered separately as a LoRA adapter, applied on top of this base.
382
 
383
  | Phase | Status | What it adds |
384
  |-------|--------|--------------|
385
- | **Base distribution** (this repo) | ✅ Released | EUPL-1.2 fork, multimodal quants, native MLX + GGUF |
386
- | **LEK-1 adapter** | 🚧 In progress | Lethean Ethics Kernel via LoRA intrinsic axiom-based reasoning |
387
  | **8-PAC eval results** | 🚧 In progress | Continuous benchmarking on the homelab, published to [lthn/LEM-benchmarks](https://huggingface.co/datasets/lthn/LEM-benchmarks) |
388
- | **Lemma family roll-out** | Planned | Same treatment for `lemma`, `lemmy`, `lemrd` |
389
 
390
- The LEK-1 adapter adds intrinsic ethical alignment axiom-based reasoning baked into the weights via LoRA finetune. Track progress at [LetheanNetwork](https://github.com/LetheanNetwork) and the [LEM-research dataset](https://huggingface.co/datasets/lthn/LEM-research).
391
 
392
  ## Why EUPL-1.2
393
 
 
21
  - on-device
22
  - conversational
23
  ---
24
+ <!--
25
+ This content is subject to the European Union Public Licence (EUPL-1.2).
26
+ For full licence details, please refer to: https://huggingface.co/lthn/lemer/tree/main/LICENSE
27
+ Origin URL: https://huggingface.co/lthn/lemer/tree/main
28
+ -->
29
  # Lemer — Gemma 4 E2B
30
 
31
  The smallest member of the [Lemma model family](https://huggingface.co/collections/lthn/lemma) by [Lethean](https://lthn.ai). An EUPL-1.2 fork of [Gemma 4 E2B](https://huggingface.co/google/gemma-4-E2B-it), prepared as the base for the Lethean Ethical Model (LEM) adapter. GGUF and MLX builds with full multimodal support — text, image, and audio — distributed from a single repo. Use GGUF with Ollama, llama.cpp, GPT4All, or LM Studio. Use MLX safetensors with `mlx-lm` and `mlx-vlm` for native Apple Silicon inference.
 
225
  <details>
226
  <summary>MLX Server</summary>
227
 
228
+ `lemer` is multimodal, so use `mlx_vlm.server` — the vision-aware variant that handles image and audio inputs. The text-only `mlx_lm.server` does not correctly route multimodal tensors for Gemma 4.
229
+
230
  ```bash
231
+ mlx_vlm.server --model lthn/lemer
232
  ```
233
 
234
  ```bash
 
384
 
385
  ## Roadmap
386
 
387
+ This release of `lemer` is **Gemma 4 E2B with the Lethean Ethical Kernel (LEK) merged in** — axiom-based reasoning baked into the attention weights via LoRA finetune, then merged into the base so inference uses a single standalone model with no PEFT runtime required. The unmodified Gemma 4 E2B fork lives at [LetheanNetwork/lemer](https://huggingface.co/LetheanNetwork/lemer) for users who want the raw Google weights without the LEK shift.
388
 
389
  | Phase | Status | What it adds |
390
  |-------|--------|--------------|
391
+ | **Base fork** ([LetheanNetwork/lemer](https://huggingface.co/LetheanNetwork/lemer)) | ✅ Released | EUPL-1.2 fork of Gemma 4 E2B unmodified Google weights |
392
+ | **LEK merged** (this repo) | ✅ Released | Lethean Ethical Kernel — axiom-based reasoning via LoRA merge |
393
  | **8-PAC eval results** | 🚧 In progress | Continuous benchmarking on the homelab, published to [lthn/LEM-benchmarks](https://huggingface.co/datasets/lthn/LEM-benchmarks) |
394
+ | **Lemma family roll-out** | Planned | Same LEK treatment for `lemma`, `lemmy`, `lemrd` |
395
 
396
+ The LEK axioms are public domain and published at [Snider/ai-ethics](https://github.com/Snider/ai-ethics). Track research progress at [LetheanNetwork](https://github.com/LetheanNetwork) and the [LEM-research dataset](https://huggingface.co/datasets/lthn/LEM-research).
397
 
398
  ## Why EUPL-1.2
399
 
config.json CHANGED
@@ -963,6 +963,7 @@
963
  ],
964
  "max_position_embeddings": 131072,
965
  "model_type": "gemma4_text",
 
966
  "num_attention_heads": 8,
967
  "num_experts": null,
968
  "num_global_key_value_heads": null,
@@ -992,7 +993,7 @@
992
  "vocab_size_per_layer_input": 262144
993
  },
994
  "tie_word_embeddings": true,
995
- "transformers_version": "5.5.0.dev0",
996
  "video_token_id": 258884,
997
  "vision_config": {
998
  "_name_or_path": "",
 
963
  ],
964
  "max_position_embeddings": 131072,
965
  "model_type": "gemma4_text",
966
+ "moe_intermediate_size": null,
967
  "num_attention_heads": 8,
968
  "num_experts": null,
969
  "num_global_key_value_heads": null,
 
993
  "vocab_size_per_layer_input": 262144
994
  },
995
  "tie_word_embeddings": true,
996
+ "transformers_version": "5.5.3",
997
  "video_token_id": 258884,
998
  "vision_config": {
999
  "_name_or_path": "",
generation_config.json CHANGED
@@ -10,5 +10,5 @@
10
  "temperature": 1.0,
11
  "top_k": 64,
12
  "top_p": 0.95,
13
- "transformers_version": "5.5.0.dev0"
14
  }
 
10
  "temperature": 1.0,
11
  "top_k": 64,
12
  "top_p": 0.95,
13
+ "transformers_version": "5.5.3"
14
  }
lemer-bf16.gguf CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:6abdfae0cf39488d8cf73f781bd28904349d606867b63f169890c145e1336bde
3
- size 9311298080
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ea102af3dee5545c80913875ff7ec751dc8154bee353f0c0619b1dac40c41651
3
+ size 9311303008
lemer-q3_k_m.gguf CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:dbe61d2b633896299c4cf305ab2cf31347bf7046a0c59e44c4805ec23ee7d8f6
3
- size 3201344032
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:93465a43a32e8e49ef8f1f69bbf773ee5ebe9ab6fb30edb47a8b4f1baaca0f98
3
+ size 3201348960
lemer-q4_k_m.gguf CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:84bfe69062556259acffb9fb349bea48e81ebecdcb2330ded16b474ab651be93
3
- size 3427873312
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d46042ec27fd8db9add13d189dc8139cc38833f9b07ca30674e1e0e994a15b19
3
+ size 3427878240
lemer-q5_k_m.gguf CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:203bc29352c43df44a02ecbc420a6138e53abe87a1050c8c43760f14ba485a8f
3
- size 3630281248
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3ae9a43241bd0cdd5dd26ac33fbbc1df4c16417523f6af0ea6e51fe52130d09a
3
+ size 3630286176
lemer-q6_k.gguf CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:36b2ea742d8fd470f261b2d04c708097b19e37fd9d743a73056a02eea92cf31a
3
- size 3845339680
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a60f6f982b8bd78be62e65217eb5980e46f03de7f95526d72e411fc7a5bf42e6
3
+ size 3845344608
lemer-q8_0.gguf CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:77f04dd659bbea6cdb3d2743fcb8d8a9e15ba779d6db467c13fc5807c4ecd2d2
3
- size 4954587680
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6a98186034a401edf6bb30404c7dfc0ad06bb66c85afd632e87717bdb6e42e2c
3
+ size 4967495008
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b1ceae57a3e334570b3d5152d36094ea390f726fab9cb07e6c310f796c7ce0b8
3
  size 4359668843
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7e3458e3ce7473988ebafb1701b0a80621d783e09484994b940a7e644e0e943e
3
  size 4359668843
processor_config.json CHANGED
@@ -1,5 +1,27 @@
1
  {
 
2
  "audio_seq_length": 750,

3
  "image_processor": {
4
  "do_convert_rgb": true,
5
  "do_normalize": false,
@@ -21,22 +43,33 @@
21
  "patch_size": 16,
22
  "pooling_kernel_size": 3,
23
  "resample": 3,
24
- "rescale_factor": 0.00392156862745098,
25
- "size": {
26
- "height": 224,
27
- "width": 224
28
- }
29
  },
30
  "image_seq_length": 280,
31
  "processor_class": "Gemma4Processor",
32
- "feature_extractor": {
33
- "feature_extractor_type": "Gemma4AudioFeatureExtractor",
34
- "sampling_rate": 16000,
35
- "num_mel_filters": 128,
36
- "fft_length": 512,
37
- "hop_length": 160,
38
- "chunk_duration": 8.0,
39
- "overlap_duration": 1.0
40
- },
41
- "audio_ms_per_token": 40
42
- }

1
  {
2
+ "audio_ms_per_token": 40,
3
  "audio_seq_length": 750,
4
+ "feature_extractor": {
5
+ "dither": 0.0,
6
+ "feature_extractor_type": "Gemma4AudioFeatureExtractor",
7
+ "feature_size": 128,
8
+ "fft_length": 512,
9
+ "fft_overdrive": false,
10
+ "frame_length": 320,
11
+ "hop_length": 160,
12
+ "input_scale_factor": 1.0,
13
+ "max_frequency": 8000.0,
14
+ "mel_floor": 0.001,
15
+ "min_frequency": 0.0,
16
+ "padding_side": "right",
17
+ "padding_value": 0.0,
18
+ "per_bin_mean": null,
19
+ "per_bin_stddev": null,
20
+ "preemphasis": 0.0,
21
+ "preemphasis_htk_flavor": true,
22
+ "return_attention_mask": true,
23
+ "sampling_rate": 16000
24
+ },
25
  "image_processor": {
26
  "do_convert_rgb": true,
27
  "do_normalize": false,
 
43
  "patch_size": 16,
44
  "pooling_kernel_size": 3,
45
  "resample": 3,
46
+ "rescale_factor": 0.00392156862745098
 
 
 
 
47
  },
48
  "image_seq_length": 280,
49
  "processor_class": "Gemma4Processor",
50
+ "video_processor": {
51
+ "do_convert_rgb": true,
52
+ "do_normalize": true,
53
+ "do_rescale": true,
54
+ "do_resize": true,
55
+ "do_sample_frames": true,
56
+ "image_mean": [
57
+ 0.0,
58
+ 0.0,
59
+ 0.0
60
+ ],
61
+ "image_std": [
62
+ 1.0,
63
+ 1.0,
64
+ 1.0
65
+ ],
66
+ "max_soft_tokens": 70,
67
+ "num_frames": 32,
68
+ "patch_size": 16,
69
+ "pooling_kernel_size": 3,
70
+ "resample": 3,
71
+ "rescale_factor": 0.00392156862745098,
72
+ "return_metadata": false,
73
+ "video_processor_type": "Gemma4VideoProcessor"
74
+ }
75
+ }
tokenizer_config.json CHANGED
@@ -17,7 +17,7 @@
17
  "<|video|>"
18
  ],
19
  "image_token": "<|image|>",
20
- "is_local": true,
21
  "mask_token": "<mask>",
22
  "model_max_length": 1000000000000000019884624838656,
23
  "model_specific_special_tokens": {
 
17
  "<|video|>"
18
  ],
19
  "image_token": "<|image|>",
20
+ "is_local": false,
21
  "mask_token": "<mask>",
22
  "model_max_length": 1000000000000000019884624838656,
23
  "model_specific_special_tokens": {