Snider Virgil committed
Commit 733d141 · 1 Parent(s): 5cc8b73

cleanup: drop mlx files, align with lemer gguf-playground pattern

Mirrors the 160384c lemer cleanup for the rest of the family. HF was
rendering this repo with mlx detection because mlx-formatted safetensors
lived alongside the gguf files, hurting gguf/Ollama discoverability
for users searching for llama.cpp-compatible weights.

Changes:
- model-*.safetensors / model.safetensors.index.json removed (these
  were mlx affine-quant format, not transformers; duplicated content
  now lives in lthn/lemma-mlx-*)
- README title suffixed '(GGUF)'
- README body: removed mlx inline quick-starts, added redirect block
  pointing at lthn/lemma-mlx / -mlx-8bit / -mlx-bf16
- frontmatter tags dropped mlx / apple-silicon, added llama.cpp / ollama
- library_name preserved as gguf (lemmy changed from mlx → gguf)
- stale model-index block (pre-reset smoke numbers) removed; the
  canon in .eval_results/ is authoritative

Co-Authored-By: Virgil <virgil@lethean.io>
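
The frontmatter tag change summarised above can be checked against the README diff with a small sketch. The two tag lists are copied from the old and new sides of the hunk below; the helper name `tag_delta` is hypothetical and is not part of any Hugging Face tooling.

```python
# Tag lists transcribed from the old and new README frontmatter in this commit.
OLD_TAGS = [
    "gemma4", "gemma", "gguf", "safetensors", "mlx", "multimodal", "vision",
    "audio", "lemma", "lethean", "lem", "apple-silicon", "on-device",
    "conversational",
]
NEW_TAGS = [
    "gemma4", "lemma", "gguf", "llama.cpp", "ollama", "multimodal", "vision",
    "audio", "on-device", "conversational",
]


def tag_delta(old, new):
    """Return (removed, added) tags, each kept in its original order.

    A tag that only moved position (here: "lemma") appears in neither list.
    """
    old_set, new_set = set(old), set(new)
    removed = [tag for tag in old if tag not in new_set]
    added = [tag for tag in new if tag not in old_set]
    return removed, added


removed, added = tag_delta(OLD_TAGS, NEW_TAGS)
print("removed:", removed)
print("added:  ", added)
```

Run against these lists, the delta shows that the diff also drops `gemma`, `safetensors`, `lethean`, and `lem`, which the commit message does not call out alongside `mlx` and `apple-silicon`.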

README.md CHANGED
@@ -7,54 +7,31 @@ base_model:
  base_model_relation: quantized
  tags:
  - gemma4
- - gemma
+ - lemma
  - gguf
- - safetensors
- - mlx
+ - llama.cpp
+ - ollama
  - multimodal
  - vision
  - audio
- - lemma
- - lethean
- - lem
- - apple-silicon
  - on-device
  - conversational
- model-index:
- - name: lemma
- results:
- - task:
- type: text-generation
- name: mmlu_pro (8-PAC)
- dataset:
- type: TIGER-Lab/MMLU-Pro
- name: MMLU-Pro
- split: test
- config: mmlu_pro
- metrics:
- - type: acc
- value: 31.25
- name: Per-round accuracy (%)
- - type: acc
- value: 25.0
- name: Majority-vote accuracy (8-PAC)
- - type: confidence
- value: 0.6875
- name: Mean per-question confidence (max-share)
- source:
- name: Raw per-iteration results (parquet + latest.md)
- url: https://huggingface.co/lthn/lemma/tree/main/results
  ---
  <!--
  This content is subject to the European Union Public Licence (EUPL-1.2).
  For full licence details, please refer to: https://huggingface.co/lthn/lemma/tree/main/LICENSE
  Origin URL: https://huggingface.co/lthn/lemma/tree/main
  -->
- # Lemma — Gemma 4 E4B
+ # Lemma — Gemma 4 E4B (GGUF)
+
+ The mid-sized member of the [Lemma model family](https://huggingface.co/collections/lthn/lemma) by [Lethean](https://lthn.ai). An EUPL-1.2 fork of [Gemma 4 E4B](https://huggingface.co/google/gemma-4-E4B-it) with the **Lethean Ethical Kernel (LEK) merged into the weights** — consent-based reasoning baked into the attention projections via LoRA finetune, then merged so inference uses a single standalone model with no PEFT runtime required.

- The mid-sized member of the [Lemma model family](https://huggingface.co/collections/lthn/lemma) by [Lethean](https://lthn.ai). An EUPL-1.2 fork of [Gemma 4 E4B](https://huggingface.co/google/gemma-4-E4B-it) with the **Lethean Ethical Kernel (LEK) merged into the weights** — consent-based reasoning baked into the attention projections via LoRA finetune, then merged so inference uses a single standalone model with no PEFT runtime required. GGUF and MLX builds with full multimodal support — text, image, and audio — distributed from a single repo. Use GGUF with Ollama, llama.cpp, GPT4All, or LM Studio. Use MLX safetensors with `mlx-lm` and `mlx-vlm` for native Apple Silicon inference. The unmodified Gemma 4 E4B fork lives at [LetheanNetwork/lemma](https://huggingface.co/LetheanNetwork/lemma) for users who want the raw Google weights without the LEK shift.
+ This repo ships the **GGUF multi-quant build** — five quants from Q4_K_M up to BF16, with full multimodal support (text, image, audio). Use with Ollama, llama.cpp, GPT4All, or LM Studio. The unmodified Gemma 4 E4B fork lives at [LetheanNetwork/lemma](https://huggingface.co/LetheanNetwork/lemma) for users who want the raw Google weights without the LEK shift.

- **MLX-only quant variants:** [Q8](https://huggingface.co/lthn/lemma-mlx-q8) | [BF16](https://huggingface.co/lthn/lemma-mlx-bf16) — Q4 multimodal is bundled in this repo.
+ **Looking for MLX?** The native Apple Silicon builds live in sibling repos:
+ [`lthn/lemma-mlx`](https://huggingface.co/lthn/lemma-mlx) (4-bit default) |
+ [`lthn/lemma-mlx-8bit`](https://huggingface.co/lthn/lemma-mlx-8bit) |
+ [`lthn/lemma-mlx-bf16`](https://huggingface.co/lthn/lemma-mlx-bf16) (full precision)

  > A **lemma** is "something assumed" — an intermediate theorem on the path to a larger proof, or a heading that signals the subject of what follows. The Lemma model family is named for that role: each variant is a stepping stone between raw capability and ethical application.

model-00001-of-00002.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:574cb733a9a381602fbe121fc36281b167c504cc1ea397f787d34e3a90061ef7
- size 4280494837

model-00002-of-00002.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:b55534903049d53498f1db96c02bfc162330fa0a7a078ba26bdea9894822e85d
- size 2588373297

model.safetensors.index.json DELETED
The diff for this file is too large to render. See raw diff
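
The two deleted shards are Git LFS pointer files, so the diff shows only their `version`, `oid`, and `size` fields, not the weights themselves. As a rough sketch of how much data the cleanup removes, the pointers can be parsed and their sizes summed. The helper `parse_lfs_pointer` is a hypothetical name, and `model.safetensors.index.json` is left out because its diff is not rendered here.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its space-separated fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is recorded in bytes
    return fields


# Pointer bodies copied from the two deleted files in this commit.
POINTERS = {
    "model-00001-of-00002.safetensors": (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:574cb733a9a381602fbe121fc36281b167c504cc1ea397f787d34e3a90061ef7\n"
        "size 4280494837\n"
    ),
    "model-00002-of-00002.safetensors": (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:b55534903049d53498f1db96c02bfc162330fa0a7a078ba26bdea9894822e85d\n"
        "size 2588373297\n"
    ),
}

parsed = {name: parse_lfs_pointer(body) for name, body in POINTERS.items()}
total_bytes = sum(pointer["size"] for pointer in parsed.values())
print(f"removed {len(parsed)} shards, {total_bytes} bytes in total")
```

The two shards sum to 6,868,868,134 bytes. That is the size of the LFS objects the pointers reference; the pointer files in the git tree are only three lines each.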