emanubiz committed · Commit 8da00c6 · verified · 1 Parent(s): b4ec92e

Add model card

  1. README.md +93 -4
README.md CHANGED
@@ -1,6 +1,7 @@
1
  ---
2
  language:
3
  - en
 
4
  license: gemma
5
  tags:
6
  - gemma4
@@ -9,9 +10,97 @@ tags:
9
  - reasoning
10
  - agentic
11
  - tool-calling
12
- - multimodal
13
  - mlx
14
- base_model: emanubiz/super-gemopus-4-e4b-trimera
15
- pipeline_tag: text-generation
16
- library_name: mlx
 
17
  ---
 
1
  ---
2
  language:
3
  - en
4
+ - it
5
  license: gemma
6
  tags:
7
  - gemma4
 
10
  - reasoning
11
  - agentic
12
  - tool-calling
 
13
  - mlx
14
+ - apple-silicon
15
+ - 4bit
16
+ base_model:
17
+ - emanubiz/super-gemopus-4-e4b-trimera
18
  ---
19
+
20
+ # super-gemopus-4-e4b-trimera-mlx-4bit
21
+
22
+ MLX 4-bit quantization of [emanubiz/super-gemopus-4-e4b-trimera](https://huggingface.co/emanubiz/super-gemopus-4-e4b-trimera), optimized for Apple Silicon.
23
+
24
+ ## Performance
25
+
26
+ | Metric | Value |
27
+ |--------|-------|
28
+ | Speed | ~34 tok/s |
29
+ | Peak RAM | 4.3 GB |
30
+ | Quantization | 4-bit (4.501 bits/weight) |
31
+ | Hardware | Mac Mini M4 16GB |
32
+
33
+ Runs comfortably alongside other apps on 16GB unified memory.
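
The fractional figure in the quantization row can be reproduced with a short calculation. This is a minimal sketch that assumes group-wise affine quantization with 64 weights per group and a 16-bit scale plus a 16-bit bias stored for each group; neither the group size nor the overhead layout is stated in this card, so both are assumptions.

```python
def effective_bits_per_weight(bits: int, group_size: int,
                              overhead_bits_per_group: int = 32) -> float:
    """Average storage cost of one weight under group-wise quantization.

    Each group of `group_size` weights stores `bits` per weight plus a
    shared overhead (assumed here: 16-bit scale + 16-bit bias = 32 bits).
    """
    return bits + overhead_bits_per_group / group_size


# Assumed settings: 4-bit weights, 64 weights per group.
bpw = effective_bits_per_weight(bits=4, group_size=64)
print(f"{bpw:.3f} bits/weight")
```

Under these assumptions the result is 4.5 bits per weight. The small remainder up to the reported 4.501 would then presumably come from tensors that are kept at higher precision, though the card does not say so.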
34
+
35
+ ## What is Trimera?
36
+
37
+ Trimera is a SLERP merge of two Gemma 4 E4B models:
38
+
39
+ | Model | Weight | What it brings |
40
+ |-------|--------|----------------|
41
+ | [emanubiz/super-gemopus-4-e4b-abl-chimera](https://huggingface.co/emanubiz/super-gemopus-4-e4b-abl-chimera) | 71% | Strong reasoning, abliterated refusals, human-aligned tone |
42
+ | [deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI](https://huggingface.co/deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI) | 29% | Opus 4.6 reasoning, Claude Code tool-use patterns, `<think>` tag reasoning |
43
+
44
+ The chimera base is itself a merge of:
45
+ - 60% [Jackrong/Gemopus-4-E4B-it](https://huggingface.co/Jackrong/Gemopus-4-E4B-it) — Gemma 4 E4B with human preference alignment
46
+ - 40% [Jiunsong/supergemma4-e4b-abliterated](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated) — Gemma 4 E4B abliterated
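
As an illustration of what a SLERP merge does to a pair of weight tensors, here is a minimal sketch on flat Python lists. The function name, the toy vectors, and the mapping of the 71% / 29% split to an interpolation factor of `t = 0.29` are assumptions made for this example; the card does not say which merge tool or per-layer settings were used.

```python
import math
from typing import List, Sequence


def slerp(v0: Sequence[float], v1: Sequence[float], t: float,
          eps: float = 1e-6) -> List[float]:
    """Spherical linear interpolation from v0 (t=0) to v1 (t=1)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    def lerp() -> List[float]:
        # Plain linear interpolation, used when the angle is undefined or tiny.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    if norm0 < eps or norm1 < eps:
        return lerp()

    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return lerp()

    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy stand-ins for one flattened tensor from each parent model (hypothetical values).
chimera_tensor = [0.8, 0.1, -0.3]
agentic_tensor = [0.5, 0.4, -0.1]
merged = slerp(chimera_tensor, agentic_tensor, t=0.29)  # 71% / 29% split (assumed mapping)
```

Unlike a plain weighted average, SLERP moves along the arc between the two vectors, which avoids the shrinkage in magnitude that linear averaging causes when the vectors point in different directions.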
47
+
48
+ ## Usage
49
+
50
+ ### mlx_lm generate
51
+
52
+ ```bash
53
+ mlx_lm generate \
54
+ --model emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit \
55
+ --prompt "<start_of_turn>user\nCiao, chi sei?<end_of_turn>\n<start_of_turn>model\n" \
56
+ --max-tokens 512
57
+ ```
58
+
59
+ ### mlx_lm server (OpenAI-compatible API)
60
+
61
+ ```bash
62
+ mlx_lm server \
63
+ --model emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit \
64
+ --port 8080 \
65
+ --host 0.0.0.0
66
+ ```
67
+
68
+ Then use with any OpenAI-compatible client:
69
+
70
+ ```bash
71
+ curl http://localhost:8080/v1/chat/completions \
72
+ -H "Content-Type: application/json" \
73
+ -d '{
74
+ "model": "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit",
75
+ "messages": [{"role": "user", "content": "Hello!"}],
76
+ "max_tokens": 512
77
+ }'
78
+ ```
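
The same request can be sent from Python using only the standard library. This is a minimal sketch that assumes the server from the previous step is listening on `localhost:8080` and returns the usual OpenAI-style `choices[0].message.content` field; the helper names `build_chat_request`, `extract_reply`, and `chat` are made up for this example.

```python
import json
import urllib.request

API_BASE = "http://localhost:8080/v1"  # matches the --port used above
MODEL_ID = "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit"


def build_chat_request(prompt: str, max_tokens: int = 512,
                       api_base: str = API_BASE) -> urllib.request.Request:
    """Build a POST request equivalent to the curl call above."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{api_base}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of an OpenAI-style chat completion."""
    return response_body["choices"][0]["message"]["content"]


def chat(prompt: str, max_tokens: int = 512) -> str:
    """Send the prompt to the local server and return the reply text."""
    request = build_chat_request(prompt, max_tokens=max_tokens)
    with urllib.request.urlopen(request, timeout=120) as response:
        return extract_reply(json.load(response))
```

With the server running, `print(chat("Hello!"))` prints the model's reply. Keeping request construction separate from the network call makes it easy to inspect the payload without a live server.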
79
+
80
+ ### Use as coding agent backend
81
+
82
+ Point any OpenAI-compatible coding agent (Continue, Aider, PiCoder, etc.) at the local server. Field names vary by tool; a typical model entry looks like this:
83
+
84
+ ```json
85
+ {
86
+ "id": "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit",
87
+ "name": "Trimera",
88
+ "apiBase": "http://localhost:8080/v1",
89
+ "apiKey": "dummy",
90
+ "contextWindow": 128000,
91
+ "maxTokens": 16000
92
+ }
93
+ ```
94
+
95
+ ## Conversion
96
+
97
+ Converted from BF16 safetensors using mlx-lm 0.31.3 on Apple M4.
98
+ The conversion required patching `gemma4_text.py` to support Gemma 4's per-layer KV sharing architecture (`num_kv_shared_layers: 18`).
99
+
100
+ ## License
101
+
102
+ [Gemma Terms of Use](https://ai.google.dev/gemma/docs/gemma_4_license)
103
+
104
+ ---
105
+
106
+ Built with ❤️ on Apple Silicon · [BF16 base model](https://huggingface.co/emanubiz/super-gemopus-4-e4b-trimera)