# super-gemopus-4-e4b-trimera-mlx-4bit
MLX 4-bit quantization of [emanubiz/super-gemopus-4-e4b-trimera](https://huggingface.co/emanubiz/super-gemopus-4-e4b-trimera), optimized for Apple Silicon.
## Performance
| Metric | Value |
|---|---|
| Generation speed | ~34 tok/s |
| Peak RAM | 4.3 GB |
| Quantization | 4-bit (4.501 bits/weight) |
| Hardware | Mac Mini M4 16GB |
Runs comfortably alongside other apps on 16GB unified memory.
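To reproduce the throughput numbers on your own machine, here is a minimal sketch using the `mlx_lm` Python API (`verbose=True` prints prompt/generation tokens-per-second and peak memory); the prompt is illustrative:

```python
from mlx_lm import load, generate

# Load the 4-bit model (fetched from the Hub on first run)
model, tokenizer = load("emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit")

# Format a chat turn with the model's own chat template
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello, who are you?"}],
    add_generation_prompt=True,
)

# verbose=True prints tokens-per-second and peak memory after generation
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```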
## What is Trimera?
Trimera is a SLERP merge of two Gemma 4 E4B models:
| Model | Weight | What it brings |
|---|---|---|
| emanubiz/super-gemopus-4-e4b-abl-chimera | 71% | Strong reasoning, abliterated refusals, human-aligned tone |
| deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI | 29% | Opus 4.6 reasoning, Claude Code tool-use patterns, `<think>`-tag reasoning |
The chimera base is itself a merge of:
- 60% Jackrong/Gemopus-4-E4B-it — Gemma 4 E4B with human preference alignment
- 40% Jiunsong/supergemma4-e4b-abliterated — Gemma 4 E4B abliterated
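The card does not say which tool produced the merge. Purely as an illustration, a 71/29 SLERP corresponds to an interpolation factor `t ≈ 0.29` toward the second model in a mergekit-style config; the sketch below is hypothetical, not the author's actual recipe:

```yaml
# Hypothetical mergekit config approximating the 71/29 SLERP described above.
# t = 0.29 interpolates 29% of the way from the base model to the second model.
merge_method: slerp
base_model: emanubiz/super-gemopus-4-e4b-abl-chimera
models:
  - model: emanubiz/super-gemopus-4-e4b-abl-chimera
  - model: deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI
parameters:
  t: 0.29
dtype: bfloat16
```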
## Usage
### `mlx_lm generate`
```bash
mlx_lm generate \
  --model emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit \
  --prompt "<start_of_turn>user\nHello, who are you?<end_of_turn>\n<start_of_turn>model\n" \
  --max-tokens 512
```
### `mlx_lm server` (OpenAI-compatible API)
```bash
mlx_lm server \
  --model emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit \
  --port 8080 \
  --host 0.0.0.0
```
Then use with any OpenAI-compatible client:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 512
  }'
```
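Or from Python, a minimal sketch using the official `openai` package (the local server does not validate the API key, so any placeholder works):

```python
from openai import OpenAI

# Point the client at the local mlx_lm server; the key is unused
client = OpenAI(base_url="http://localhost:8080/v1", api_key="dummy")

response = client.chat.completions.create(
    model="emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```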
### Use as a coding agent backend
Works out of the box with any OpenAI-compatible coding agent (Continue, Aider, PiCoder, etc.):
```json
{
  "id": "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit",
  "name": "Trimera",
  "apiBase": "http://localhost:8080/v1",
  "apiKey": "dummy",
  "contextWindow": 128000,
  "maxTokens": 16000
}
```
## Conversion
Converted from BF16 safetensors using mlx-lm 0.31.3 on an Apple M4. This required patching `gemma4_text.py` to support Gemma 4's per-layer KV-sharing architecture (`num_kv_shared_layers: 18`).
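The exact conversion flags are not stated on the card; the reported 4.501 bits/weight is consistent with mlx-lm's default 4-bit scheme with group size 64, where per-group fp16 scales and biases add 2 × 16 / 64 = 0.5 bits per weight. A plausible invocation:

```bash
# Sketch of the conversion step; flags assume mlx-lm's defaults
# (4-bit quantization, group size 64), not the author's confirmed command.
mlx_lm convert \
  --hf-path emanubiz/super-gemopus-4-e4b-trimera \
  --mlx-path super-gemopus-4-e4b-trimera-mlx-4bit \
  -q --q-bits 4 --q-group-size 64
```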
## License

Built with ❤️ on Apple Silicon · [BF16 base model](https://huggingface.co/emanubiz/super-gemopus-4-e4b-trimera)