super-gemopus-4-e4b-trimera-mlx-4bit

MLX 4-bit quantization of emanubiz/super-gemopus-4-e4b-trimera, optimized for Apple Silicon.

Performance

| Metric | Value |
| --- | --- |
| Speed | ~34 tok/s |
| Peak RAM | 4.3 GB |
| Quantization | 4-bit (4.501 bits/weight) |
| Hardware | Mac Mini M4 16GB |

Runs comfortably alongside other apps on 16GB unified memory.
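
The reported 4.501 bits/weight is a little above the nominal 4 bits. One plausible reading, sketched below, is that group-wise quantization stores a scale and a bias for each group of weights. The group size of 64 and the 16-bit scale and bias are assumptions (they match mlx-lm's usual defaults, but this card does not state them). Any remainder above 4.5 would then come from tensors kept at higher precision.

```python
def effective_bits_per_weight(bits: int, group_size: int,
                              scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Average storage cost per weight for group-wise affine quantization.

    Each group of `group_size` weights shares one scale and one bias,
    so their cost is spread evenly across the group.
    """
    overhead = (scale_bits + bias_bits) / group_size
    return bits + overhead


# Assumed settings: 4-bit weights, groups of 64, fp16 scale and bias.
print(effective_bits_per_weight(bits=4, group_size=64))  # 4.5
```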

What is Trimera?

Trimera is a SLERP merge of two Gemma 4 E4B models:

| Model | Weight | What it brings |
| --- | --- | --- |
| emanubiz/super-gemopus-4-e4b-abl-chimera | 71% | Strong reasoning, abliterated refusals, human-aligned tone |
| deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI | 29% | Opus 4.6 reasoning, Claude Code tool-use patterns, <think> tag reasoning |
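
For readers unfamiliar with SLERP (spherical linear interpolation), here is a minimal sketch of the generic formula applied to two flattened weight vectors. It is not the merge script used for Trimera: the merge tool, its configuration, and whether the ratio was applied uniformly to every tensor are not stated here. Mapping the 71%/29% split to an interpolation factor of `t = 0.29` is an assumption.

```python
import math


def slerp(a: list[float], b: list[float], t: float, eps: float = 1e-8) -> list[float]:
    """Spherical linear interpolation from a (t=0) to b (t=1).

    Assumes both vectors are non-zero and of equal length.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding drift
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    w_a = math.sin((1 - t) * omega) / sin_omega
    w_b = math.sin(t * omega) / sin_omega
    return [w_a * x + w_b * y for x, y in zip(a, b)]


# Toy 2-D "weights"; t=0.29 stands in for the 29% share of the second model.
print(slerp([1.0, 0.0], [0.0, 1.0], t=0.29))
```

Unlike a plain weighted average, SLERP follows the arc between the two vectors, so the interpolated result does not shrink in norm when the vectors point in different directions.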

The chimera base is itself a merge of:

Usage

mlx_lm generate

mlx_lm generate \
  --model emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit \
  --prompt "<start_of_turn>user\nHi, who are you?<end_of_turn>\n<start_of_turn>model\n" \
  --max-tokens 512
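
The same generation can be driven from Python. The sketch below uses mlx-lm's `load` and `generate` functions and lets the tokenizer's chat template add the turn markers instead of writing them by hand. It assumes the repository ships a chat template and that the installed mlx-lm can load this architecture (see the Conversion note). `build_messages` and `run_once` are hypothetical helper names.

```python
def build_messages(user_text: str) -> list[dict]:
    """Wrap a single user turn in the chat-message format."""
    return [{"role": "user", "content": user_text}]


def run_once(user_text: str, max_tokens: int = 512) -> str:
    """Load the model and generate one reply (downloads weights on first use)."""
    # Imported lazily so the helper above works without mlx-lm installed.
    from mlx_lm import load, generate

    model, tokenizer = load("emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit")
    prompt = tokenizer.apply_chat_template(
        build_messages(user_text), add_generation_prompt=True
    )
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Calling `run_once("Hi, who are you?")` on an Apple Silicon machine should return the model's reply as a string.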

mlx_lm server (OpenAI-compatible API)

mlx_lm server \
  --model emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit \
  --port 8080 \
  --host 0.0.0.0

Then use with any OpenAI-compatible client:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 512
  }'
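
The curl call above translates directly to Python's standard library, which can be handy in scripts that should not depend on an SDK. This is a minimal sketch: `build_chat_request` and `chat` are hypothetical helper names, and the response parsing assumes the server returns the usual OpenAI-style `choices[0].message.content` field.

```python
import json
import urllib.request

MODEL_ID = "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit"


def build_chat_request(user_text: str,
                       base_url: str = "http://localhost:8080/v1",
                       max_tokens: int = 512) -> urllib.request.Request:
    """Build the POST request for /chat/completions without sending it."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text: str, **kwargs) -> str:
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(user_text, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server from the previous step running, `chat("Hello!")` returns the assistant's reply.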

Use as coding agent backend

Works out of the box with any OpenAI-compatible coding agent (Continue, Aider, PiCoder, etc.):

{
  "id": "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit",
  "name": "Trimera",
  "apiBase": "http://localhost:8080/v1",
  "apiKey": "dummy",
  "contextWindow": 128000,
  "maxTokens": 16000
}
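
As a quick sanity check on the numbers in this config, the sketch below computes how many tokens remain for the prompt if the full output allowance is reserved. How a given agent actually splits the context window is an assumption; some tools manage this budget differently. `prompt_budget` is a hypothetical helper.

```python
import json

CONFIG = json.loads("""
{
  "id": "emanubiz/super-gemopus-4-e4b-trimera-mlx-4bit",
  "name": "Trimera",
  "apiBase": "http://localhost:8080/v1",
  "apiKey": "dummy",
  "contextWindow": 128000,
  "maxTokens": 16000
}
""")


def prompt_budget(config: dict) -> int:
    """Tokens left for the prompt when maxTokens is reserved for the reply."""
    budget = config["contextWindow"] - config["maxTokens"]
    if budget <= 0:
        raise ValueError("maxTokens must be smaller than contextWindow")
    return budget


print(prompt_budget(CONFIG))  # 112000
```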

Conversion

Converted from BF16 safetensors using mlx-lm 0.31.3 on an Apple M4. The conversion required patching gemma4_text.py to support Gemma 4's per-layer KV sharing architecture (num_kv_shared_layers: 18).
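
The exact conversion command is not recorded here. A comparable run through mlx-lm's Python `convert` function might look like the sketch below. The output directory name and the group size of 64 are assumptions, and the gemma4_text.py patch mentioned above is not reproduced, so this may fail on an unpatched mlx-lm 0.31.3.

```python
def conversion_kwargs(mlx_path: str = "trimera-mlx-4bit") -> dict:
    """Arguments for mlx_lm.convert; mlx_path is a hypothetical output folder."""
    return {
        "hf_path": "emanubiz/super-gemopus-4-e4b-trimera",  # BF16 source repo
        "mlx_path": mlx_path,
        "quantize": True,
        "q_bits": 4,
        "q_group_size": 64,  # assumption: not stated in this card
    }


def run_conversion() -> None:
    """Download the BF16 weights and write a 4-bit MLX copy to disk."""
    # Imported lazily so the kwargs can be inspected without mlx-lm installed.
    from mlx_lm import convert

    convert(**conversion_kwargs())
```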

License

Gemma Terms of Use


Built with ❤️ on Apple Silicon · BF16 base model
