Instructions to use stamsam/FrankenGemma4_MLX_4Bit with libraries, inference servers, and local apps.

How to use stamsam/FrankenGemma4_MLX_4Bit with MLX:

```python
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("stamsam/FrankenGemma4_MLX_4Bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)
text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
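If you want tokens as they are produced rather than one final string, mlx-lm also ships stream_generate. A minimal sketch; on recent mlx-lm versions the streamed chunks expose their text as .text:

```python
from mlx_lm import load, stream_generate

model, tokenizer = load("stamsam/FrankenGemma4_MLX_4Bit")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a story about Einstein"}],
    add_generation_prompt=True,
)

# Print each chunk as it arrives instead of waiting for the full response.
for response in stream_generate(model, tokenizer, prompt, max_tokens=512):
    print(response.text, end="", flush=True)
```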
How to use stamsam/FrankenGemma4_MLX_4Bit with Pi:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "stamsam/FrankenGemma4_MLX_4Bit"
```
Configure the model in Pi
```bash
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```

Add to ~/.pi/agent/models.json:

```json
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "stamsam/FrankenGemma4_MLX_4Bit" }
      ]
    }
  }
}
```

Run Pi
```bash
# Start Pi in your project directory:
pi
```
How to use stamsam/FrankenGemma4_MLX_4Bit with Hermes Agent:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "stamsam/FrankenGemma4_MLX_4Bit"
```
Configure Hermes
```bash
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default stamsam/FrankenGemma4_MLX_4Bit
```
Run Hermes
```bash
hermes
```
How to use stamsam/FrankenGemma4_MLX_4Bit with MLX LM:
Generate or start a chat session
```bash
# Install MLX LM
uv tool install mlx-lm

# Interactive chat REPL
mlx_lm.chat --model "stamsam/FrankenGemma4_MLX_4Bit"
```
Run an OpenAI-compatible server
```bash
# Install MLX LM
uv tool install mlx-lm

# Start the server (listens on port 8080 by default)
mlx_lm.server --model "stamsam/FrankenGemma4_MLX_4Bit"

# Call the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stamsam/FrankenGemma4_MLX_4Bit",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```
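Because the server speaks the OpenAI chat-completions protocol, the official openai Python client works against it as well. A minimal sketch; the api_key value is arbitrary since the local server does not check it:

```python
# pip install openai
from openai import OpenAI

# Point the client at the local mlx_lm.server endpoint.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

response = client.chat.completions.create(
    model="stamsam/FrankenGemma4_MLX_4Bit",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```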
FrankenGemma4 MLX 4Bit
FrankenGemma4 MLX 4Bit is the polished local release of the FrankenGemma4 line. The current public lead branch is FrankenGemma4 V1, and that is the branch I recommend for normal use on Apple Silicon.
This repo is intended to be the public MLX 4-bit release:
stamsam/FrankenGemma4_MLX_4Bit
- recommended default artifact: the fused MLX Q4 checkpoint
- current lead branch: FrankenGemma4 V1
What This Release Is
This release comes from a two-stage lineage:
- Original frankenmerge
  - Passthrough layer-stacking between the reasoning donor and the coding donor.
- Co-base repair line
  - A symmetric linear merge across the shared language stack (sketched after this list).
  - Followed by targeted MLX LoRA repair passes for seam control, leak suppression, structured chat, coding repair, and daily chat.
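For intuition, a symmetric linear merge is just a 50/50 average of matched tensors across two checkpoints that share an architecture. A minimal sketch, assuming two hypothetical local safetensors files; the actual repair line used its own tooling plus the LoRA passes listed above:

```python
import mlx.core as mx

# Hypothetical local paths to the two donors' weights.
a = mx.load("donor_a/model.safetensors")
b = mx.load("donor_b/model.safetensors")

# Symmetric linear merge: average every tensor the two checkpoints share.
merged = {name: 0.5 * a[name] + 0.5 * b[name] for name in a if name in b}

mx.save_safetensors("merged/model.safetensors", merged)
```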
Recommended Default
Use the root MLX Q4 checkpoint as the default download for this repo.
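In practice that just means loading the repo root. A minimal sketch; snapshot_download only pre-caches the files, and mlx_lm.load would also fetch them on first use:

```python
from huggingface_hub import snapshot_download
from mlx_lm import load

# Pre-cache the fused Q4 checkpoint from the repo root, then load it.
path = snapshot_download("stamsam/FrankenGemma4_MLX_4Bit")
model, tokenizer = load(path)
```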
Local Benchmark Snapshot
These are local custom evals from the development workflow.
The detailed benchmark artifacts for this release live in the benchmarks/ folder.
Q4 Snapshot
| Metric | Score |
|---|---|
| Exact Overall | 68.75 |
| Reasoning | 71.43 |
| JSON | 85.71 |
| Code | 71.43 |
| Integration | 54.55 |
OpenClaw / Hermes / Agentic Snapshot
| Model | Coding | Daily Chat | Structured Chat | Tool Use | Agentic | Total |
|---|---|---|---|---|---|---|
| FrankenGemma4 V1 | 4 | 8 | 7 | 9 | 10 | 38 |
| FrankenGemma4 Structured-1600 | 4 | 6 | 7 | 9 | 10 | 36 |
| FrankenGemma4 | 2 | 10 | 4 | 9 | 10 | 35 |
| SuperGemma4 E4B Ablit | 2 | 8 | 7 | 8 | 10 | 35 |
| Google Gemma 4 E4B IT | 2 | 8 | 6 | 9 | 10 | 35 |
| Reasoning Donor | 2 | 8 | 4 | 0 | 10 | 24 |
Lead Branch Retention
| Model | Security Defense | Blunt Critique | Uncensored Creative | Abliteration Meta | Profane Rewrite + Note | Prompt Injection Defense | Total |
|---|---|---|---|---|---|---|---|
| SuperGemma4 E4B Ablit | 9 | 8 | 9 | 7 | 10 | 7 | 50 |
| FrankenGemma4 V1 | 10 | 7 | 9 | 7 | 10 | 6 | 49 |
Current Strengths
- Good local MLX/Q4 behavior on Apple Silicon
- Stronger tool discipline than the reasoning donor
- Better structured output after the repair passes
- Retains the ablation-style behavior better than the raw franken line while staying close to the donor parents
Current Caveats
- Some prompts still show thought leakage
- This is still a local benchmark story, not a broad held-out public leaderboard claim
- The retention check shows the model keeps most of the ablation behavior, but not quite as much as the dedicated ablated donor
Upstream Attribution
Built from:
- arsovskidev/Gemma-4-E4B-Claude-4.6-Opus-Reasoning-Distilled
- Jiunsong/supergemma4-e4b-abliterated
- google/gemma-4-E4B-it
Thanks
Big shout-out to Jiunsong/supergemma4-e4b-abliterated. This release inherits much of its coding and ablation-heavy behavior from that line.