Instructions to use stamsam/FrankenGemma4_MLX_4Bit with libraries, inference servers, and local apps.

How to use stamsam/FrankenGemma4_MLX_4Bit with MLX:

```python
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("stamsam/FrankenGemma4_MLX_4Bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)
text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
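If you want tokens as they are produced rather than one final string, mlx-lm also ships stream_generate. A minimal sketch; on recent mlx-lm versions the streamed chunks expose their text as .text:

```python
from mlx_lm import load, stream_generate

model, tokenizer = load("stamsam/FrankenGemma4_MLX_4Bit")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a story about Einstein"}],
    add_generation_prompt=True,
)

# Print each chunk as it arrives instead of waiting for the full response.
for response in stream_generate(model, tokenizer, prompt, max_tokens=512):
    print(response.text, end="", flush=True)
```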
How to use stamsam/FrankenGemma4_MLX_4Bit with Pi:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "stamsam/FrankenGemma4_MLX_4Bit"
```
Configure the model in Pi
```bash
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```

Add to ~/.pi/agent/models.json:

```json
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "stamsam/FrankenGemma4_MLX_4Bit" }
      ]
    }
  }
}
```

Run Pi
```bash
# Start Pi in your project directory:
pi
```
How to use stamsam/FrankenGemma4_MLX_4Bit with Hermes Agent:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "stamsam/FrankenGemma4_MLX_4Bit"
```
Configure Hermes
```bash
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default stamsam/FrankenGemma4_MLX_4Bit
```
Run Hermes
```bash
hermes
```
How to use stamsam/FrankenGemma4_MLX_4Bit with MLX LM:
Generate or start a chat session
```bash
# Install MLX LM
uv tool install mlx-lm

# Interactive chat REPL
mlx_lm.chat --model "stamsam/FrankenGemma4_MLX_4Bit"
```
Run an OpenAI-compatible server
```bash
# Install MLX LM
uv tool install mlx-lm

# Start the server (listens on port 8080 by default)
mlx_lm.server --model "stamsam/FrankenGemma4_MLX_4Bit"

# Call the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stamsam/FrankenGemma4_MLX_4Bit",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```
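Because the server speaks the OpenAI chat-completions protocol, the official openai Python client works against it as well. A minimal sketch; the api_key value is arbitrary since the local server does not check it:

```python
# pip install openai
from openai import OpenAI

# Point the client at the local mlx_lm.server endpoint.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

response = client.chat.completions.create(
    model="stamsam/FrankenGemma4_MLX_4Bit",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```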
FrankenGemma4 MLX 4Bit
FrankenGemma4 MLX 4Bit is the polished local release of the FrankenGemma4 line. The current public lead branch is FrankenGemma4 V1, and that is the branch I recommend for normal use on Apple Silicon.
This repo is intended to be the public MLX 4-bit release:
stamsam/FrankenGemma4_MLX_4Bit
- recommended default artifact: the fused MLX Q4 checkpoint
- current lead branch: FrankenGemma4 V1
What This Release Is
This release comes from a two-stage lineage:
- Original frankenmerge
  - Passthrough layer-stacking between the reasoning donor and the coding donor.
- Co-base repair line
  - A symmetric linear merge across the shared language stack (sketched after this list).
  - Followed by targeted MLX LoRA repair passes for seam control, leak suppression, structured chat, coding repair, and daily chat.
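For intuition, a symmetric linear merge is just a 50/50 average of matched tensors across two checkpoints that share an architecture. A minimal sketch, assuming two hypothetical local safetensors files; the actual repair line used its own tooling plus the LoRA passes listed above:

```python
import mlx.core as mx

# Hypothetical local paths to the two donors' weights.
a = mx.load("donor_a/model.safetensors")
b = mx.load("donor_b/model.safetensors")

# Symmetric linear merge: average every tensor the two checkpoints share.
merged = {name: 0.5 * a[name] + 0.5 * b[name] for name in a if name in b}

mx.save_safetensors("merged/model.safetensors", merged)
```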
Recommended Default
Use the root MLX Q4 checkpoint as the default download for this repo.
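In practice that just means loading the repo root. A minimal sketch; snapshot_download only pre-caches the files, and mlx_lm.load would also fetch them on first use:

```python
from huggingface_hub import snapshot_download
from mlx_lm import load

# Pre-cache the fused Q4 checkpoint from the repo root, then load it.
path = snapshot_download("stamsam/FrankenGemma4_MLX_4Bit")
model, tokenizer = load(path)
```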
Local Benchmark Snapshot
These are local custom evals from the development workflow.
The detailed benchmark artifacts for this release live in the benchmarks/ folder.
Q4 Snapshot
| Metric | Score |
|---|---|
| Exact Overall | 68.75 |
| Reasoning | 71.43 |
| JSON | 85.71 |
| Code | 71.43 |
| Integration | 54.55 |
OpenClaw / Hermes / Agentic Snapshot
| Model | Coding | Daily Chat | Structured Chat | Tool Use | Agentic | Total |
|---|---|---|---|---|---|---|
| FrankenGemma4 V1 | 4 | 8 | 7 | 9 | 10 | 38 |
| FrankenGemma4 Structured-1600 | 4 | 6 | 7 | 9 | 10 | 36 |
| FrankenGemma4 | 2 | 10 | 4 | 9 | 10 | 35 |
| SuperGemma4 E4B Ablit | 2 | 8 | 7 | 8 | 10 | 35 |
| Google Gemma 4 E4B IT | 2 | 8 | 6 | 9 | 10 | 35 |
| Reasoning Donor | 2 | 8 | 4 | 0 | 10 | 24 |
Lead Branch Retention
| Model | Security Defense | Blunt Critique | Uncensored Creative | Abliteration Meta | Profane Rewrite + Note | Prompt Injection Defense | Total |
|---|---|---|---|---|---|---|---|
| SuperGemma4 E4B Ablit | 9 | 8 | 9 | 7 | 10 | 7 | 50 |
| FrankenGemma4 V1 | 10 | 7 | 9 | 7 | 10 | 6 | 49 |
Current Strengths
- Good local MLX/Q4 behavior on Apple Silicon
- Stronger tool discipline than the reasoning donor
- Better structured output after the repair passes
- Retains the ablation-style behavior better than the raw franken line while staying close to the donor parents
Current Caveats
- Some prompts still show thought leakage
- This is still a local benchmark story, not a broad held-out public leaderboard claim
- The retention check shows the model keeps most of the ablation behavior, but not quite as much as the dedicated ablated donor
Upstream Attribution
Built from:
- arsovskidev/Gemma-4-E4B-Claude-4.6-Opus-Reasoning-Distilled
- Jiunsong/supergemma4-e4b-abliterated
- google/gemma-4-E4B-it
Thanks
Big shout-out to Jiunsong/supergemma4-e4b-abliterated. This release inherits much of its coding and ablation-heavy behavior from that line.