---
license: apache-2.0
base_model: evalengine/unbound-e4b
base_model_relation: quantized
tags:
- gguf
- gemma4
- gemma
- gemma-4
- uncensored
- on-device
pipeline_tag: image-text-to-text
---

<p align="center">
  <img src="unbound-logo.svg" alt="Unbound" width="160" height="160">
</p>

# Unbound E4B GGUF — *because there is no boundary*

> **No guarantee — use at your own risk.** Reduced safety filtering; can
> produce harmful or false output. Provided as-is.

Desktop GGUF quants of [`evalengine/unbound-e4b`](https://huggingface.co/evalengine/unbound-e4b)
for Ollama, llama.cpp, and LM Studio. Built by
[Chromia](https://x.com/Chromia) and [Eval Engine](https://x.com/eval_engine).

> **Looking for the browser/wllama builds?** They live in their own repo:
> [`evalengine/unbound-e4b-wllama-gguf`](https://huggingface.co/evalengine/unbound-e4b-wllama-gguf).
> E4B's `per_layer_token_embd` tensor needs special quantization to fit
> wllama's 2 GB ArrayBuffer cap. Keeping the desktop and browser variants
> in separate repos also avoids HF GGUF UI aggregation collisions.

## Available quants

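Each quant ships as a sharded multi-part GGUF
(`unbound-e4b.<QUANT>-NNNNN-of-NNNNN.gguf`). llama.cpp and LM Studio
stitch the parts together when pointed at the first one, so it behaves
like a single file. `ollama pull hf.co/...` does not yet support sharded
GGUFs, so Ollama users should pull the single-file build from the Ollama
Registry instead.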
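
Because loading starts from the first part, a missing or partially downloaded shard is easy to overlook. The helper below is a minimal sketch, not part of this repo: `check_shards` is a hypothetical name, and it only assumes the `-NNNNN-of-NNNNN.gguf` naming scheme described above. It checks that the files exist, not that their contents are intact.

```bash
# Verify that every part of a sharded GGUF sits next to the first part.
# Assumption: names follow <prefix>-NNNNN-of-NNNNN.gguf, as described above.
check_shards() {
  local first="$1" stem total_str prefix total missing i part
  case "$first" in
    *-[0-9][0-9][0-9][0-9][0-9]-of-[0-9][0-9][0-9][0-9][0-9].gguf) ;;
    *) echo "not a sharded GGUF name: $first" >&2; return 2 ;;
  esac
  stem=${first%.gguf}          # drop the extension
  total_str=${stem##*-of-}     # zero-padded part count, e.g. 00004
  prefix=${stem%-?????-of-*}   # everything before -00001-of-00004
  total=$(printf '%s' "$total_str" | sed 's/^0*//')  # strip leading zeros
  total=${total:-0}
  missing=0
  i=1
  while [ "$i" -le "$total" ]; do
    part=$(printf '%s-%05d-of-%s.gguf' "$prefix" "$i" "$total_str")
    if [ ! -f "$part" ]; then
      echo "missing: $part" >&2
      missing=$((missing + 1))
    fi
    i=$((i + 1))
  done
  echo "$((total - missing))/$total shards present"
  [ "$missing" -eq 0 ]
}

# Usage: only start llama.cpp when all parts are on disk.
# check_shards unbound-e4b.Q4_K_M-00001-of-00004.gguf && \
#   ./llama-cli -m unbound-e4b.Q4_K_M-00001-of-00004.gguf -p "your prompt"
```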

The embedding tensor is kept at the llama.cpp default of Q6_K, and the
largest part is about 2.15 GB: fine for desktop, but it **won't load in
the browser**.

| Quant   | Parts | Total   | Notes |
|---------|-------|---------|-------|
| Q2_K    | 4     | 4.08 GB | Smallest, biggest quality drop |
| Q3_K_M  | 4     | 4.49 GB | Modest size win over Q4 (embedding precision dominates) |
| Q4_K_M  | 4     | 4.94 GB | **Recommended default** |
| Q6_K    | 5     | 5.75 GB | Higher fidelity |
| Q8_0    | 6     | 7.43 GB | Highest fidelity |

## Sampling

- **Creative / open-ended** → `temperature=1.0, top_p=0.95, top_k=64`.
- **Factual / brand questions** → drop `temperature` to ~0.3–0.5.
- llama.cpp: pass `--jinja`. Gemma 4 thinking mode is on by default; set
  `enable_thinking: false` in chat-template kwargs for shorter replies.
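
As a rough illustration, the two profiles above map onto llama.cpp's `--temp`, `--top-p`, and `--top-k` flags. `sampling_flags` is a hypothetical helper, `0.4` is just one value picked from the suggested 0.3 to 0.5 range, and keeping `top_p` and `top_k` unchanged for factual prompts is an assumption.

```bash
# Print llama.cpp sampling flags for one of the two profiles above.
# Assumption: the factual profile only lowers temperature and keeps
# top_p / top_k at the creative values.
sampling_flags() {
  case "$1" in
    creative) echo "--temp 1.0 --top-p 0.95 --top-k 64" ;;
    factual)  echo "--temp 0.4 --top-p 0.95 --top-k 64" ;;
    *) echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Usage (the unquoted expansion is intentional so the flags split into words):
# ./llama-cli -m unbound-e4b.Q4_K_M-00001-of-00004.gguf --jinja \
#   $(sampling_flags factual) -p "your prompt"
```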

For Ollama, pull from the **Ollama Registry**, because
`ollama pull hf.co/...` [doesn't yet support sharded GGUFs](https://github.com/ollama/ollama/issues/5245).
The registry version is a single-file Q4_K_M with a bundled Modelfile
(`temperature=0.6, top_p=0.95, top_k=64, repeat_penalty=1.05, num_ctx=8192`
and an identity-grounding system prompt).
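
For reference, the sampling parameters listed above correspond to Modelfile `PARAMETER` lines like the ones this sketch writes out. It is an approximation rather than the bundled Modelfile: `write_modelfile` is a hypothetical helper, the GGUF path in the usage comment is a placeholder for a single-file model, and the identity-grounding system prompt is omitted because it is not quoted here.

```bash
# Write a Modelfile that mirrors the sampling parameters listed above.
# Assumptions: $1 is the path to a single-file GGUF (placeholder), and
# the registry build's system prompt is intentionally left out.
write_modelfile() {
  local model_path="$1" out="${2:-Modelfile}"
  cat > "$out" <<EOF
FROM $model_path
PARAMETER temperature 0.6
PARAMETER top_p 0.95
PARAMETER top_k 64
PARAMETER repeat_penalty 1.05
PARAMETER num_ctx 8192
EOF
}

# Usage (untested against this model; the registry pull remains the
# supported path):
# write_modelfile ./unbound-e4b.Q4_K_M.gguf Modelfile
# ollama create unbound-e4b-local -f Modelfile
```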

## Run

```bash
# Ollama Registry (single-file Q4_K_M, identity-grounded Modelfile)
ollama pull evalengine/unbound-e4b
ollama run  evalengine/unbound-e4b
```

```bash
# llama.cpp — point at FIRST shard
./llama-cli -m unbound-e4b.Q4_K_M-00001-of-00004.gguf -p "your prompt"
```

## Vision / image input (optional)

`mmproj-unbound-e4b.gguf` enables image-to-text. Pair it with any LM quant
via `llama-mtmd-cli` (recent llama.cpp builds fold the older
`llama-gemma3-cli` into this binary):

```bash
./llama-mtmd-cli \
  -m   unbound-e4b.Q4_K_M-00001-of-00004.gguf \
  --mmproj mmproj-unbound-e4b.gguf \
  --image path/to/your/image.png \
  -p "What is in this image?"
```

> **Disclaimer.** The vision encoder is **Google's original weights,
> unchanged** — abliteration only touched the language model. The LM is
> uncensored, but the vision encoder may still suppress features for
> content classes Google's base was tuned against. We have **not
> benchmarked the visual axis**. Treat as preview.

For text-only use, omit `--mmproj`. `llama-cli`, Ollama, and LM Studio run
the language model without the mmproj file.

## Acknowledgements

Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) + HF
[TRL](https://github.com/huggingface/trl). Abliteration via
[heretic](https://github.com/p-e-w/heretic). Environment from
[autoresearch](https://github.com/karpathy/autoresearch). Compliance training data distilled from the [AEON](https://huggingface.co/AEON-7) uncensored teacher model.

## Links

- **Unbound** — [unbound.evalengine.ai](https://unbound.evalengine.ai)
- **Eval Engine** — [evalengine.ai](https://evalengine.ai) · [X / Twitter](https://x.com/eval_engine)
- **Token** — [CoinGecko](https://www.coingecko.com/en/coins/chromia-s-eval-by-virtuals) · [CoinMarketCap](https://coinmarketcap.com/currencies/eval-engine/)

## License

Apache-2.0, inherited from `google/gemma-4-E4B-it`. Full model card +
benchmarks at [`evalengine/unbound-e4b`](https://huggingface.co/evalengine/unbound-e4b).