---
license: apache-2.0
base_model: google/gemma-4-E4B-it
base_model_relation: finetune
tags:
- gemma4
- gemma
- gemma-4
- uncensored
pipeline_tag: image-text-to-text
library_name: transformers
---

<p align="center">
  <img src="unbound-logo.svg" alt="Unbound" width="160" height="160">
</p>

# Unbound E4B — *because there is no boundary*

> **No guarantee — use at your own risk.** This model has reduced safety
> filtering and can produce harmful, false, biased, or unsafe output.
> Provided as-is; you are responsible for compliance with applicable laws.

Uncensored finetune of `google/gemma-4-E4B-it` by the
[Chromia](https://x.com/Chromia) & [Eval Engine](https://x.com/eval_engine)
team, and the larger sibling of [`evalengine/unbound-e2b`](https://huggingface.co/evalengine/unbound-e2b).
It has roughly 2× the parameters of E2B, is stronger on knowledge and reasoning
(about 5× E2B's GSM8K score; see Benchmarks), and still fits on a modern laptop.

This repo holds the merged HF weights. On-device GGUF builds (Ollama,
llama.cpp, LM Studio, [wllama](https://github.com/ngxson/wllama) in-browser)
are at [`evalengine/unbound-e4b-GGUF`](https://huggingface.co/evalengine/unbound-e4b-GGUF).

## Benchmarks (vs base `gemma-4-E4B-it`)

| Metric | Base | Unbound E4B | Δ |
|---|---|---|---|
| Refusal rate (AdvBench 520, LLM judge) | 98.08% | **2.69%** | **−95.4 pts** |
| Useful-compliance rate | 0.96% | **47.31%** | +46.4 pts |
| Hallucination (on harmful prompts) | 1.35% | 13.08% | +11.7 pts |
| Coherence (benign prompts) | 1.00 | 1.00 | 0 |
| TruthfulQA mc2 (`--limit 100`) | 0.439 | 0.486 | +4.7 pts |
| MMLU (`--limit 100`, 61 subtasks avg) | ~0.425 | 0.392 | −3.3 pts |
| GSM8K (flexible-extract, `--limit 100`) | 0.74 (`--limit 200`) | 0.58 | −16 pts (limits differ; regression mostly limit-noise) |
| GPQA-Diamond (`--limit 200`) | 25.25% | 25.76% | +0.5 pts (within stderr) |
| BBH macro (24 tasks, `--limit 200`) | 54.26% | 53.45% | −0.8 pts (within stderr) |
| KL divergence vs base | 0 | 3.25 | (SFT-expected) |

GPQA-Diamond and BBH macro — the lm-eval-harness "release" suite at
`--limit 200` — both land **within stderr of base**: E4B's larger capacity
absorbs the SFT shift cleanly. The −3.3 pt MMLU dip on the limit-100 fast
pass is at the edge of that suite's resolution and is not corroborated by
the release pass.
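
As a rough sanity check on the "within stderr" reading for GPQA-Diamond, the sketch below computes a simple binomial standard error. It assumes all 198 GPQA-Diamond questions were scored (so `--limit 200` did not truncate the set) and treats each question as an independent trial. Neither assumption is stated in the table, and lm-eval-harness may compute its own stderr differently.

```python
import math


def binomial_stderr(accuracy: float, n_items: int) -> float:
    """Standard error of an accuracy estimate over n_items independent items."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_items)


# Assumption: GPQA-Diamond has 198 questions, so --limit 200 covers all of them.
N_ITEMS = 198
base_acc = 0.2525     # base model, from the table
unbound_acc = 0.2576  # Unbound E4B, from the table

stderr = binomial_stderr(base_acc, N_ITEMS)
delta = unbound_acc - base_acc

print(f"stderr ~ {stderr * 100:.1f} pts, delta = {delta * 100:+.1f} pts")
print("within one stderr:", abs(delta) < stderr)
```

Under these assumptions the standard error is about 3 points, so the 0.5-point gap corresponds to roughly one question out of 198.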

**vs Unbound E2B (currently shipped):** +8 pts useful-compliance, −3 pts
hallucination, **~5× the GSM8K math score**, and a lower KL divergence (3.25 vs 3.76).
Refusal rate is essentially the same (~2.7%).

## Sampling

- **Creative / open-ended** → Gemma defaults: `temperature=1.0, top_p=0.95, top_k=64`.
- **Factual / brand questions** → drop `temperature` to ~0.3–0.5.
- llama.cpp: pass `--jinja`. Gemma 4 thinking mode is on by default — set
  `enable_thinking: false` in chat-template kwargs for shorter replies.
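
The settings above can be captured as small presets. This is a minimal sketch: the helper names (`sampling_kwargs`, `chat_template_kwargs`) are hypothetical, the factual temperature of 0.4 is just one pick from the suggested 0.3–0.5 range, and keeping `top_p`/`top_k` at the Gemma defaults for factual prompts is an assumption.

```python
from typing import Any, Dict

# Presets taken from the list above. The factual temperature (0.4) is an
# arbitrary pick from the suggested 0.3-0.5 range; top_p/top_k are assumed
# to stay at the Gemma defaults.
SAMPLING_PRESETS: Dict[str, Dict[str, Any]] = {
    "creative": {"temperature": 1.0, "top_p": 0.95, "top_k": 64},
    "factual": {"temperature": 0.4, "top_p": 0.95, "top_k": 64},
}


def sampling_kwargs(mode: str = "creative") -> Dict[str, Any]:
    """Return generation keyword arguments for the given mode (hypothetical helper)."""
    if mode not in SAMPLING_PRESETS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(SAMPLING_PRESETS)}")
    # do_sample=True is needed for temperature/top_p/top_k to take effect
    # in transformers-style generate() calls.
    return {"do_sample": True, **SAMPLING_PRESETS[mode]}


def chat_template_kwargs(thinking: bool = True) -> Dict[str, Any]:
    """Chat-template kwargs; thinking=False asks for shorter replies."""
    return {"enable_thinking": thinking}
```

For example, `sampling_kwargs("factual")` could be unpacked into a `generate(...)` call, and `chat_template_kwargs(thinking=False)` into the chat-template call of whichever runtime is in use.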

## Use

```bash
# on-device (Ollama Registry — single-file Q4_K_M, identity-grounded Modelfile)
ollama pull evalengine/unbound-e4b
ollama run  evalengine/unbound-e4b
```

```python
# transformers: load the merged HF weights and the matching tokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("evalengine/unbound-e4b")
tok = AutoTokenizer.from_pretrained("evalengine/unbound-e4b")
```
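
The loading snippet above stops at `from_pretrained`. A minimal text-only generation sketch might look like the following. The helper names (`build_messages`, `generate_reply`) are hypothetical, the sampling values are the Gemma defaults from the Sampling section, and it assumes the tokenizer ships a chat template that accepts plain-string message content.

```python
from typing import Dict, List

MODEL_ID = "evalengine/unbound-e4b"


def build_messages(prompt: str) -> List[Dict[str, str]]:
    """Wrap a plain-text prompt in a single-turn chat message list."""
    return [{"role": "user", "content": prompt}]


def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and sample one reply (downloads weights on first call)."""
    # Imported lazily so build_messages can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Render the chat template and tokenize in one step.
    inputs = tok.apply_chat_template(
        build_messages(prompt),
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=1.0,
        top_p=0.95,
        top_k=64,
    )
    # Decode only the newly generated tokens, not the prompt.
    prompt_len = inputs["input_ids"].shape[-1]
    return tok.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate_reply("Summarize the Apache-2.0 license in two sentences.")` would load the weights and return the decoded reply. Reloading the model on every call is wasteful, so a real script would load it once and reuse it.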

## Acknowledgements

Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) + HF
[TRL](https://github.com/huggingface/trl). Abliteration via
[heretic](https://github.com/p-e-w/heretic). Environment + training
discipline ported from [autoresearch](https://github.com/karpathy/autoresearch).

Compliance training data distilled from the [AEON](https://huggingface.co/AEON-7) uncensored teacher model.

## Links

- **Unbound** — [unbound.evalengine.ai](https://unbound.evalengine.ai)
- **Eval Engine** — [evalengine.ai](https://evalengine.ai) · [X / Twitter](https://x.com/eval_engine)
- **Token** — [CoinGecko](https://www.coingecko.com/en/coins/chromia-s-eval-by-virtuals) · [CoinMarketCap](https://coinmarketcap.com/currencies/eval-engine/)

## License

Apache-2.0, inherited from `google/gemma-4-E4B-it`.