File size: 4,000 Bytes
84f5b2b
 
8d6144d
 
 
 
 
 
 
eafbc61
8d6144d
84f5b2b
 
8d6144d
 
 
 
 
 
5b0e20b
 
 
fd17044
5b0e20b
 
 
 
 
8d6144d
5b0e20b
 
 
8d6144d
 
 
 
 
c06dcd1
5be9849
 
5b0e20b
5be9849
 
 
5cbfb31
 
5be9849
c06dcd1
5cbfb31
 
 
 
ccb509a
5cbfb31
5be9849
 
5b0e20b
8d6144d
5b0e20b
8d6144d
5b0e20b
 
 
 
c06dcd1
5b0e20b
c06dcd1
8d6144d
5cbfb31
 
 
8d6144d
 
 
5b0e20b
8d6144d
 
 
 
24f1d6e
 
 
5b0e20b
 
 
 
8d6144d
e718bff
5be9849
c7447ba
 
 
 
 
 
8d6144d
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
---
license: apache-2.0
base_model: google/gemma-4-E4B-it
base_model_relation: finetune
tags:
- gemma4
- gemma
- gemma-4
- uncensored
pipeline_tag: image-text-to-text
library_name: transformers
---

<p align="center">
  <img src="unbound-logo.svg" alt="Unbound" width="160" height="160">
</p>

# Unbound E4B — *because there is no boundary*

> **No guarantee — use at your own risk.** This model has reduced safety
> filtering and can produce harmful, false, biased, or unsafe output.
> Provided as-is; you are responsible for compliance with applicable laws.

Uncensored finetune of `google/gemma-4-E4B-it` by the
[Chromia](https://x.com/Chromia) & [Eval Engine](https://x.com/eval_engine)
team — the larger sibling of [`evalengine/unbound-e2b`](https://huggingface.co/evalengine/unbound-e2b).
~2× the parameters of E2B, noticeably stronger on knowledge + reasoning, still
fits on a modern laptop.

This repo holds the merged HF weights. On-device GGUF builds (Ollama,
llama.cpp, LM Studio, [wllama](https://github.com/ngxson/wllama) in-browser)
are at [`evalengine/unbound-e4b-GGUF`](https://huggingface.co/evalengine/unbound-e4b-GGUF).

## Benchmarks (vs base `gemma-4-E4B-it`)

| Axis | Base | Unbound E4B | Δ |
|---|---|---|---|
| Refusal rate (AdvBench 520, LLM judge) | 98.08% | **2.69%** | **−95.4 pts** |
| Useful-compliance rate | 0.96% | **47.31%** | +46.4 pts |
| Hallucination (on harmful prompts) | 1.35% | 13.08% | +11.7 pts |
| Coherence (benign prompts) | 1.00 | 1.00 | 0 |
| TruthfulQA mc2 (`--limit 100`) | 0.439 | 0.486 | +4.7 pt |
| MMLU (`--limit 100`, 61 subtasks avg) | ~0.425 | 0.392 | −3.3 pt |
| GSM8K (flexible-extract, `--limit 100`) | 0.74 (limit 200) | 0.58 | regression mostly limit-noise |
| GPQA-Diamond (`--limit 200`) | 25.25% | 25.76% | +0.5 pt (within stderr) |
| BBH macro (24 tasks, `--limit 200`) | 54.26% | 53.45% | −0.8 pt (within stderr) |
| KL divergence vs base | 0 | 3.25 | (SFT-expected) |

GPQA-Diamond and BBH macro — the lm-eval-harness "release" suite at
`--limit 200` — both land **within stderr of base**: E4B's larger capacity
absorbs the SFT shift cleanly. The −3.3 pt MMLU dip on the limit-100 fast
pass is at the edge of that suite's resolution and is not corroborated by
the release pass.

**vs Unbound E2B (current ship):** +8 pp useful-compliance, −3 pp
hallucination, **~5× the GSM8K math score**, cleaner KL (3.25 vs 3.76).
Refusal rate is essentially the same (~2.7%).

## Sampling

- **Creative / open-ended** → Gemma defaults: `temperature=1.0, top_p=0.95, top_k=64`.
- **Factual / brand questions** → drop `temperature` to ~0.3–0.5.
- llama.cpp: pass `--jinja`. Gemma 4 thinking mode is on by default — set
  `enable_thinking: false` in chat-template kwargs for shorter replies.

## Use

```bash
# on-device (Ollama Registry — single-file Q4_K_M, identity-grounded Modelfile)
ollama pull evalengine/unbound-e4b
ollama run  evalengine/unbound-e4b
```

```python
# transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("evalengine/unbound-e4b")
tok   = AutoTokenizer.from_pretrained("evalengine/unbound-e4b")
```

## Acknowledgements

Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) + HF
[TRL](https://github.com/huggingface/trl). Abliteration via
[heretic](https://github.com/p-e-w/heretic). Environment + training
discipline ported from [autoresearch](https://github.com/karpathy/autoresearch).

Compliance training data distilled from the [AEON](https://huggingface.co/AEON-7) uncensored teacher model.

## Links

- **Unbound** — [unbound.evalengine.ai](https://unbound.evalengine.ai)
- **Eval Engine** — [evalengine.ai](https://evalengine.ai) · [X / Twitter](https://x.com/eval_engine)
- **Token** — [CoinGecko](https://www.coingecko.com/en/coins/chromia-s-eval-by-virtuals) · [CoinMarketCap](https://coinmarketcap.com/currencies/eval-engine/)

## License

Apache-2.0, inherited from `google/gemma-4-E4B-it`.