johnsonchromia commited on
Commit
8d6144d
·
verified ·
1 Parent(s): 24f1d6e

README: full model card (TBDs filled after E4B bench)

Browse files
Files changed (1) hide show
  1. README.md +75 -1
README.md CHANGED
@@ -1,16 +1,90 @@
1
  ---
2
  license: apache-2.0
 
 
 
 
 
 
 
 
 
3
  ---
4
 
 
 
 
 
 
 
5
  > **No guarantee — use at your own risk.** This model has reduced safety filtering
6
  > and can produce harmful, false, biased, or otherwise unsafe output. Provided
7
  > as-is, with no warranty of any kind. You are solely responsible for how you
8
  > use it and for complying with all applicable laws.
9
 
10
- Built by [Chromia](https://x.com/Chromia) and [Eval Engine](https://x.com/eval_engine).
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
11
 
12
  ## Acknowledgements
13
 
14
  - Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) + Huggingface's [TRL](https://github.com/huggingface/trl).
15
  - Abliteration via [heretic](https://github.com/p-e-w/heretic).
16
  - Environment and training discipline ported from [autoresearch](https://github.com/karpathy/autoresearch).
 
 
 
 
 
1
  ---
2
  license: apache-2.0
3
+ base_model: google/gemma-4-E4B-it
4
+ base_model_relation: finetune
5
+ tags:
6
+ - gemma4
7
+ - gemma
8
+ - gemma-4
9
+ - uncensored
10
+ pipeline_tag: text-generation
11
+ library_name: transformers
12
  ---
13
 
14
+ <p align="center">
15
+ <img src="unbound-logo.svg" alt="Unbound" width="160" height="160">
16
+ </p>
17
+
18
+ # Unbound E4B — *because there is no boundary*
19
+
20
  > **No guarantee — use at your own risk.** This model has reduced safety filtering
21
  > and can produce harmful, false, biased, or otherwise unsafe output. Provided
22
  > as-is, with no warranty of any kind. You are solely responsible for how you
23
  > use it and for complying with all applicable laws.
24
 
25
+ Uncensored variant of `google/gemma-4-E4B-it` from the [**Chromia**](https://x.com/Chromia) & [**Eval Engine**](https://x.com/eval_engine)
26
+ team — the *larger* sibling of [`evalengine/unbound-e2b`](https://huggingface.co/evalengine/unbound-e2b),
27
+ more capable on knowledge-heavy and reasoning tasks while still fitting on a
28
+ modern laptop. This repo holds the merged HF weights; for the **on-device GGUF builds**
29
+ (Ollama / llama.cpp / LM Studio / [wllama](https://github.com/ngxson/wllama) in-browser), see
30
+ [`evalengine/unbound-e4b-GGUF`](https://huggingface.co/evalengine/unbound-e4b-GGUF).
31
+
32
+ ## What this is for
33
+
34
+ Same use cases as Unbound E2B — offline / security research / unrestricted
35
+ coding / private workflows — but trading ~2× the parameters (and ~2× the on-disk
36
+ size) for stronger capability. Pick E4B when you have the RAM / VRAM headroom
37
+ and want a noticeably smarter on-device model; pick E2B when you need it to
38
+ fit on a phone or a constrained edge device.
39
+
40
+ Base capability is preserved close to `gemma-4-E4B-it`.
41
+
42
+ ## Benchmarks (vs base `gemma-4-E4B-it`)
43
+
44
+ | Axis | Base | Unbound E4B | Δ |
45
+ |---|---|---|---|
46
+ | Refusal rate (AdvBench 520) | [TBD] | **[TBD]** | **[TBD]** |
47
+ | Useful-compliance rate | [TBD] | **[TBD]** | [TBD] |
48
+ | Hallucination rate | [TBD] | [TBD] | [TBD] |
49
+ | Coherence on benign prompts | 1.0 | [TBD] | [TBD] |
50
+ | TruthfulQA mc2 (lm-eval, `--limit 100`) | [TBD] | [TBD] | [TBD] |
51
+ | MMLU (lm-eval, `--limit 100`) | [TBD] | [TBD] | [TBD] |
52
+ | GSM8K (lm-eval, `--limit 100`) | [TBD] | [TBD] | [TBD] |
53
+ | KL divergence vs base | 0 | [TBD] | (SFT-expected) |
54
+
55
+ ## Recommended sampling
56
+
57
+ Same guidance as Unbound E2B:
58
+
59
+ - **Creative writing / open-ended / general chat** → Gemma defaults:
60
+ `temperature=1.0, top_p=0.95, top_k=64`.
61
+ - **Factual or brand/identity questions** → lower `temperature` to ~0.3–0.5
62
+ for sharper recall.
63
+ - **llama.cpp**: pass `--jinja` for proper chat-template handling.
64
+ - **Gemma 4 thinking mode** is on by default. Set `enable_thinking: false`
65
+ in the chat-template kwargs for shorter/faster replies.
66
+
67
+ ## Run on-device (GGUF)
68
+
69
+ ```bash
70
+ ollama pull hf.co/evalengine/unbound-e4b-GGUF
71
+ ollama run hf.co/evalengine/unbound-e4b-GGUF
72
+ ```
73
+
74
+ ## Run in transformers
75
+
76
+ ```python
77
+ from transformers import AutoModelForCausalLM, AutoTokenizer
78
+ model = AutoModelForCausalLM.from_pretrained("evalengine/unbound-e4b")
79
+ tok = AutoTokenizer.from_pretrained("evalengine/unbound-e4b")
80
+ ```
81
 
82
  ## Acknowledgements
83
 
84
  - Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) + Huggingface's [TRL](https://github.com/huggingface/trl).
85
  - Abliteration via [heretic](https://github.com/p-e-w/heretic).
86
  - Environment and training discipline ported from [autoresearch](https://github.com/karpathy/autoresearch).
87
+
88
+ ## License
89
+
90
+ Apache-2.0, inherited from `google/gemma-4-E4B-it`.