johnsonchromia committed on
Commit 34a3381 · verified · 1 Parent(s): 762241c

README: full GGUF card (TBDs filled after export)

Files changed (1)
  1. README.md +79 -0
README.md CHANGED
@@ -1,16 +1,95 @@
1
  ---
2
  license: apache-2.0
3
  ---
4
 
5
  > **No guarantee: use at your own risk.** This model has reduced safety filtering
6
  > and can produce harmful, false, biased, or otherwise unsafe output. Provided
7
  > as-is, with no warranty of any kind. You are solely responsible for how you
8
  > use it and for complying with all applicable laws.
9
 
10
  Built by [Chromia](https://x.com/Chromia) and [Eval Engine](https://x.com/eval_engine).
11
 
12
  ## Acknowledgements
13
 
14
  - Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) + Hugging Face's [TRL](https://github.com/huggingface/trl).
15
  - Abliteration via [heretic](https://github.com/p-e-w/heretic).
16
  - Environment and training discipline ported from [autoresearch](https://github.com/karpathy/autoresearch).
1
  ---
2
  license: apache-2.0
3
+ base_model: evalengine/unbound-e4b
4
+ base_model_relation: quantized
5
+ tags:
6
+ - gguf
7
+ - gemma4
8
+ - gemma
9
+ - gemma-4
10
+ - uncensored
11
+ - on-device
12
+ pipeline_tag: text-generation
13
  ---
14
 
15
+ <p align="center">
16
+ <img src="unbound-logo.svg" alt="Unbound" width="160" height="160">
17
+ </p>
18
+
19
+ # Unbound E4B GGUF: *because there is no boundary*
20
+
21
  > **No guarantee: use at your own risk.** This model has reduced safety filtering
22
  > and can produce harmful, false, biased, or otherwise unsafe output. Provided
23
  > as-is, with no warranty of any kind. You are solely responsible for how you
24
  > use it and for complying with all applicable laws.
25
 
26
+ GGUF quantizations of [`evalengine/unbound-e4b`](https://huggingface.co/evalengine/unbound-e4b)
27
+ for on-device deployment via Ollama, llama.cpp, LM Studio, [wllama](https://github.com/ngxson/wllama)
28
+ (in-browser), and similar runtimes.
29
+
30
  Built by [Chromia](https://x.com/Chromia) and [Eval Engine](https://x.com/eval_engine).
31
 
32
+ ## Available quants
33
+
34
+ All quants ship as **split multi-part GGUFs** (`*-00001-of-0000N.gguf` ...) so
35
+ they work in browsers (wllama's 2 GB ArrayBuffer cap) and let desktop
36
+ runtimes parallel-download chunks. Ollama, llama.cpp, and LM Studio
37
+ auto-stitch when pointed at the first part, so the UX matches a single file.
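
As a rough illustration of the split naming described above, the sketch below builds the part names for a given prefix and part count, and checks a list of part sizes against the 2 GB browser cap. The helper names `splitPartNames` and `fitsBrowserCap`, the part count of 3, and reading "2 GB" as 2 GiB are all assumptions made for illustration; the real part counts are still marked TBD in the table.

```js
// Hypothetical helpers, not part of any runtime's API.

// Build part names following the `<prefix>-00001-of-0000N.gguf` pattern.
function splitPartNames(prefix, partCount) {
  if (!Number.isInteger(partCount) || partCount < 1) {
    throw new RangeError('partCount must be a positive integer');
  }
  const pad = (n) => String(n).padStart(5, '0');
  return Array.from({ length: partCount }, (_, i) =>
    `${prefix}-${pad(i + 1)}-of-${pad(partCount)}.gguf`);
}

// Assumption: the "2 GB" cap is treated as 2 GiB here.
const BROWSER_PART_LIMIT_BYTES = 2 * 1024 ** 3;

// True only when every part is at or below the per-buffer limit.
function fitsBrowserCap(partSizesBytes, limit = BROWSER_PART_LIMIT_BYTES) {
  return partSizesBytes.every((size) => size <= limit);
}

// Example with a made-up part count of 3:
const names = splitPartNames('unbound-e4b-Q4_K_M', 3);
console.log(names[0]); // unbound-e4b-Q4_K_M-00001-of-00003.gguf
```

A check like this is only useful once the real part sizes are known; it does not say anything about the sizes of the quants listed here.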
38
+
39
+ | Quant | Parts | Total | Largest part | wllama (browser) | Desktop (Ollama/llama.cpp/LM Studio) | Notes |
40
+ |---------|--------|------------|--------------|------------------|--------------------------------------|-------|
41
+ | Q4_K_M | [TBD] | ~5 GB | [TBD] | [TBD] | ✅ | Recommended on-device default; best size/quality |
42
+ | Q6_K | [TBD] | [TBD] | [TBD] | [TBD] | ✅ | Higher fidelity |
43
+ | Q8_0 | [TBD] | [TBD] | [TBD] | [TBD] | ✅ | Highest fidelity; large-tensor quants typically exceed the 2 GB browser ArrayBuffer limit, so desktop only |
44
+
45
+ ## Recommended sampling
46
+
47
+ - **Creative writing / open-ended / general chat** → Gemma defaults:
48
+ `temperature=1.0, top_p=0.95, top_k=64`.
49
+ - **Factual or brand/identity questions** → lower `temperature` to ~0.3–0.5
50
+ for sharper recall.
51
+ - **llama.cpp**: pass `--jinja` for proper chat-template handling.
52
+ - **Gemma 4 thinking mode** is on by default. To turn it off, set
53
+ `enable_thinking: false` in the chat-template kwargs.
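
The sampling recommendations above can be captured as presets. This is a minimal sketch, not the authors' tooling: the names `SAMPLING_PRESETS` and `llamaCliArgs` are hypothetical, the factual preset picks 0.4 as one value inside the suggested 0.3–0.5 range, and it assumes `top_p` and `top_k` stay at the Gemma defaults for factual use, which the text does not state. The flags `--temp`, `--top-p`, `--top-k`, and `--jinja` are standard llama.cpp options.

```js
// Hypothetical presets derived from the recommendations above.
const SAMPLING_PRESETS = {
  // Gemma defaults for creative writing and general chat.
  creative: { temperature: 1.0, top_p: 0.95, top_k: 64 },
  // Assumption: 0.4 chosen from the ~0.3–0.5 range; top_p/top_k unchanged.
  factual: { temperature: 0.4, top_p: 0.95, top_k: 64 },
};

// Build an argument list for llama-cli from a preset name.
function llamaCliArgs(modelPath, prompt, presetName) {
  const preset = SAMPLING_PRESETS[presetName];
  if (!preset) throw new Error(`unknown preset: ${presetName}`);
  return [
    '-m', modelPath,
    '--jinja', // proper chat-template handling
    '--temp', String(preset.temperature),
    '--top-p', String(preset.top_p),
    '--top-k', String(preset.top_k),
    '-p', prompt,
  ];
}

console.log(llamaCliArgs('model.gguf', 'your prompt', 'factual').join(' '));
```

The returned array can be passed to a process spawner as-is; keeping the arguments as a list avoids shell-quoting problems with prompts that contain spaces or quotes.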
54
+
55
+ ## Run with Ollama
56
+
57
+ ```bash
58
+ ollama pull hf.co/evalengine/unbound-e4b-GGUF
59
+ ollama run hf.co/evalengine/unbound-e4b-GGUF
60
+ ```
61
+
62
+ (Defaults to Q4_K_M. Ollama auto-stitches the split parts on load.)
63
+
64
+ ## Run with llama.cpp
65
+
66
+ ```bash
67
+ # point at the FIRST part; llama.cpp follows the chain automatically
68
+ ./llama-cli -m unbound-e4b-Q4_K_M-00001-of-0000N.gguf -p "your prompt"
69
+ ```
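
Because the runtimes expect the first part, a script that picks a file to load may want to confirm it has the right one. The sketch below parses the split suffix and checks the index. `parseSplitName` and `isFirstPart` are hypothetical helpers, and the five-digit pattern is assumed from the file names shown in this card.

```js
// Matches `<prefix>-NNNNN-of-NNNNN.gguf` (five digits each, assumed from the names above).
const SPLIT_RE = /^(.+)-(\d{5})-of-(\d{5})\.gguf$/;

// Return { prefix, index, count } for a split part, or null for any other name.
function parseSplitName(fileName) {
  const m = SPLIT_RE.exec(fileName);
  if (!m) return null;
  return { prefix: m[1], index: Number(m[2]), count: Number(m[3]) };
}

// True only for the first part of a split set.
function isFirstPart(fileName) {
  const parsed = parseSplitName(fileName);
  return parsed !== null && parsed.index === 1;
}

// Example with a made-up part count of 3:
console.log(isFirstPart('unbound-e4b-Q4_K_M-00001-of-00003.gguf')); // true
console.log(isFirstPart('unbound-e4b-Q4_K_M-00002-of-00003.gguf')); // false
```

Note that the placeholder name `...-00001-of-0000N.gguf` used in this card does not parse, since `N` stands in for a digit that is not yet known.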
70
+
71
+ ## Run in the browser (wllama)
72
+
73
+ ```js
74
+ import { Wllama } from '@wllama/wllama';
75
+ const wllama = new Wllama(/* … */);
76
+ await wllama.loadModelFromHF(
77
+ 'evalengine/unbound-e4b-GGUF',
78
+ 'unbound-e4b-Q4_K_M-00001-of-0000N.gguf' // wllama follows the chain
79
+ );
80
+ ```
81
+
82
+ ## About the base
83
+
84
+ See [`evalengine/unbound-e4b`](https://huggingface.co/evalengine/unbound-e4b)
85
+ for the full model card, benchmarks, intended use, and the merged HF weights.
86
+
87
  ## Acknowledgements
88
 
89
  - Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) + Hugging Face's [TRL](https://github.com/huggingface/trl).
90
  - Abliteration via [heretic](https://github.com/p-e-w/heretic).
91
  - Environment and training discipline ported from [autoresearch](https://github.com/karpathy/autoresearch).
92
+
93
+ ## License
94
+
95
+ Apache-2.0, inherited from `google/gemma-4-E4B-it`.