metadata
license: apache-2.0
base_model: evalengine/unbound-e4b
base_model_relation: quantized
tags:
  - gguf
  - gemma4
  - gemma
  - gemma-4
  - uncensored
  - on-device
  - wllama
  - browser
pipeline_tag: image-text-to-text

Unbound

Unbound E4B (wllama / browser builds): because there is no boundary

No guarantee: use at your own risk. Reduced safety filtering; can produce harmful or false output. Provided as-is.

Browser-safe GGUF quants of evalengine/unbound-e4b for wllama. Built by Chromia and Eval Engine.

Desktop / Ollama / llama.cpp / LM Studio users: use evalengine/unbound-e4b-GGUF instead. The desktop builds are faster and avoid the embedding-precision compromise these browser-safe builds make.

Why a separate repo?

E4B's per_layer_token_embd is a 2.82-billion-value tensor. At llama.cpp's default Q6_K precision it lands at 2.2 GB, over wllama's 2 GB ArrayBuffer cap. These variants force embeddings to q5_K (1.85 GB) so the largest part fits in the browser. Layer weights are unchanged from the matching desktop quant.
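The arithmetic behind those figures can be sketched roughly as follows. This is an illustration, not the build script: the bits-per-weight values (6.5625 for Q6_K, 5.5 for Q5_K) are the nominal rates of llama.cpp's block formats, the 2 GB cap is assumed to mean 2 GiB, and `tensorBytes` and `fitsInArrayBuffer` are hypothetical helper names. Because 2.82 billion is itself a rounded count, the results land near the sizes quoted above rather than exactly on them.

```javascript
// Rough size estimate for a single quantized tensor (illustrative sketch).
// Assumed nominal bits per weight for llama.cpp k-quant block formats.
const BITS_PER_WEIGHT = { Q6_K: 6.5625, Q5_K: 5.5 };

// Assumption: the "2 GB" ArrayBuffer cap is 2 GiB.
const ARRAY_BUFFER_CAP_BYTES = 2 * 1024 ** 3;

// Hypothetical helper: bytes needed to store `numValues` weights at `quant`.
function tensorBytes(numValues, quant) {
  return Math.ceil((numValues * BITS_PER_WEIGHT[quant]) / 8);
}

// Hypothetical helper: does a buffer of this size stay under the cap?
function fitsInArrayBuffer(bytes) {
  return bytes <= ARRAY_BUFFER_CAP_BYTES;
}

const PER_LAYER_EMBD_VALUES = 2.82e9; // rounded count from the text above

for (const quant of ['Q6_K', 'Q5_K']) {
  const bytes = tensorBytes(PER_LAYER_EMBD_VALUES, quant);
  const gib = (bytes / 1024 ** 3).toFixed(2);
  console.log(`${quant}: ~${gib} GiB, fits under cap: ${fitsInArrayBuffer(bytes)}`);
}
```

Under these assumptions the Q6_K estimate exceeds the cap and the Q5_K estimate stays below it, which matches the reasoning for forcing the embeddings to q5_K.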

A dedicated repo with the unbound-e4b-wllama model prefix prevents HF's GGUF UI from aggregating these with the same-quant desktop files (unbound-e4b.Q4_K_M-... vs unbound-e4b-wllama.Q4_K_M-...).

Available quants

Each quant is shipped as a sharded multi-part GGUF (unbound-e4b-wllama.<QUANT>-NNNNN-of-NNNNN.gguf). Point wllama at the first part and it loads and stitches the remaining parts automatically.
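If you need the full list of part names, for example to pre-cache them, the naming pattern can be expanded with a small helper. This is a minimal sketch: `shardFileNames` is a hypothetical name, and it assumes NNNNN is a 1-based, zero-padded five-digit index, as in the -00001-of-00004 file name used in the Run section.

```javascript
// Expand the sharded-GGUF naming pattern into concrete file names
// (illustrative sketch; shardFileNames is a hypothetical helper).
function shardFileNames(quant, totalParts, prefix = 'unbound-e4b-wllama') {
  if (!Number.isInteger(totalParts) || totalParts < 1) {
    throw new RangeError('totalParts must be a positive integer');
  }
  // Assumption: indices are 1-based and zero-padded to five digits.
  const pad = (n) => String(n).padStart(5, '0');
  return Array.from(
    { length: totalParts },
    (_, i) => `${prefix}.${quant}-${pad(i + 1)}-of-${pad(totalParts)}.gguf`
  );
}

console.log(shardFileNames('Q4_K_M', 4));
```

Only the first name in the returned list needs to be passed to wllama; the rest are useful for tooling around it.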

| Variant | Parts | Total | Notes |
|---|---|---|---|
| Q4_K_M | 4 | 4.51 GB | Recommended: layers @ Q4_K_M, embed @ q5_K |
| Q2_K | 4 | 3.69 GB | Smallest browser-loadable: layers @ Q2_K, embed @ q5_K |

Run

// wllama (browser)
import { Wllama } from '@wllama/wllama';
const wllama = new Wllama(/* … */);
await wllama.loadModelFromHF(
  'evalengine/unbound-e4b-wllama-gguf',
  'unbound-e4b-wllama.Q4_K_M-00001-of-00004.gguf'
);

Sampling

  • Creative / open-ended → temperature=1.0, top_p=0.95, top_k=64.
  • Factual / brand questions → drop temperature to ~0.3–0.5.
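One way to keep these two settings in code is a pair of presets. This is a sketch under stated assumptions: `samplingPreset` is a hypothetical helper, the key names mirror the bullets above rather than any particular runtime's option names, the factual preset is assumed to keep top_p and top_k unchanged, and 0.4 is simply the midpoint of the suggested range.

```javascript
// Sampling presets taken from the bullets above (illustrative sketch).
// Key names mirror the text; rename them to whatever your runtime expects.
const CREATIVE = Object.freeze({ temperature: 1.0, top_p: 0.95, top_k: 64 });

// Hypothetical helper: returns a fresh preset for 'creative' or 'factual'.
// Assumption: factual prompts only lower temperature and keep top_p/top_k.
function samplingPreset(kind, factualTemperature = 0.4) {
  if (kind === 'creative') return { ...CREATIVE };
  if (kind === 'factual') return { ...CREATIVE, temperature: factualTemperature };
  throw new Error(`unknown preset: ${kind}`);
}

console.log(samplingPreset('creative'));
console.log(samplingPreset('factual'));
```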

Vision / image input (optional)

mmproj-unbound-e4b.gguf (vision projector, ~942 MB) is also in this repo so browser users don't bounce between repos. Pair with any quant via your wllama-compatible vision pipeline.

Disclaimer. The vision encoder is Google's original weights, unchanged; abliteration only touched the language model. The LM is uncensored, but the vision encoder may still suppress features for content classes Google's base was tuned against. We have not benchmarked the visual axis. Treat as preview.

Acknowledgements

Fine-tuned with Unsloth + HF TRL. Abliteration via heretic. Environment from autoresearch. Compliance training data distilled from the AEON uncensored teacher model.

License

Apache-2.0, inherited from google/gemma-4-E4B-it. Full model card + benchmarks at evalengine/unbound-e4b.