---
language:
  - en
  - he
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
base_model: unsloth/gemma-4-E4B-it
datasets:
  - BrainboxAI/code-training-il
  - nvidia/OpenCodeInstruct
  - bleugreen/typescript-instruct
tags:
  - code
  - python
  - typescript
  - coding-assistant
  - safetensors
  - gemma4
  - unsloth
  - qlora
  - on-device
  - private-first
pretty_name: Code-IL E4B (Safetensors)
---

# Code-IL E4B — Safetensors

**Safetensors (16-bit) variant of [`code-il-E4B`](https://huggingface.co/BrainboxAI/code-il-E4B), for use with Hugging Face Transformers, further fine-tuning, or conversion to other runtimes.**

[![GGUF](https://img.shields.io/badge/GGUF_variant-code--il--E4B-yellow)](https://huggingface.co/BrainboxAI/code-il-E4B)
[![License](https://img.shields.io/badge/License-Apache_2.0-lightgrey)](https://www.apache.org/licenses/LICENSE-2.0)

---

## What this is

This repository holds the **safetensors** (16-bit) weights of the BrainboxAI `code-il-E4B` on-device coding assistant.

Use this variant if you want to:
- Load the model with Hugging Face `transformers`
- Continue fine-tuning on your private codebase
- Convert to ONNX or another deployment format
- Integrate into a framework that does not support GGUF

If you want to **run the model for inference** on developer hardware, use the [GGUF variant](https://huggingface.co/BrainboxAI/code-il-E4B) with Ollama or llama.cpp instead.

## Full documentation

Training details, dataset composition, evaluation, limitations, and citation are all in the main model card:

**https://huggingface.co/BrainboxAI/code-il-E4B**

## Quick usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("BrainboxAI/code-il-E4B-safetensors")
model = AutoModelForCausalLM.from_pretrained(
    "BrainboxAI/code-il-E4B-safetensors",
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Implement binary search in TypeScript with full edge-case handling."},
]
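# return_dict=True yields input_ids plus attention_mask; move both to the model's device.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)
# do_sample=True is required for temperature and top_p to take effect.
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.2, top_p=0.95)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))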
```

## Continued fine-tuning

Use this variant to further fine-tune the model on your company's internal codebase. Starting from `code-il-E4B-safetensors` preserves the coding behavior the model has already learned and lets you layer domain-specific patterns on top.
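
A common first step before fine-tuning is turning internal code into chat-style training records. The following is a minimal sketch, assuming a JSONL file where each line holds a `messages` list in the same role/content layout used in the quick-usage example (a layout that trainers such as TRL's `SFTTrainer` can consume). The function names, output path, and sample pair are hypothetical and are not taken from the BrainboxAI training pipeline.

```python
import json
import tempfile
from pathlib import Path


def to_chat_record(instruction: str, solution: str) -> dict:
    """Wrap one (instruction, solution) pair as a chat-style training record."""
    return {
        "messages": [
            {"role": "user", "content": instruction.strip()},
            {"role": "assistant", "content": solution.strip()},
        ]
    }


def write_jsonl(pairs, path) -> int:
    """Write one JSON record per line; return the number of records written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as f:
        for instruction, solution in pairs:
            # Skip pairs where either side is empty after trimming whitespace.
            if not instruction.strip() or not solution.strip():
                continue
            # ensure_ascii=False keeps Hebrew comments and strings human-readable.
            record = to_chat_record(instruction, solution)
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


# Hypothetical sample data: replace with pairs mined from your own codebase.
pairs = [
    (
        "Write a function that returns the larger of two numbers.",
        "def larger(a, b):\n    return a if a >= b else b\n",
    ),
    ("", "this pair is skipped because the instruction is empty"),
]

# Written to the system temp directory here; point this at your dataset folder.
out_path = Path(tempfile.gettempdir()) / "code_il_train.jsonl"
written = write_jsonl(pairs, out_path)
print(f"wrote {written} record(s) to {out_path}")
```

The resulting file can then be loaded with your fine-tuning tooling of choice. How the examples are mined, deduplicated, and split is left open here, since that depends on the codebase.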

## License

Apache 2.0.

## Author

Built by [**Netanel Elyasi**](https://huggingface.co/BrainboxAI), founder of [BrainboxAI](https://brainboxai.io).

For custom coding-model fine-tuning on private corpora, contact: **netanele@brainboxai.io**.