---
title: BrainboxAI
colorFrom: green
colorTo: blue
sdk: static
pinned: false
---

# BrainboxAI

> Small, private, specialized AI models built for companies that cannot afford to send their data elsewhere.

Most AI models today are built for everyone. That makes them good at everything, and great at nothing. We build the opposite: small models, trained on your domain, that run entirely on your own hardware.

## What we ship

Fine-tuned foundation models. We take open-source models (Gemma, Llama, Qwen) and specialize them for narrow, high-value domains where precision matters more than breadth.

Open datasets. The training data we use is published openly under Apache 2.0, so researchers and builders can reproduce, extend, and improve on our work.

Applied research on Hebrew. Modern Hebrew is under-served by most LLMs. We work to change that, through better tokenization, curated corpora, and evaluation benchmarks tailored to the language.

## Flagship releases

| Name | Task | Base | Size | Training Data |
|------|------|------|------|---------------|
| [law-il-E2B](https://huggingface.co/BrainboxAI/law-il-E2B) | Israeli legal reasoning | Gemma-4 E2B | 2B | 17,613 examples |
| [code-il-E4B](https://huggingface.co/BrainboxAI/code-il-E4B) | Private coding assistant | Gemma-4 E4B | 4B | 40,000 examples |
| [cyber-analyst-4B](https://huggingface.co/BrainboxAI/cyber-analyst-4B) | Bilingual SOC analyst | Gemma-4 E4B | 4B | 1.16M + 107K delta |

## Open datasets

- [legal-training-il](https://huggingface.co/datasets/BrainboxAI/legal-training-il) - 17,613 Israeli legal examples
- [code-training-il](https://huggingface.co/datasets/BrainboxAI/code-training-il) - 40,000 test-filtered Python and TypeScript examples
- [brainboxai_cyber_train](https://huggingface.co/datasets/BrainboxAI/brainboxai_cyber_train) - 1.16M cybersecurity examples
- [brainboxai_cyber_delta](https://huggingface.co/datasets/BrainboxAI/brainboxai_cyber_delta) - 107K correction delta

## Curated collections

- [Hebrew Legal AI](https://huggingface.co/collections/BrainboxAI/hebrew-legal-ai-69e88c7b0e3761734a5bdf71)
- [Private Coding Assistants](https://huggingface.co/collections/BrainboxAI/private-coding-assistants-69e88c7c0ab05dab658d6c13)
- [Bilingual Cybersecurity AI](https://huggingface.co/collections/BrainboxAI/bilingual-cybersecurity-ai-69e88f5cd9ece0a46949b485)
- [Complete Releases](https://huggingface.co/collections/BrainboxAI/brainboxai-complete-releases-69e88c613f991bd08bda5c92)

## Why small?

Small models run on a single GPU, sometimes on a laptop. Because they run on your own hardware, your data is never sent to a third party. They're also cheaper to serve, faster to iterate on, and easier to audit.

They're not competing with GPT-5 or Claude. They're doing something different: one job, done well, in private.
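
As a rough illustration of why models of this size can fit on a single GPU or a laptop, the sketch below estimates the memory needed for the weights alone. It uses the nominal sizes from the releases table (2B and 4B); the precisions (16-, 8-, and 4-bit) are assumptions for illustration, not a statement of how these models are distributed, and the helper `weight_memory_gib` is hypothetical. Real usage will be higher, since the estimate ignores the KV cache, activations, and runtime overhead, and actual parameter counts may differ from the nominal size.

```python
def weight_memory_gib(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GiB.

    Assumes every parameter is stored at the same precision and ignores
    KV cache, activations, and framework overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Nominal sizes taken from the "Flagship releases" table.
    releases = [("law-il-E2B", 2), ("code-il-E4B", 4), ("cyber-analyst-4B", 4)]
    # Precisions are illustrative assumptions, not published configurations.
    for name, size_b in releases:
        for bits in (16, 8, 4):
            gib = weight_memory_gib(size_b, bits)
            print(f"{name}: ~{gib:.1f} GiB of weights at {bits}-bit")
```

Under these assumptions, a nominal 2B model needs under 4 GiB for weights at 16-bit precision, and a nominal 4B model under 8 GiB.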

---

Founded 2025 - Rehovot, Israel

[brainboxai.io](https://brainboxai.io) - netanele@brainboxai.io