---
title: BrainboxAI
colorFrom: green
colorTo: blue
sdk: static
pinned: false
---
# BrainboxAI

> Small, private, specialized AI models built for companies that cannot afford to send their data elsewhere.

Most AI models today are built for everyone. That makes them good at everything, and great at nothing. We build the opposite: small models, trained on your domain, that run entirely on your own hardware.
## What we ship

**Fine-tuned foundation models.** We take open-source models (Gemma, Llama, Qwen) and specialize them for narrow, high-value domains where precision matters more than breadth.

**Open datasets.** Our training data is published openly under Apache 2.0, so researchers and builders can reproduce, extend, and improve on our work.

**Applied research on Hebrew.** Modern Hebrew is under-served by most LLMs. We work to change that through better tokenization, curated corpora, and evaluation benchmarks tailored to the language.
## Flagship releases

| Name | Task | Base | Size | Training data |
|------|------|------|------|---------------|
| [law-il-E2B](https://huggingface.co/BrainboxAI/law-il-E2B) | Israeli legal reasoning | Gemma-4 E2B | 2B | 17,613 examples |
| [code-il-E4B](https://huggingface.co/BrainboxAI/code-il-E4B) | Private coding assistant | Gemma-4 E4B | 4B | 40,000 examples |
| [cyber-analyst-4B](https://huggingface.co/BrainboxAI/cyber-analyst-4B) | Bilingual SOC analyst | Gemma-4 E4B | 4B | 1.16M examples + 107K correction delta |
## Open datasets

- [legal-training-il](https://huggingface.co/datasets/BrainboxAI/legal-training-il) - 17,613 Israeli legal examples
- [code-training-il](https://huggingface.co/datasets/BrainboxAI/code-training-il) - 40,000 Python and TypeScript examples, test-filtered
- [brainboxai_cyber_train](https://huggingface.co/datasets/BrainboxAI/brainboxai_cyber_train) - 1.16M cybersecurity examples
- [brainboxai_cyber_delta](https://huggingface.co/datasets/BrainboxAI/brainboxai_cyber_delta) - 107K correction delta
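
As a rough sketch of how these datasets might be inspected, the snippet below streams a few records from each split using the Hugging Face `datasets` library. The repo IDs are copied from the list above; the helper names `dataset_url` and `peek` are hypothetical, and the split names and record schema are not documented here, so the code makes no assumption about them. If a dataset defines several configurations, `load_dataset` will also need a configuration name.

```python
# Dataset repo IDs copied from the "Open datasets" list.
DATASET_REPOS = [
    "BrainboxAI/legal-training-il",
    "BrainboxAI/code-training-il",
    "BrainboxAI/brainboxai_cyber_train",
    "BrainboxAI/brainboxai_cyber_delta",
]


def dataset_url(repo_id: str) -> str:
    """Return the Hub page for a dataset repo (hypothetical helper)."""
    return f"https://huggingface.co/datasets/{repo_id}"


def peek(repo_id: str, n: int = 3) -> dict:
    """Stream the first n records of every split without a full download.

    Hypothetical helper. Requires `pip install datasets` and network access.
    """
    from datasets import load_dataset  # imported lazily so the module loads without it

    # With streaming=True and no split argument, this returns a mapping
    # from split name to an iterable dataset.
    splits = load_dataset(repo_id, streaming=True)
    return {name: list(split.take(n)) for name, split in splits.items()}


if __name__ == "__main__":
    for repo in DATASET_REPOS:
        print(dataset_url(repo))
```

Calling `peek("BrainboxAI/legal-training-il")` should return a dictionary keyed by split name. Streaming keeps the check cheap for the larger cybersecurity set.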
## Curated collections

- [Hebrew Legal AI](https://huggingface.co/collections/BrainboxAI/hebrew-legal-ai-69e88c7b0e3761734a5bdf71)
- [Private Coding Assistants](https://huggingface.co/collections/BrainboxAI/private-coding-assistants-69e88c7c0ab05dab658d6c13)
- [Bilingual Cybersecurity AI](https://huggingface.co/collections/BrainboxAI/bilingual-cybersecurity-ai-69e88f5cd9ece0a46949b485)
- [Complete Releases](https://huggingface.co/collections/BrainboxAI/brainboxai-complete-releases-69e88c613f991bd08bda5c92)
## Why small?

Small models run on a single GPU, sometimes on a laptop. Because they run on your own hardware, your data is never sent to third parties. They're cheaper to serve, faster to iterate on, and easier to audit.

They're not competing with GPT-5 or Claude. They're doing something different: one job, perfectly, in private.
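
To make the single-GPU claim concrete, here is a back-of-the-envelope estimate of weight memory at common precisions. It is a sketch under stated assumptions: it uses the nominal 2B and 4B sizes from the releases table, counts weights only (no KV cache, activations, or runtime overhead), and the real parameter count of a given checkpoint may differ from its nominal size.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Nominal sizes from the releases table (assumed, not measured).
    for label, n_params in [("2B", 2e9), ("4B", 4e9)]:
        for dtype in BYTES_PER_PARAM:
            gib = weight_memory_gib(n_params, dtype)
            print(f"{label} @ {dtype}: {gib:.1f} GiB")
```

Under these assumptions a 4B model needs roughly 7.5 GiB for weights in 16-bit precision and under 2 GiB at 4-bit, which is why consumer GPUs and some laptops are plausible targets.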

---

Founded 2025 - Rehovot, Israel

[brainboxai.io](https://brainboxai.io) - netanele@brainboxai.io