Bleenk / README.md
unfavalen's picture
Update README.md
6f1ce9d verified
---
language:
- en
tags:
- code
---
# Model Card for Bleenk
## Model Summary
**Bleenk 123B** is an agentic large language model developed by **[Robi Labs](https://www.robiai.com/)** for advanced software engineering tasks. The model is optimized for tool-driven workflows, large-scale codebase exploration, coordinated multi-file editing, and powering autonomous and semi-autonomous software engineering agents.
Bleenk is designed for long-horizon reasoning and real-world engineering environments rather than single-turn code generation.
## Model Details
### Model Description
* **Developed by:** [Robi Labs](https://www.robiai.com/)
* **Created for:** [Bleenk](https://www.bleenk.app/)
* **Funded by:** [Robi Labs](https://www.robiai.com/)
* **Shared by:** [Robi Labs](https://www.robiai.com/)
* **Model type:** Agentic Large Language Model (LLM)
* **Language(s) (NLP):** Primarily English; supports multilingual code and technical text
* **License:** To be released by Robi Labs
* **Finetuned from model:** Proprietary pretraining and fine-tuning pipeline
### Model Sources
* **Demo:** [https://bleenk.app](https://bleenk.app)
## Uses
### Direct Use
* Software engineering agents
* AI-powered code assistants
* Codebase navigation and analysis
* Multi-file refactoring and maintenance
* Tool-augmented development workflows
### Downstream Use
* Fine-tuning for organization-specific codebases
* Integration into internal developer platforms
* Agent frameworks for autonomous engineering
### Out-of-Scope Use
* General-purpose chat or conversational agents
* High-risk decision-making without human oversight
* Tasks requiring domain-specific legal, medical, or financial guarantees
## Bias, Risks, and Limitations
* The model may produce incorrect or incomplete code without verification
* Tool misuse may result in unintended system changes
* Performance depends on tool availability and prompt quality
* Trained primarily on publicly available and licensed data, which may encode historical biases
### Recommendations
Users should employ strong sandboxing, testing, and human-in-the-loop review when deploying Bleenk in production environments.
## How to Get Started with the Model
```bash
ollama pull RobiLabs/bleenk:latest
ollama run RobiLabs/bleenk:latest
```
## Training Details
### Training Data
The model was trained on a mixture of:
* Publicly available code repositories
* Licensed datasets
* Synthetic data generated for software engineering tasks
### Training Procedure
#### Preprocessing
Data was filtered for quality, deduplicated, and normalized for code and technical text.
#### Training Hyperparameters
* **Training regime:** Mixed-precision training (bf16)
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
* SWE-bench Verified
* SWE-bench Multilingual
* Terminal Bench
#### Metrics
* Task success rate
* Patch correctness
* Tool execution accuracy
### Results
| Model | Size (B Tokens) | SWE Bench Verified | SWE Bench Multilingual | Terminal Bench |
| ------------------ | --------------- | ------------------ | ---------------------- | -------------- |
| **Bleenk** | **123** | **73.2%** | **71.3%** | **45.5%** |
| Devstral 2 | 123 | 72.2% | 61.3% | 40.5% |
| Devstral Small 2 | 24 | 65.8% | 51.6% | 32.0% |
| DeepSeek v3.2 | 671 | 73.1% | 70.2% | 46.4% |
| Kimi K2 Thinking | 1000 | 71.3% | 61.1% | 35.7% |
| MiniMax M2 | 230 | 69.4% | 56.5% | 30.0% |
| GLM 4.6 | 455 | 68.0% | – | 40.5% |
| Qwen 3 Coder Plus | 480 | 69.6% | 54.7% | 37.5% |
| Gemini 3 Pro | – | 76.2% | – | 54.2% |
| Claude Sonnet 4.5 | – | 77.2% | 68.0% | 42.8% |
| GPT 5.1 Codex Max | – | 77.9% | – | 58.1% |
| GPT 5.1 Codex High | – | 73.7% | – | 52.8% |
## Environmental Impact
Environmental impact details will be released as measurements are finalized.
## Technical Specifications
### Model Architecture and Objective
Transformer-based large language model optimized for agentic reasoning and tool usage.
### Compute Infrastructure
#### Hardware
Large-scale GPU/accelerator clusters
#### Software
Custom training and inference stack developed by Robi Labs
## Model Card Authors
Robi Labs Research Team
## Model Card Contact
[hello@robiai.com](mailto:hello@robiai.com)