Instructions to use RobiLabs/Bleenk with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use RobiLabs/Bleenk with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="RobiLabs/Bleenk", filename="bleenk-123b-Q4_K_M.gguf", )
output = llm( "Once upon a time,", max_tokens=512, echo=True ) print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use RobiLabs/Bleenk with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf RobiLabs/Bleenk:Q4_K_M # Run inference directly in the terminal: llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf RobiLabs/Bleenk:Q4_K_M # Run inference directly in the terminal: llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf RobiLabs/Bleenk:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf RobiLabs/Bleenk:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Use Docker
docker model run hf.co/RobiLabs/Bleenk:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use RobiLabs/Bleenk with Ollama:
ollama run hf.co/RobiLabs/Bleenk:Q4_K_M
- Unsloth Studio new
How to use RobiLabs/Bleenk with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for RobiLabs/Bleenk to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for RobiLabs/Bleenk to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for RobiLabs/Bleenk to start chatting
- Docker Model Runner
How to use RobiLabs/Bleenk with Docker Model Runner:
docker model run hf.co/RobiLabs/Bleenk:Q4_K_M
- Lemonade
How to use RobiLabs/Bleenk with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull RobiLabs/Bleenk:Q4_K_M
Run and chat with the model
lemonade run user.Bleenk-Q4_K_M
List all available models
lemonade list
Model Card for Bleenk
Model Summary
Bleenk 123B is an agentic large language model developed by Robi Labs for advanced software engineering tasks. The model is optimized for tool-driven workflows, large-scale codebase exploration, coordinated multi-file editing, and powering autonomous and semi-autonomous software engineering agents.
Bleenk is designed for long-horizon reasoning and real-world engineering environments rather than single-turn code generation.
Model Details
Model Description
- Developed by: Robi Labs
- Created for: Bleenk
- Funded by: Robi Labs
- Shared by: Robi Labs
- Model type: Agentic Large Language Model (LLM)
- Language(s) (NLP): Primarily English; supports multilingual code and technical text
- License: To be released by Robi Labs
- Finetuned from model: Proprietary pretraining and fine-tuning pipeline
Model Sources
- Demo: https://bleenk.app
Uses
Direct Use
- Software engineering agents
- AI-powered code assistants
- Codebase navigation and analysis
- Multi-file refactoring and maintenance
- Tool-augmented development workflows
Downstream Use
- Fine-tuning for organization-specific codebases
- Integration into internal developer platforms
- Agent frameworks for autonomous engineering
Out-of-Scope Use
- General-purpose chat or conversational agents
- High-risk decision-making without human oversight
- Tasks requiring domain-specific legal, medical, or financial guarantees
Bias, Risks, and Limitations
- The model may produce incorrect or incomplete code without verification
- Tool misuse may result in unintended system changes
- Performance depends on tool availability and prompt quality
- Trained primarily on publicly available and licensed data, which may encode historical biases
Recommendations
Users should employ strong sandboxing, testing, and human-in-the-loop review when deploying Bleenk in production environments.
How to Get Started with the Model
ollama pull RobiLabs/bleenk:latest
ollama run RobiLabs/bleenk:latest
Training Details
Training Data
The model was trained on a mixture of:
- Publicly available code repositories
- Licensed datasets
- Synthetic data generated for software engineering tasks
Training Procedure
Preprocessing
Data was filtered for quality, deduplicated, and normalized for code and technical text.
Training Hyperparameters
- Training regime: Mixed-precision training (bf16)
Evaluation
Testing Data, Factors & Metrics
Testing Data
- SWE-bench Verified
- SWE-bench Multilingual
- Terminal Bench
Metrics
- Task success rate
- Patch correctness
- Tool execution accuracy
Results
| Model | Size (B Tokens) | SWE Bench Verified | SWE Bench Multilingual | Terminal Bench |
|---|---|---|---|---|
| Bleenk | 123 | 73.2% | 71.3% | 45.5% |
| Devstral 2 | 123 | 72.2% | 61.3% | 40.5% |
| Devstral Small 2 | 24 | 65.8% | 51.6% | 32.0% |
| DeepSeek v3.2 | 671 | 73.1% | 70.2% | 46.4% |
| Kimi K2 Thinking | 1000 | 71.3% | 61.1% | 35.7% |
| MiniMax M2 | 230 | 69.4% | 56.5% | 30.0% |
| GLM 4.6 | 455 | 68.0% | β | 40.5% |
| Qwen 3 Coder Plus | 480 | 69.6% | 54.7% | 37.5% |
| Gemini 3 Pro | β | 76.2% | β | 54.2% |
| Claude Sonnet 4.5 | β | 77.2% | 68.0% | 42.8% |
| GPT 5.1 Codex Max | β | 77.9% | β | 58.1% |
| GPT 5.1 Codex High | β | 73.7% | β | 52.8% |
Environmental Impact
Environmental impact details will be released as measurements are finalized.
Technical Specifications
Model Architecture and Objective
Transformer-based large language model optimized for agentic reasoning and tool usage.
Compute Infrastructure
Hardware
Large-scale GPU/accelerator clusters
Software
Custom training and inference stack developed by Robi Labs
Model Card Authors
Robi Labs Research Team
Model Card Contact
- Downloads last month
- 3
4-bit