Instructions to use RobiLabs/Bleenk with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use RobiLabs/Bleenk with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="RobiLabs/Bleenk", filename="bleenk-123b-Q4_K_M.gguf", )
output = llm( "Once upon a time,", max_tokens=512, echo=True ) print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use RobiLabs/Bleenk with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf RobiLabs/Bleenk:Q4_K_M # Run inference directly in the terminal: llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf RobiLabs/Bleenk:Q4_K_M # Run inference directly in the terminal: llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf RobiLabs/Bleenk:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf RobiLabs/Bleenk:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Use Docker
docker model run hf.co/RobiLabs/Bleenk:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use RobiLabs/Bleenk with Ollama:
ollama run hf.co/RobiLabs/Bleenk:Q4_K_M
- Unsloth Studio new
How to use RobiLabs/Bleenk with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for RobiLabs/Bleenk to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for RobiLabs/Bleenk to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for RobiLabs/Bleenk to start chatting
- Docker Model Runner
How to use RobiLabs/Bleenk with Docker Model Runner:
docker model run hf.co/RobiLabs/Bleenk:Q4_K_M
- Lemonade
How to use RobiLabs/Bleenk with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull RobiLabs/Bleenk:Q4_K_M
Run and chat with the model
lemonade run user.Bleenk-Q4_K_M
List all available models
lemonade list
output = llm(
"Once upon a time,",
max_tokens=512,
echo=True
)
print(output)Model Card for Bleenk
Model Summary
Bleenk 123B is an agentic large language model developed by Robi Labs for advanced software engineering tasks. The model is optimized for tool-driven workflows, large-scale codebase exploration, coordinated multi-file editing, and powering autonomous and semi-autonomous software engineering agents.
Bleenk is designed for long-horizon reasoning and real-world engineering environments rather than single-turn code generation.
Model Details
Model Description
- Developed by: Robi Labs
- Created for: Bleenk
- Funded by: Robi Labs
- Shared by: Robi Labs
- Model type: Agentic Large Language Model (LLM)
- Language(s) (NLP): Primarily English; supports multilingual code and technical text
- License: To be released by Robi Labs
- Finetuned from model: Proprietary pretraining and fine-tuning pipeline
Model Sources
- Demo: https://bleenk.app
Uses
Direct Use
- Software engineering agents
- AI-powered code assistants
- Codebase navigation and analysis
- Multi-file refactoring and maintenance
- Tool-augmented development workflows
Downstream Use
- Fine-tuning for organization-specific codebases
- Integration into internal developer platforms
- Agent frameworks for autonomous engineering
Out-of-Scope Use
- General-purpose chat or conversational agents
- High-risk decision-making without human oversight
- Tasks requiring domain-specific legal, medical, or financial guarantees
Bias, Risks, and Limitations
- The model may produce incorrect or incomplete code without verification
- Tool misuse may result in unintended system changes
- Performance depends on tool availability and prompt quality
- Trained primarily on publicly available and licensed data, which may encode historical biases
Recommendations
Users should employ strong sandboxing, testing, and human-in-the-loop review when deploying Bleenk in production environments.
How to Get Started with the Model
ollama pull RobiLabs/bleenk:latest
ollama run RobiLabs/bleenk:latest
Training Details
Training Data
The model was trained on a mixture of:
- Publicly available code repositories
- Licensed datasets
- Synthetic data generated for software engineering tasks
Training Procedure
Preprocessing
Data was filtered for quality, deduplicated, and normalized for code and technical text.
Training Hyperparameters
- Training regime: Mixed-precision training (bf16)
Evaluation
Testing Data, Factors & Metrics
Testing Data
- SWE-bench Verified
- SWE-bench Multilingual
- Terminal Bench
Metrics
- Task success rate
- Patch correctness
- Tool execution accuracy
Results
| Model | Size (B Tokens) | SWE Bench Verified | SWE Bench Multilingual | Terminal Bench |
|---|---|---|---|---|
| Bleenk | 123 | 73.2% | 71.3% | 45.5% |
| Devstral 2 | 123 | 72.2% | 61.3% | 40.5% |
| Devstral Small 2 | 24 | 65.8% | 51.6% | 32.0% |
| DeepSeek v3.2 | 671 | 73.1% | 70.2% | 46.4% |
| Kimi K2 Thinking | 1000 | 71.3% | 61.1% | 35.7% |
| MiniMax M2 | 230 | 69.4% | 56.5% | 30.0% |
| GLM 4.6 | 455 | 68.0% | β | 40.5% |
| Qwen 3 Coder Plus | 480 | 69.6% | 54.7% | 37.5% |
| Gemini 3 Pro | β | 76.2% | β | 54.2% |
| Claude Sonnet 4.5 | β | 77.2% | 68.0% | 42.8% |
| GPT 5.1 Codex Max | β | 77.9% | β | 58.1% |
| GPT 5.1 Codex High | β | 73.7% | β | 52.8% |
Environmental Impact
Environmental impact details will be released as measurements are finalized.
Technical Specifications
Model Architecture and Objective
Transformer-based large language model optimized for agentic reasoning and tool usage.
Compute Infrastructure
Hardware
Large-scale GPU/accelerator clusters
Software
Custom training and inference stack developed by Robi Labs
Model Card Authors
Robi Labs Research Team
Model Card Contact
- Downloads last month
- 3
4-bit
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="RobiLabs/Bleenk", filename="bleenk-123b-Q4_K_M.gguf", )