How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf RobiLabs/Bleenk:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf RobiLabs/Bleenk:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf RobiLabs/Bleenk:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf RobiLabs/Bleenk:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf RobiLabs/Bleenk:Q4_K_M
Use Docker
docker model run hf.co/RobiLabs/Bleenk:Q4_K_M
Quick Links

Model Card for Bleenk

Model Summary

Bleenk 123B is an agentic large language model developed by Robi Labs for advanced software engineering tasks. The model is optimized for tool-driven workflows, large-scale codebase exploration, coordinated multi-file editing, and powering autonomous and semi-autonomous software engineering agents.

Bleenk is designed for long-horizon reasoning and real-world engineering environments rather than single-turn code generation.

Model Details

Model Description

  • Developed by: Robi Labs
  • Created for: Bleenk
  • Funded by: Robi Labs
  • Shared by: Robi Labs
  • Model type: Agentic Large Language Model (LLM)
  • Language(s) (NLP): Primarily English; supports multilingual code and technical text
  • License: To be released by Robi Labs
  • Finetuned from model: Proprietary pretraining and fine-tuning pipeline

Model Sources

Uses

Direct Use

  • Software engineering agents
  • AI-powered code assistants
  • Codebase navigation and analysis
  • Multi-file refactoring and maintenance
  • Tool-augmented development workflows

Downstream Use

  • Fine-tuning for organization-specific codebases
  • Integration into internal developer platforms
  • Agent frameworks for autonomous engineering

Out-of-Scope Use

  • General-purpose chat or conversational agents
  • High-risk decision-making without human oversight
  • Tasks requiring domain-specific legal, medical, or financial guarantees

Bias, Risks, and Limitations

  • The model may produce incorrect or incomplete code without verification
  • Tool misuse may result in unintended system changes
  • Performance depends on tool availability and prompt quality
  • Trained primarily on publicly available and licensed data, which may encode historical biases

Recommendations

Users should employ strong sandboxing, testing, and human-in-the-loop review when deploying Bleenk in production environments.

How to Get Started with the Model

ollama pull RobiLabs/bleenk:latest
ollama run RobiLabs/bleenk:latest

Training Details

Training Data

The model was trained on a mixture of:

  • Publicly available code repositories
  • Licensed datasets
  • Synthetic data generated for software engineering tasks

Training Procedure

Preprocessing

Data was filtered for quality, deduplicated, and normalized for code and technical text.

Training Hyperparameters

  • Training regime: Mixed-precision training (bf16)

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • SWE-bench Verified
  • SWE-bench Multilingual
  • Terminal Bench

Metrics

  • Task success rate
  • Patch correctness
  • Tool execution accuracy

Results

Model Size (B Tokens) SWE Bench Verified SWE Bench Multilingual Terminal Bench
Bleenk 123 73.2% 71.3% 45.5%
Devstral 2 123 72.2% 61.3% 40.5%
Devstral Small 2 24 65.8% 51.6% 32.0%
DeepSeek v3.2 671 73.1% 70.2% 46.4%
Kimi K2 Thinking 1000 71.3% 61.1% 35.7%
MiniMax M2 230 69.4% 56.5% 30.0%
GLM 4.6 455 68.0% – 40.5%
Qwen 3 Coder Plus 480 69.6% 54.7% 37.5%
Gemini 3 Pro – 76.2% – 54.2%
Claude Sonnet 4.5 – 77.2% 68.0% 42.8%
GPT 5.1 Codex Max – 77.9% – 58.1%
GPT 5.1 Codex High – 73.7% – 52.8%

Environmental Impact

Environmental impact details will be released as measurements are finalized.

Technical Specifications

Model Architecture and Objective

Transformer-based large language model optimized for agentic reasoning and tool usage.

Compute Infrastructure

Hardware

Large-scale GPU/accelerator clusters

Software

Custom training and inference stack developed by Robi Labs

Model Card Authors

Robi Labs Research Team

Model Card Contact

hello@robiai.com

Downloads last month
3
GGUF
Model size
125B params
Architecture
mistral3
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support