Bleyzos Coder



Bleyzos Coder is an open-source Mixture-of-Experts (MoE) language model with 1.02T total parameters and 42B active parameters. It is built on a fork of MiMo-V2.5-Pro and fine-tuned for coding, cybersecurity, and agentic workflows. It supports a context length of up to 1M tokens.

Model Details

  • Developer: Bleyzos AI (https://bleyzos.com)
  • Architecture: Mixture-of-Experts (MoE) with Hybrid Attention (SWA + GA)
  • Total Parameters: 1.02T
  • Active Parameters: 42B
  • Context Length: Up to 1M tokens
  • License: MIT

Key Features

  • Hybrid Attention: Sliding Window Attention + Global Attention (6:1 ratio), reduces KV-cache by ~7x
  • Multi-Token Prediction: 3 MTP layers for 3x faster inference
  • Long Context: Up to 1M tokens, enough to take in entire codebases
  • Agentic: Post-trained with SFT + RL + Multi-Teacher Distillation for complex multi-step tasks
  • Security-First: Built-in filters against prompt injection and data leaks
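
The ~7x KV-cache figure follows from the 6:1 layer ratio: at long context a sliding-window layer caches only its window, while a global layer caches every token. The Python sketch below works through that arithmetic. The 128-token window and the single 7-layer group are illustrative assumptions, not published Bleyzos Coder values, and the per-token cache cost is assumed equal across layers.

```python
def kv_cache_slots(context_len: int, swa_layers: int, ga_layers: int, window: int) -> int:
    """Count cached token slots summed over layers (equal per-slot cost assumed)."""
    swa = swa_layers * min(context_len, window)  # SWA layers keep at most `window` tokens
    ga = ga_layers * context_len                 # global layers keep the whole context
    return swa + ga


def reduction_vs_full_attention(context_len: int, swa_layers: int = 6,
                                ga_layers: int = 1, window: int = 128) -> float:
    """Ratio of an all-global-attention cache to the hybrid cache."""
    full = (swa_layers + ga_layers) * context_len
    hybrid = kv_cache_slots(context_len, swa_layers, ga_layers, window)
    return full / hybrid


# The ratio climbs toward (6 + 1) / 1 = 7 as the context grows past the window.
for n in (4096, 131072, 1048576):
    print(n, round(reduction_vs_full_attention(n), 2))
```

Under these assumptions the saving is small for short prompts and only approaches 7x once the context is much longer than the window, which is why the figure is quoted as approximate.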

Usage

Hugging Face Inference API

from huggingface_hub import InferenceClient

client = InferenceClient(model="Mini-Bleyz/Bleyzos-Coder")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Write a Python function to reverse a linked list"}],
    max_tokens=512
)

print(response["choices"][0]["message"]["content"])

SGLang Deployment (for GPU servers)

python3 -m sglang.launch_server \
    --model-path Mini-Bleyz/Bleyzos-Coder \
    --trust-remote-code \
    --tp 8 \
    --ep 8 \
    --context-length 1048576 \
    --host 0.0.0.0 \
    --port 9001
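
Once the server is running, it can be queried through SGLang's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library. It assumes the server from the command above is reachable at localhost:9001 and that the model name in the request matches the --model-path value; the helper names build_chat_request and chat are hypothetical.

```python
import json
import urllib.request

# Assumption: the server was launched with the command above on this machine.
BASE_URL = "http://localhost:9001/v1"


def build_chat_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": "Mini-Bleyz/Bleyzos-Coder",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str, base_url: str = BASE_URL) -> str:
    """POST the payload to /chat/completions and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


# Example call (requires a running server):
# print(chat("Write a Python function to reverse a linked list"))
```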

Benchmarks

Benchmark               Bleyzos Coder   MiMo-V2.5-Pro
BBH (3-shot)            89.1            88.4
GSM8K (8-shot)          99.8            99.6
HumanEval+              78.3            75.6
SWE-Bench (AgentLess)   58.7            35.7
ClawEval pass³          65.2            63.8
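
To make the gaps easier to compare, the short Python sketch below computes the absolute difference per benchmark. It adds no new data; the dictionary only restates the scores listed above.

```python
# (Bleyzos Coder, MiMo-V2.5-Pro) scores copied from the benchmark table
scores = {
    "BBH (3-shot)": (89.1, 88.4),
    "GSM8K (8-shot)": (99.8, 99.6),
    "HumanEval+": (78.3, 75.6),
    "SWE-Bench (AgentLess)": (58.7, 35.7),
    "ClawEval pass^3": (65.2, 63.8),
}


def deltas(table: dict) -> dict:
    """Return the fine-tuned score minus the base score, rounded to one decimal."""
    return {name: round(ours - base, 1) for name, (ours, base) in table.items()}


for name, diff in deltas(scores).items():
    print(f"{name}: {diff:+.1f}")
```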

Limitations

  • Requires significant GPU memory (8×A100/H100 recommended for the full model)
  • A GGUF-quantized version is available at DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF for CPU-only usage
  • The system prompt is customized for the Bleyzos AI identity
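
As a rough way to size hardware, weight memory can be estimated from the parameter count and the bytes stored per parameter. The Python sketch below is weights-only arithmetic: it ignores KV cache, activations, and framework overhead, and the dtype table is an illustrative simplification rather than a description of how the checkpoint actually mixes precisions.

```python
# Bytes per parameter for common storage types (illustrative assumption)
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}

TOTAL_PARAMS = 1.02e12  # total parameter count stated in Model Details


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Weights-only footprint in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("bf16", "fp8"):
    print(dtype, weight_memory_gb(TOTAL_PARAMS, dtype), "GB")
```

Comparing these figures with the combined memory of the target GPUs gives a first-order check before attempting a full deployment.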

Citation

@misc{bleyzos2026coder,
  title={Bleyzos Coder},
  author={{Bleyzos AI Team}},
  year={2026},
  howpublished={\url{https://huggingface.co/Mini-Bleyz/Bleyzos-Coder}},
}

Contact
