---
license: mit
language:
- en
- ru
tags:
- text-generation
- agent
- long-context
- code
- security
- made-by-bleyzos
pipeline_tag: text-generation
---

# Bleyzos Coder

**Bleyzos Coder** is an open-source Mixture-of-Experts (MoE) language model with **1.02T total parameters** and **42B active parameters**. It is built on a fork of MiMo-V2.5-Pro and fine-tuned for coding, cybersecurity, and agentic workflows. It supports a context length of up to **1M tokens**.

## Model Details

- **Developer**: Bleyzos AI (https://bleyzos.com)
- **Architecture**: Mixture-of-Experts (MoE) with Hybrid Attention (SWA + GA)
- **Total Parameters**: 1.02T
- **Active Parameters**: 42B
- **Context Length**: Up to 1M tokens
- **License**: MIT

## Key Features

- **Hybrid Attention**: Sliding Window Attention (SWA) and Global Attention (GA) layers in a 6:1 ratio, which reduces the KV-cache by ~7x
- **Multi-Token Prediction**: 3 MTP layers for 3x faster inference
- **Long Context**: Up to 1M tokens, enough to feed in entire codebases
- **Agentic**: Post-trained with SFT + RL + Multi-Teacher Distillation for complex multi-step tasks
- **Security-First**: Built-in filters against prompt injection and data leaks

## Usage

### Hugging Face Inference API

```python
from huggingface_hub import InferenceClient

# Point the client at the hosted model repository.
client = InferenceClient(model="Mini-Bleyz/Bleyzos-Coder")

# Send a single-turn chat request.
response = client.chat_completion(
    messages=[{"role": "user", "content": "Write a Python function to reverse a linked list"}],
    max_tokens=512,
)

# The generated text is in the first choice's message.
print(response.choices[0].message.content)
```

### SGLang Deployment (for GPU servers)

```bash
python3 -m sglang.launch_server \
  --model-path Mini-Bleyz/Bleyzos-Coder \
  --trust-remote-code \
  --tp 8 \
  --ep 8 \
  --context-length 1048576 \
  --host 0.0.0.0 \
  --port 9001
```

## Benchmarks

| Benchmark | Bleyzos Coder | MiMo-V2.5-Pro |
|-----------|---------------|---------------|
| BBH (3-shot) | 89.1 | 88.4 |
| GSM8K (8-shot) | 99.8 | 99.6 |
| HumanEval+ | 78.3 | 75.6 |
| SWE-Bench (AgentLess) | 58.7 | 35.7 |
| ClawEval pass³ | 65.2 | 63.8 |

## Limitations

- Requires significant GPU memory (8×A100/H100 recommended for the full model)
- A GGUF quantized version is available at [DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF](https://huggingface.co/DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF) for CPU-only usage; note that the linked repository is named for the MiMo-V2.5-Pro base model
- The system prompt is customized for the Bleyzos AI identity

## Citation

```bibtex
@misc{bleyzos2026coder,
  title={Bleyzos Coder},
  author={{Bleyzos AI Team}},
  year={2026},
  howpublished={\url{https://huggingface.co/Mini-Bleyz/Bleyzos-Coder}},
}
```

## Contact

- **Email**: support@bleyzos.ru
- **Website**: https://ai.bleyzos.com
- **Telegram**: https://t.me/bleyzos
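
## Appendix: KV-Cache Estimate

The ~7x KV-cache figure listed under Key Features can be checked with simple arithmetic. The sketch below is a rough illustration, not the model's actual memory accounting. It assumes that every layer stores the same number of bytes per cached token, that sliding-window layers cache at most `window` tokens, and that layers repeat in groups of 6 SWA layers plus 1 GA layer. The function name `kv_cache_reduction` and the window size of 128 are hypothetical; this card does not state the real window size.

```python
def kv_cache_reduction(context_len: int, window: int, swa_per_ga: int = 6) -> float:
    """Estimate the KV-cache reduction of hybrid attention vs. all-global attention.

    Works on one repeating group of `swa_per_ga` sliding-window layers plus
    one global layer, counting cached tokens per layer (assumed equal cost).
    """
    layers_per_group = swa_per_ga + 1
    # Baseline: every layer in the group caches the whole context.
    full_tokens = layers_per_group * context_len
    # Hybrid: SWA layers cache at most `window` tokens; the GA layer caches everything.
    hybrid_tokens = swa_per_ga * min(window, context_len) + context_len
    return full_tokens / hybrid_tokens


# Hypothetical window of 128 tokens at the full 1M-token context.
ratio = kv_cache_reduction(context_len=1_048_576, window=128)
print(f"Estimated KV-cache reduction: {ratio:.2f}x")  # about 6.99x under these assumptions
```

Under these assumptions the reduction approaches 7x as the context grows, because the window term becomes negligible next to the single global layer's cache. For contexts shorter than the window there is no saving at all.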