---
license: mit
language:
- en
- ru
tags:
- text-generation
- agent
- long-context
- code
- security
- made-by-bleyzos
pipeline_tag: text-generation
---
<br/><br/>
<div align="center">
<picture>
<source srcset="https://cdn.bleyzos.ru/brand.png" media="(prefers-color-scheme: dark)">
<img src="https://cdn.bleyzos.ru/brand.png" width="60%" alt="Bleyzos Coder" />
</picture>
</div>
<br/>
<br/>
<div align="center" style="line-height: 1.2;">
<strong>Community</strong><br/>
<a href="https://t.me/bleyzos" target="_blank">Telegram</a>
</div>
<br/>

# Bleyzos Coder
**Bleyzos Coder** is an open-source Mixture-of-Experts (MoE) language model with **1.02T total parameters** and **42B active parameters**. It is built on a fork of MiMo-V2.5-Pro and fine-tuned for coding, cybersecurity, and agentic workflows. It supports a context length of up to **1M tokens**.
## Model Details
- **Developer**: Bleyzos AI (https://bleyzos.com)
- **Architecture**: Mixture-of-Experts (MoE) with Hybrid Attention (SWA + GA)
- **Total Parameters**: 1.02T
- **Active Parameters**: 42B
- **Context Length**: Up to 1M tokens
- **License**: MIT
## Key Features
- **Hybrid Attention**: Sliding Window Attention (SWA) and Global Attention (GA) layers in a 6:1 ratio, which reduces KV-cache size by ~7x
- **Multi-Token Prediction**: 3 MTP layers for 3x faster inference
- **Long Context**: Up to 1M tokens, enough to feed in entire codebases
- **Agentic**: Post-trained with SFT + RL + Multi-Teacher Distillation for complex multi-step tasks
- **Security-First**: Built-in filters against prompt injection and data leaks
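The ~7x KV-cache figure for hybrid attention can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes that each SWA layer caches at most `window` tokens while each GA layer caches the full context, and the window size of 128 is a hypothetical value, not one stated in this card.

```python
def kv_cache_reduction(context_len: int, window: int, swa_per_global: int = 6) -> float:
    """Ratio of cached tokens with full attention vs. hybrid attention,
    computed over one group of `swa_per_global` SWA layers plus one GA layer."""
    layers = swa_per_global + 1
    # Full attention: every layer caches the whole context.
    full = layers * context_len
    # Hybrid: SWA layers cache at most `window` tokens, the GA layer caches everything.
    hybrid = swa_per_global * min(window, context_len) + context_len
    return full / hybrid


# window=128 is an assumed value chosen for illustration.
for n in (4_096, 131_072, 1_048_576):
    print(f"context={n:>9}: reduction ~{kv_cache_reduction(n, window=128):.2f}x")
```

Under these assumptions the ratio stays below 7 and approaches it as the context grows, since the single GA layer dominates the cache at long context lengths.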
## Usage
### Hugging Face Inference API
```python
from huggingface_hub import InferenceClient

# Client bound to the hosted model repository
client = InferenceClient(model="Mini-Bleyz/Bleyzos-Coder")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Write a Python function to reverse a linked list"}],
    max_tokens=512,
)
# The generated reply is the message content of the first choice
print(response["choices"][0]["message"]["content"])
```
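Since the model is described as able to take entire codebases as context, one possible way to prepare such a prompt is to concatenate source files into a single message. This is a minimal sketch, not an official workflow: the helper name `pack_codebase`, the `### File:` header format, and the character budget `max_chars` are all hypothetical, and a character count is only a rough stand-in for the model's token limit.

```python
from pathlib import Path


def pack_codebase(root: str, suffixes=(".py",), max_chars: int = 2_000_000) -> str:
    """Concatenate source files under `root` into one string, stopping at `max_chars`.

    `max_chars` is an assumed, illustrative budget; it is not a token count.
    """
    parts, used = [], 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        # Prefix each file with its relative path so the model can refer to it.
        block = f"### File: {path.relative_to(root)}\n{text}\n"
        if used + len(block) > max_chars:
            break  # budget exhausted; remaining files are skipped
        parts.append(block)
        used += len(block)
    return "".join(parts)


# Example usage (the project path is a placeholder):
# prompt = pack_codebase("./my_project") + "\nSummarize the architecture of this project."
# messages = [{"role": "user", "content": prompt}]
```

The resulting `messages` list can be passed to `client.chat_completion` in the same way as the short prompt above.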
### SGLang Deployment (for GPU servers)
```bash
python3 -m sglang.launch_server \
  --model-path Mini-Bleyz/Bleyzos-Coder \
  --trust-remote-code \
  --tp 8 \
  --ep 8 \
  --context-length 1048576 \
  --host 0.0.0.0 \
  --port 9001
```
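Once the server is running, it can be queried over SGLang's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library and assumes the server is reachable at `http://localhost:9001` (matching the `--port` above) and that the served model name equals the model path; the helper names `build_chat_request` and `send` are illustrative.

```python
import json
import urllib.request


def build_chat_request(prompt: str,
                       base_url: str = "http://localhost:9001",
                       model: str = "Mini-Bleyz/Bleyzos-Coder",
                       max_tokens: int = 512) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,  # assumed to match the served model name
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 600.0) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example usage (requires a running server):
# print(send(build_chat_request("Write a Python function to reverse a linked list")))
```

Separating request construction from sending makes it easy to inspect the payload before any network call is made.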
## Benchmarks
| Benchmark | Bleyzos Coder | MiMo-V2.5-Pro |
|-----------|---------------|---------------|
| BBH (3-shot) | 89.1 | 88.4 |
| GSM8K (8-shot) | 99.8 | 99.6 |
| HumanEval+ | 78.3 | 75.6 |
| SWE-Bench (AgentLess) | 58.7 | 35.7 |
| ClawEval pass³ | 65.2 | 63.8 |
## Limitations
- Requires significant GPU memory (8×A100/H100 recommended for full model)
- A GGUF quantized version of the base model (MiMo-V2.5-Pro) is available at [DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF](https://huggingface.co/DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF) for CPU-only usage
- System prompt customized for Bleyzos AI identity
## Citation
```bibtex
@misc{bleyzos2026coder,
title={Bleyzos Coder},
author={{Bleyzos AI Team}},
year={2026},
howpublished={\url{https://huggingface.co/Mini-Bleyz/Bleyzos-Coder}},
}
```
## Contact
- **Email**: support@bleyzos.ru
- **Website**: https://ai.bleyzos.com
- **Telegram**: https://t.me/bleyzos