---
license: mit
language:
- en
- ru
tags:
- text-generation
- agent
- long-context
- code
- security
- made-by-bleyzos
pipeline_tag: text-generation
---

<br/><br/>

<div align="center">
  <picture>
    <source srcset="https://cdn.bleyzos.ru/brand.png" media="(prefers-color-scheme: dark)">
    <img src="https://cdn.bleyzos.ru/brand.png" width="60%" alt="Bleyzos Coder" />
  </picture>
</div>

<br/>

<br/>

<div align="center" style="line-height: 1.2;">
  <strong>Community</strong><br/>
  <a href="https://t.me/bleyzos" target="_blank">Telegram</a>
</div>

<br/>

# Bleyzos Coder

**Bleyzos Coder** is an open-source Mixture-of-Experts (MoE) language model with **1.02T total parameters** and **42B active parameters**. It is built on a fork of MiMo-V2.5-Pro and fine-tuned for coding, cybersecurity, and agentic workflows. It supports a context length of up to **1M tokens**.

## Model Details

- **Developer**: Bleyzos AI (https://bleyzos.com)
- **Architecture**: Mixture-of-Experts (MoE) with Hybrid Attention (SWA + GA)
- **Total Parameters**: 1.02T
- **Active Parameters**: 42B
- **Context Length**: Up to 1M tokens
- **License**: MIT

## Key Features

- **Hybrid Attention**: Sliding Window Attention (SWA) and Global Attention (GA) layers interleaved at a 6:1 ratio, reducing KV-cache size by roughly 7x for long sequences
- **Multi-Token Prediction**: 3 MTP layers for 3x faster inference
- **Long Context**: Up to 1M tokens, enough to feed in entire codebases
- **Agentic**: Post-trained with supervised fine-tuning (SFT), reinforcement learning (RL), and Multi-Teacher Distillation for complex multi-step tasks
- **Security-First**: Built-in filters against prompt injection and data leaks
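
The ~7x KV-cache figure can be related to the 6:1 layer ratio: sliding-window layers cache only a fixed window of tokens, while global layers cache the whole sequence. The following is a rough sketch of that arithmetic, not the model's actual accounting. It assumes a hypothetical window of 4,096 tokens and the same per-token KV size in every layer; neither value is stated in this README.

```python
def kv_cache_ratio(seq_len: int, window: int = 4096,
                   swa_layers: int = 6, ga_layers: int = 1) -> float:
    """Return full-attention KV size divided by hybrid KV size.

    Assumes every layer stores the same number of bytes per cached token.
    `window` is a hypothetical value, not taken from the model config.
    """
    total_layers = swa_layers + ga_layers
    # Baseline: every layer caches every token in the sequence.
    full = total_layers * seq_len
    # Hybrid: SWA layers cache at most `window` tokens, GA layers cache all.
    hybrid = swa_layers * min(window, seq_len) + ga_layers * seq_len
    return full / hybrid


for n in (4_096, 131_072, 1_048_576):
    print(f"{n:>9} tokens: {kv_cache_ratio(n):.2f}x smaller")
```

Under these assumptions the saving is negligible for prompts that fit inside the window and approaches the 7x limit only as the sequence grows far beyond it.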

## Usage

### Hugging Face Inference API

```python
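from huggingface_hub import InferenceClient

# Create a client bound to this model repository.
client = InferenceClient(model="Mini-Bleyz/Bleyzos-Coder")

# Send a single-turn chat request; max_tokens caps the length of the reply.
response = client.chat_completion(
    messages=[{"role": "user", "content": "Write a Python function to reverse a linked list"}],
    max_tokens=512,
)

# The reply text is on the first choice's message.
print(response.choices[0].message.content)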
```

### SGLang Deployment (for GPU servers)

```bash
python3 -m sglang.launch_server \
    --model-path Mini-Bleyz/Bleyzos-Coder \
    --trust-remote-code \
    --tp 8 \
    --ep 8 \
    --context-length 1048576 \
    --host 0.0.0.0 \
    --port 9001
```
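
Once the server is up, it can be queried through SGLang's OpenAI-compatible `/v1/chat/completions` endpoint. The sketch below uses only the Python standard library. It assumes the server runs on the same machine on port 9001 and that the served model name matches `--model-path`; `build_chat_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the server was launched as shown above, on this machine.
BASE_URL = "http://localhost:9001"


def build_chat_request(prompt: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        # Assumption: the served model name matches --model-path.
        "model": "Mini-Bleyz/Bleyzos-Coder",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `ask("Write a Python function to reverse a linked list")` needs the server to be running. Building the request separately keeps the payload easy to inspect without any network access.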

## Benchmarks

| Benchmark | Bleyzos Coder | MiMo-V2.5-Pro |
|-----------|---------------|---------------|
| BBH (3-shot) | 89.1 | 88.4 |
| GSM8K (8-shot) | 99.8 | 99.6 |
| HumanEval+ | 78.3 | 75.6 |
| SWE-Bench (AgentLess) | 58.7 | 35.7 |
| ClawEval pass³ | 65.2 | 63.8 |

## Limitations

- Requires substantial GPU memory (8×A100/H100 recommended for the full model)
- For CPU-only use, a GGUF quantization (published under the base model's name, MiMo-V2.5-Pro) is available at [DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF](https://huggingface.co/DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF)
- The system prompt is customized for the Bleyzos AI identity
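
The GPU-memory requirement can be put in rough numbers from the parameter count in Model Details. This is a back-of-the-envelope sketch covering weights only. The storage formats listed are common choices used for illustration, since this README does not state which precision the released checkpoint uses.

```python
TOTAL_PARAMS = 1.02e12  # total parameters, from Model Details

# Bytes per parameter for common storage formats (illustrative, not model-specific).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weights-only footprint in GB (10^9 bytes); excludes KV cache and activations."""
    return params * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    print(f"{name}: {weight_memory_gb(TOTAL_PARAMS, size):,.0f} GB")
```

For comparison, eight 80 GB accelerators provide 640 GB in total, so how the full model fits on a single 8-GPU node depends on the weight format and on how much memory is left for the KV cache.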

## Citation

```bibtex
@misc{bleyzos2026coder,
  title={Bleyzos Coder},
  author={{Bleyzos AI Team}},
  year={2026},
  howpublished={\url{https://huggingface.co/Mini-Bleyz/Bleyzos-Coder}},
}
```

## Contact

- **Email**: support@bleyzos.ru
- **Website**: https://ai.bleyzos.com
- **Telegram**: https://t.me/bleyzos