	---
	license: mit
	language:
	- en
	- ru
	tags:
	- text-generation
	- agent
	- long-context
	- code
	- security
	- made-by-bleyzos
	pipeline_tag: text-generation
	---

	<br/><br/>

	<div align="center">
	<picture>
	<source srcset="https://cdn.bleyzos.ru/brand.png" media="(prefers-color-scheme: dark)">
	<img src="https://cdn.bleyzos.ru/brand.png" width="60%" alt="Bleyzos Coder" />
	</picture>
	</div>

	<br/>

	<br/>

	<div align="center" style="line-height: 1.2;">
	<strong>Community</strong><br/>
	<a href="https://t.me/bleyzos" target="_blank">Telegram</a>
	</div>

	<br/>

	# Bleyzos Coder

	Bleyzos Coder is an open-source Mixture-of-Experts (MoE) language model with 1.02T total parameters and 42B active parameters. It is built on a fork of MiMo-V2.5-Pro and fine-tuned for coding, cybersecurity, and agentic workflows. It supports context lengths of up to 1M tokens.

	## Model Details

	- Developer: Bleyzos AI (https://bleyzos.com)
	- Architecture: Mixture-of-Experts (MoE) with Hybrid Attention (SWA + GA)
	- Total Parameters: 1.02T
	- Active Parameters: 42B
	- Context Length: Up to 1M tokens
	- License: MIT

	## Key Features

	- Hybrid Attention: combines Sliding Window Attention (SWA) and Global Attention (GA) layers at a 6:1 ratio, reducing KV-cache size by ~7x
	- Multi-Token Prediction: 3 MTP layers, for 3x faster inference
	- Long Context: up to 1M tokens, enough to feed in entire codebases
	- Agentic: post-trained with SFT, RL, and Multi-Teacher Distillation for complex multi-step tasks
	- Security-First: built-in filters against prompt injection and data leaks
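
The ~7x KV-cache figure is consistent with the 6:1 layer ratio: at long contexts only the global-attention layers cache the full sequence, while each sliding-window layer caches at most one window. A minimal sketch of that arithmetic follows. It assumes a hypothetical window of 128 tokens (the actual window size is not stated in this card) and an equal per-token cache cost for every layer; `kv_cache_reduction` is an illustrative name, not part of the model's code.

```python
def kv_cache_reduction(context_len: int, window: int = 128, swa_per_global: int = 6) -> float:
    """Ratio of full-attention KV-cache size to hybrid KV-cache size.

    Counts cached token positions per group of (swa_per_global + 1) layers.
    `window` is a hypothetical value, not a documented setting.
    """
    layers_per_group = swa_per_global + 1
    # Baseline: every layer in the group caches the whole context.
    full = layers_per_group * context_len
    # Hybrid: SWA layers cache at most `window` tokens, the GA layer caches everything.
    hybrid = swa_per_global * min(window, context_len) + context_len
    return full / hybrid


if __name__ == "__main__":
    for ctx in (128, 32_768, 1_048_576):
        print(f"context={ctx:>9,}  reduction={kv_cache_reduction(ctx):.2f}x")
```

Under these assumptions the saving is negligible for contexts shorter than the window and approaches 7x as the context grows toward 1M tokens.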

	## Usage

	### Hugging Face Inference API

	```python
	from huggingface_hub import InferenceClient

	client = InferenceClient(model="Mini-Bleyz/Bleyzos-Coder")

	response = client.chat_completion(
	    messages=[{"role": "user", "content": "Write a Python function to reverse a linked list"}],
	    max_tokens=512,
	)

	print(response["choices"][0]["message"]["content"])
	```

	### SGLang Deployment (for GPU servers)

	```bash
	python3 -m sglang.launch_server \
	  --model-path Mini-Bleyz/Bleyzos-Coder \
	  --trust-remote-code \
	  --tp 8 \
	  --ep 8 \
	  --context-length 1048576 \
	  --host 0.0.0.0 \
	  --port 9001
	```
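
Once the server is up, it can be queried over HTTP. SGLang exposes an OpenAI-compatible API, so a chat request might look like the sketch below. It assumes the server from the command above is reachable at `localhost:9001` and that the served model name matches the `--model-path` value; `build_chat_request` and `ask` are illustrative helper names, not part of any library.

```python
import json
import urllib.request

# Assumed endpoint: matches --port 9001 from the launch command, on the local machine.
BASE_URL = "http://localhost:9001/v1/chat/completions"


def build_chat_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat completion payload (illustrative helper)."""
    return {
        "model": "Mini-Bleyz/Bleyzos-Coder",  # assumed to equal --model-path
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def ask(prompt: str, url: str = BASE_URL) -> str:
    """POST the payload to the server and return the assistant's reply text."""
    data = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `ask("Write a Python function to reverse a linked list")` would then return the generated reply as a string, provided the server is running.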

	## Benchmarks

	| Benchmark | Bleyzos Coder | MiMo-V2.5-Pro |
	|-----------|---------------|---------------|
	| BBH (3-shot) | 89.1 | 88.4 |
	| GSM8K (8-shot) | 99.8 | 99.6 |
	| HumanEval+ | 78.3 | 75.6 |
	| SWE-Bench (AgentLess) | 58.7 | 35.7 |
	| ClawEval pass³ | 65.2 | 63.8 |

	## Limitations

	- The full model requires significant GPU memory (8×A100/H100 recommended)
	- For CPU-only usage, a GGUF quantized version of the base model is available at [DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF](https://huggingface.co/DevQuasar/XiaomiMiMo.MiMo-V2.5-Pro-GGUF)
	- The system prompt is customized for the Bleyzos AI identity

	## Citation

	```bibtex
	@misc{bleyzos2026coder,
	  title={Bleyzos Coder},
	  author={{Bleyzos AI Team}},
	  year={2026},
	  howpublished={\url{https://huggingface.co/Mini-Bleyz/Bleyzos-Coder}},
	}
	```

	## Contact

	- Email: support@bleyzos.ru
	- Website: https://ai.bleyzos.com
	- Telegram: https://t.me/bleyzos