---
license: apache-2.0
tags:
- prime-rl
- moe
- test-model
library_name: transformers
---
<div align="center">
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/61e020e4a343274bb132e138/H2mcdPRWtl4iKLd-OYYBc.jpeg" width="200"/>
</div>
# glm4-moe-tiny
A small (~543M-parameter) GLM-4 MoE model for testing [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) integration with HuggingFace Transformers and vLLM.
## Purpose
This model is **not for production use**. It exists to:
- Validate MoE weight conversion between the HuggingFace and prime-rl formats
- Test the full RL training pipeline (inference server + trainer) at small scale
- Catch architecture-specific bugs without needing 100B+ parameter models
The model has been fine-tuned on [PrimeIntellect/Reverse-Text-SFT](https://huggingface.co/datasets/PrimeIntellect/Reverse-Text-SFT) so that it has a non-trivial output distribution for computing KL divergence during RL.
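For intuition, the reverse-text task asks the model to emit its input text reversed. The snippet below is only an illustrative sketch of such a pair; the exact prompt/response formatting in the dataset may differ.
```python
# Illustrative reverse-text pair (the actual dataset formatting may differ).
prompt = "Reverse the following text: hello world"
target = "hello world"[::-1]  # -> "dlrow olleh"
print(prompt)
print(target)
```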
## Quick Start
```bash
# Run RL with reverse-text environment
uv run rl @ configs/ci/integration/rl_moe/glm4_moe.toml
```
See the [Testing MoE at Small Scale](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/docs/testing-moe-at-small-scale.md) guide for full instructions.
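For a quick sanity check outside of prime-rl, the checkpoint can also be loaded directly with Transformers. This is a minimal sketch, assuming a hypothetical repo id `PrimeIntellect/glm4-moe-tiny` (replace with this card's actual hub path) and a Transformers version that includes the GLM-4 MoE architecture.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; replace with the actual hub path of this card.
repo_id = "PrimeIntellect/glm4-moe-tiny"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Smoke test: reverse-text style prompt, greedy decoding.
inputs = tokenizer("Reverse the following text: hello world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```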
## Model Details
| Parameter | Value |
|-----------|-------|
| Hidden size | 1024 |
| Layers | 24 |
| Experts | 8 |
| Active experts | 4 |
| Parameters | ~543M |
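To check the numbers above against the checkpoint itself, the config can be inspected with `AutoConfig`; a minimal sketch, again assuming a hypothetical repo id.
```python
from transformers import AutoConfig

# Hypothetical repo id; replace with this card's actual hub path.
config = AutoConfig.from_pretrained("PrimeIntellect/glm4-moe-tiny")

# Field names vary between MoE architectures; printing the full config shows
# the hidden size, layer count, and expert counts for this checkpoint.
print(config)
```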
## Links
- [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) - RL training framework
- [PrimeIntellect](https://www.primeintellect.ai/) - Building infrastructure for decentralized AI