---
license: apache-2.0
tags:
- prime-rl
- moe
- test-model
library_name: transformers
---
<div align="center">
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/61e020e4a343274bb132e138/H2mcdPRWtl4iKLd-OYYBc.jpeg" width="200"/>
</div>
# glm4-moe-tiny

A small (~543M parameter) GLM-4 MoE model for testing [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) integration with HuggingFace Transformers and vLLM.
## Purpose

This model is **not for production use**. It exists to:

- Validate MoE weight conversion between HuggingFace and prime-rl formats
- Test the full RL training pipeline (inference server + trainer) at small scale
- Catch architecture-specific bugs without needing 100B+ parameter models

The model has been fine-tuned on [PrimeIntellect/Reverse-Text-SFT](https://huggingface.co/datasets/PrimeIntellect/Reverse-Text-SFT) to provide a non-trivial distribution for the KL divergence term during RL.
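
As a quick sanity check of that fine-tuned distribution, the sketch below loads the checkpoint with plain Transformers and runs greedy generation on a reverse-text prompt. It is illustrative only: the repo id is an assumption, so replace it with this repository's actual path.

```python
# Minimal sanity check with HuggingFace Transformers (illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrimeIntellect/glm4-moe-tiny"  # assumed repo id; replace with this repo's path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# The SFT data is a reverse-text task, so prompt accordingly.
prompt = "Reverse the following text: hello world"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```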
## Quick Start

```bash
# Run RL with the reverse-text environment
uv run rl @ configs/ci/integration/rl_moe/glm4_moe.toml
```

See the [Testing MoE at Small Scale](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/docs/testing-moe-at-small-scale.md) guide for full instructions.
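
Because the model is also meant to exercise the vLLM inference path, a minimal offline-generation sketch with vLLM's Python API is shown below. The repo id is again an assumption, and a tiny GLM-4 MoE checkpoint may require a recent vLLM release with support for this architecture.

```python
# Offline generation with vLLM (illustrative; assumes your vLLM version supports the architecture).
from vllm import LLM, SamplingParams

llm = LLM(model="PrimeIntellect/glm4-moe-tiny")  # assumed repo id; replace with this repo's path
params = SamplingParams(temperature=0.0, max_tokens=32)

outputs = llm.generate(["Reverse the following text: hello world"], params)
print(outputs[0].outputs[0].text)
```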
## Model Details

| Parameter | Value |
|-----------|-------|
| Hidden size | 1024 |
| Layers | 24 |
| Experts | 8 |
| Active experts | 4 |
| Parameters | ~543M |
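
To confirm that a downloaded checkpoint matches the table above, you can read the values back from its config. The expert-count field names below are assumptions based on common MoE config conventions, so the snippet uses `getattr` and you should check the repo's `config.json` for the exact keys.

```python
# Read the architecture parameters back from the checkpoint's config (illustrative).
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("PrimeIntellect/glm4-moe-tiny")  # assumed repo id

print("hidden size:", cfg.hidden_size)
print("layers:", cfg.num_hidden_layers)
# Expert-count key names vary across MoE config classes; consult config.json for the exact keys.
print("experts:", getattr(cfg, "n_routed_experts", None))
print("active experts:", getattr(cfg, "num_experts_per_tok", None))
```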
## Links

- [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) - RL training framework
- [PrimeIntellect](https://www.primeintellect.ai/) - Building infrastructure for decentralized AI