---
license: apache-2.0
tags:
- prime-rl
- moe
- test-model
library_name: transformers
---
<div align="center">
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/61e020e4a343274bb132e138/H2mcdPRWtl4iKLd-OYYBc.jpeg" width="200"/>
</div>
# glm4-moe-tiny

A small (~543M parameter) GLM-4 MoE model for testing [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) integration with HuggingFace Transformers and vLLM.
## Purpose

This model is **not for production use**. It exists to:

- Validate MoE weight conversion between HuggingFace and prime-rl formats
- Test the full RL training pipeline (inference server + trainer) at small scale
- Catch architecture-specific bugs without needing 100B+ parameter models

The model has been fine-tuned on [PrimeIntellect/Reverse-Text-SFT](https://huggingface.co/datasets/PrimeIntellect/Reverse-Text-SFT) to provide a non-trivial distribution for the KL divergence term during RL.
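
As a quick sanity check of that fine-tuned distribution, the sketch below loads the checkpoint with plain Transformers and runs greedy generation on a reverse-text prompt. It is illustrative only: the repo id is an assumption, so replace it with this repository's actual path.

```python
# Minimal sanity check with HuggingFace Transformers (illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrimeIntellect/glm4-moe-tiny"  # assumed repo id; replace with this repo's path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# The SFT data is a reverse-text task, so prompt accordingly.
prompt = "Reverse the following text: hello world"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```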
## Quick Start

```bash
# Run RL with the reverse-text environment
uv run rl @ configs/ci/integration/rl_moe/glm4_moe.toml
```

See the [Testing MoE at Small Scale](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/docs/testing-moe-at-small-scale.md) guide for full instructions.
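
Because the model is also meant to exercise the vLLM inference path, a minimal offline-generation sketch with vLLM's Python API is shown below. The repo id is again an assumption, and a tiny GLM-4 MoE checkpoint may require a recent vLLM release with support for this architecture.

```python
# Offline generation with vLLM (illustrative; assumes your vLLM version supports the architecture).
from vllm import LLM, SamplingParams

llm = LLM(model="PrimeIntellect/glm4-moe-tiny")  # assumed repo id; replace with this repo's path
params = SamplingParams(temperature=0.0, max_tokens=32)

outputs = llm.generate(["Reverse the following text: hello world"], params)
print(outputs[0].outputs[0].text)
```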
## Model Details

| Parameter | Value |
|-----------|-------|
| Hidden size | 1024 |
| Layers | 24 |
| Experts | 8 |
| Active experts | 4 |
| Parameters | ~543M |
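
To confirm that a downloaded checkpoint matches the table above, you can read the values back from its config. The expert-count field names below are assumptions based on common MoE config conventions, so the snippet uses `getattr` and you should check the repo's `config.json` for the exact keys.

```python
# Read the architecture parameters back from the checkpoint's config (illustrative).
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("PrimeIntellect/glm4-moe-tiny")  # assumed repo id

print("hidden size:", cfg.hidden_size)
print("layers:", cfg.num_hidden_layers)
# Expert-count key names vary across MoE config classes; consult config.json for the exact keys.
print("experts:", getattr(cfg, "n_routed_experts", None))
print("active experts:", getattr(cfg, "num_experts_per_tok", None))
```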
## Links

- [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) - RL training framework
- [PrimeIntellect](https://www.primeintellect.ai/) - Building infrastructure for decentralized AI