---
license: apache-2.0
tags:
- moe
- mixture-of-experts
- causal-lm
- olmoe
- distributed-training
- decentralized-training
- sparse-sync
language:
- en
pipeline_tag: text-generation
---
# SPES-9B
SPES-9B is a pretrained language model released as part of the paper:
**Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm**
## Model Details
- **Model name:** SPES-9B
- **Model type:** Causal language model
- **Parameters:** 9B
- **Framework:** SPES
- **License:** Apache-2.0
## Project Links
- **GitHub:** https://github.com/zjr2000/SPES
- **Paper:** https://huggingface.co/papers/2602.11543
## Intended Use
This model is intended for:
- research on decentralized LLM pretraining
- research on MoE training and synchronization
- experimentation and evaluation of pretrained language models
## Citation
If you use this model, please cite the SPES paper.
```bibtex
@article{zhang2026spes,
  title={Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm},
  author={Zhang, Jinrui and Xiao, Chaodong and Wu, Aoqi and Zhang, Xindong and Zhang, Lei},
  journal={arXiv preprint arXiv:2602.11543},
  year={2026}
}
```