---
license: apache-2.0
tags:
- moe
- mixture-of-experts
- causal-lm
- olmoe
- distributed-training
- decentralized-training
- sparse-sync
language:
- en
pipeline_tag: text-generation
---

# SPES-9B

SPES-9B is a pretrained language model released as part of the paper **Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm**.

## Model Details

- **Model name:** SPES-9B
- **Model type:** Causal language model
- **Parameters:** 9B
- **Framework:** SPES
- **License:** Apache-2.0

## Project Links

- **GitHub:** https://github.com/zjr2000/SPES
- **Paper:** https://huggingface.co/papers/2602.11543

## Intended Use

This model is intended for:

- research on decentralized LLM pretraining
- research on MoE training and synchronization
- experimentation and evaluation of pretrained language models

## Citation

If you use this model, please cite the SPES paper.

```bibtex
@article{zhang2026spes,
  title={Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm},
  author={Zhang, Jinrui and Xiao, Chaodong and Wu, Aoqi and Zhang, Xindong and Zhang, Lei},
  year={2026}
}
```