---
license: apache-2.0
tags:
  - moe
  - mixture-of-experts
  - causal-lm
  - olmoe
  - distributed-training
  - decentralized-training
  - sparse-sync
language:
  - en
pipeline_tag: text-generation
---

SPES-9B

SPES-9B is a pretrained language model released as part of the paper:

Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm

Model Details

  • Model name: SPES-9B
  • Model type: Causal language model
  • Parameters: 9B
  • Framework: SPES
  • License: Apache-2.0

Project Links

Intended Use

This model is intended for:

  • research on decentralized LLM pretraining
  • research on MoE training and synchronization
  • experimentation and evaluation of pretrained language models

Citation

If you use this model, please cite the SPES paper.

@article{zhang2026spes,
  title={Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm},
  author={Zhang, Jinrui and Xiao, Chaodong and Wu, Aoqi and Zhang, Xindong and Zhang, Lei},
  year={2026}
}