---
license: apache-2.0
tags:
- moe
- mixture-of-experts
- causal-lm
- olmoe
- distributed-training
- decentralized-training
- sparse-sync
language:
- en
pipeline_tag: text-generation
---

# SPES-9B

SPES-9B is a pretrained language model released as part of the paper:

**Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm**

## Model Details

- **Model name:** SPES-9B
- **Model type:** Causal language model
- **Parameters:** 9B
- **Framework:** SPES
- **License:** Apache-2.0

## Project Links

- **GitHub:** https://github.com/zjr2000/SPES
- **Paper:** https://huggingface.co/papers/2602.11543

## Intended Use

This model is intended for:

- research on decentralized LLM pretraining
- research on MoE training and synchronization
- experimentation and evaluation of pretrained language models
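For quick experimentation, the checkpoint can likely be loaded with the Hugging Face `transformers` library, as for other causal language models. This is a minimal sketch under stated assumptions, not an official recipe: the repository id `zjr2000/SPES-9B` is a guess based on the GitHub account above, and the model may require `trust_remote_code=True` or custom loading code — check the GitHub repository for the official instructions.

```python
# Minimal inference sketch. Assumptions (not confirmed by this card):
# the weights live at the hypothetical repo id "zjr2000/SPES-9B" and
# load via the standard AutoModelForCausalLM interface.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "zjr2000/SPES-9B"  # hypothetical repo id


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy-decode a short continuation of `prompt`."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # ~18 GB of weights at 9B params in bf16
        device_map="auto",           # place layers across available devices
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Decentralized pretraining works by"))
```

Note that loading a 9B-parameter model requires substantial GPU memory; `device_map="auto"` lets `accelerate` shard the weights across multiple GPUs or offload to CPU if a single device is insufficient.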


## Citation

If you use this model, please cite the SPES paper.

```bibtex
@article{zhang2026spes,
  title={Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm},
  author={Zhang, Jinrui and Xiao, Chaodong and Wu, Aoqi and Zhang, Xindong and Zhang, Lei},
  year={2026}
}
```