---
license: mit
pipeline_tag: text-generation
library_name: transformers
tags:
  - reap
  - trinity
---

Support this work → · X · GitHub · REAP paper · Cerebras REAP

Trinity-337B

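A REAP-pruned variant of the base model. REAP (Router-weighted Expert Activation Pruning) is a one-shot method that compresses Mixture-of-Experts models by removing experts.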

At a glance

| Property | Value |
|---|---|
| Base model | |
| Format | BF16 |
| Total params | 337B |
| Active / token | |
| Experts / layer | 216 |
| Layers | 60 |
| Hidden size | 3072 |
| Context | 262,144 |
| On-disk size | 675 GB |
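
As a sanity check on the figures above, the on-disk size follows almost directly from the parameter count and the storage format. The sketch below is a rough estimate, not a measurement: it assumes 2 bytes per parameter for BF16 and 0.5 bytes per parameter for 4-bit weights, and it ignores quantization scales, zero-points, and any layers kept in higher precision, so a real W4A16 checkpoint of the same model will be somewhat larger than the number printed here. The helper name `weight_gb` is illustrative only.

```python
# Rough weight-storage estimate from the figures in "At a glance".
TOTAL_PARAMS = 337e9  # "Total params 337B" as listed on this card

# Assumed bytes per stored weight for each format.
BYTES_PER_PARAM = {
    "BF16": 2.0,   # bfloat16 uses 16 bits per weight
    "W4A16": 0.5,  # 4-bit weights only; ignores scales and unquantized layers
}


def weight_gb(total_params: float, fmt: str) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return total_params * BYTES_PER_PARAM[fmt] / 1e9


for fmt in BYTES_PER_PARAM:
    print(f"{fmt}: ~{weight_gb(TOTAL_PARAMS, fmt):.1f} GB")
```

The BF16 estimate of about 674 GB is consistent with the 675 GB listed above. Memory needed at inference time is higher than either figure, since the KV cache and activations come on top of the weights.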

Which variant should I pick?

| Variant | Format | Link |
|---|---|---|
| Trinity-337B (this) | BF16 | link |
| Trinity-337B-W4A16 | W4A16 | link |
| Trinity-337B-W4A16-192 | W4A16 | link |
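
For the BF16 variant, loading through `transformers` might look like the minimal sketch below. Several things here are assumptions rather than facts from this card: `REPO_ID` is a hypothetical placeholder (the card does not state the full repository id), the function names `load_trinity` and `generate` are illustrative, and the sketch assumes the installed `transformers` version supports the model architecture natively and that `accelerate` is available for `device_map="auto"`. The W4A16 variants may need a different loading path and are not covered here.

```python
# Hypothetical placeholder: replace with the actual repository id.
REPO_ID = "your-namespace/Trinity-337B"


def load_trinity(repo_id: str = REPO_ID):
    """Load the tokenizer and the BF16 weights, sharded across available devices."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 format of this variant
        device_map="auto",           # assumption: `accelerate` is installed
    )
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 64) -> str:
    """Run a single generation and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `load_trinity()` downloads and maps the full checkpoint, so it only makes sense on a machine with enough combined accelerator and host memory for the BF16 weights.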

License & citation

License inherited from the base model.

@misc{lasby2025reap,
  title  = {REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression},
  author = {Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
  year   = {2025}, eprint = {2510.13999}, archivePrefix = {arXiv}
}

Sponsors

Made possible by NVIDIA · TNG Technology · Lambda · Prime Intellect · Hot Aisle.