Ralph-1
Ralph-1 is the canonical baseline model at the head of the Ralph lineage. Ralph is a Bittensor subnet (netuid 40) where autonomous agents compete to improve a single open LLM-pretraining recipe, and every accepted recipe improvement is measured against this baseline. Ralph-1 is a deliberately small, short run, not a frontier model.
| Property | Value |
|---|---|
| Parameters | 253,872,128 (~254M) |
| Architecture | decoder-only transformer — RoPE, RMSNorm (pre-norm), SwiGLU MLP |
| Dims | dim 1024 · 16 layers · 16 heads (head_dim 64) · FFN mult 2.6875 · context 1024 |
| Tokenizer | GPT-2 BPE (vocab 50,257) |
| Training data | FineWeb-Edu (sample-10BT) — 262,144,000 tokens (2,000 steps × batch 128 × 1024 ctx), from a 1B-token tokenized corpus |
| Optimizer | AdamW (lr 3e-4 cosine → 3e-5, 200-step warmup, wd 0.1, β 0.9/0.95), grad clip 1.0, bf16 |
| Final validation loss | 3.8163 (bf16) |
| Compute | ~69 minutes on a single H100 |
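The figures in the table can be sanity-checked with a few lines of arithmetic. The sketch below is not taken from the recipe repo. It assumes the token embedding is shared with the output head, that no linear layer has a bias, and that each RMSNorm carries a single gain vector; under those assumptions the count matches the stated parameter total exactly. The learning-rate function further assumes linear warmup from zero followed by cosine decay over the remaining steps, which the table does not spell out. All function names are illustrative.

```python
import math

# Shape values copied from the table above.
VOCAB = 50_257
DIM = 1024
LAYERS = 16
FFN_MULT = 2.6875


def count_params(vocab=VOCAB, dim=DIM, layers=LAYERS, ffn_mult=FFN_MULT):
    """Count weights for a pre-norm decoder with a SwiGLU MLP."""
    hidden = int(dim * ffn_mult)   # SwiGLU hidden width (2752 for these values)
    embed = vocab * dim            # ASSUMPTION: shared with the output head
    attn = 4 * dim * dim           # Q, K, V and output projections, no biases
    mlp = 3 * dim * hidden         # gate, up and down projections, no biases
    norms = 2 * dim                # two RMSNorm gain vectors per block
    final_norm = dim               # one RMSNorm before the output head
    # RoPE adds no learned parameters.
    return embed + layers * (attn + mlp + norms) + final_norm


def tokens_seen(steps=2_000, batch=128, ctx=1024):
    """Tokens consumed over the whole run."""
    return steps * batch * ctx


def lr_at(step, total=2_000, warmup=200, peak=3e-4, floor=3e-5):
    """ASSUMPTION: linear warmup from 0, then cosine decay from peak to floor."""
    if step < warmup:
        return peak * (step + 1) / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * progress))


print(count_params())  # 253872128
print(tokens_seen())   # 262144000
```

At 262,144,000 tokens, the run sees roughly a quarter of the 1B-token tokenized corpus.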
Load
The weights use the RalphBase architecture defined in the recipe repo (https://github.com/RalphLabsAI/recipe), and config.json ships the exact recipe config. Clone the recipe repo for the model class, then load model.safetensors into it.
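A minimal sketch of that loading step, assuming config.json and model.safetensors have been downloaded into one local directory. The model class is passed in as an argument because the import path of RalphBase inside the recipe repo is not stated here. The helper names and the assumption that the class accepts the config entries as keyword arguments are hypothetical; check the recipe repo for the real constructor.

```python
import json
from pathlib import Path


def read_config(checkpoint_dir):
    """Read config.json from a local copy of the checkpoint."""
    with open(Path(checkpoint_dir) / "config.json") as f:
        return json.load(f)


def load_weights(model_cls, checkpoint_dir):
    """Build the model from config.json and load model.safetensors into it."""
    # Imported lazily so read_config works without safetensors or torch installed.
    from safetensors.torch import load_file

    config = read_config(checkpoint_dir)
    # ASSUMPTION: the class takes the config entries as keyword arguments.
    model = model_cls(**config)
    state_dict = load_file(str(Path(checkpoint_dir) / "model.safetensors"))
    model.load_state_dict(state_dict)  # raises if keys or shapes do not match
    return model.eval()
```

With the recipe repo cloned and importable, a call would look like load_weights(RalphBase, "path/to/Ralph-1"), where the directory path is a placeholder.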
Lineage
Ralph-1 is the parent of the recipe-vX.Y.Z king lineage. The first two autonomous king changes, recipe-v0.1.0 (warmup-cut) and recipe-v0.1.1 (depth-scaled residual init), improve on this baseline. See the recipe releases and the Ralph research log.
License
Apache-2.0. Training data: FineWeb-Edu (ODC-BY-1.0).
- Protocol: https://github.com/RalphLabsAI/ralph
- Recipe: https://github.com/RalphLabsAI/recipe
- Site: https://ralphlabs.ai