license: mit
language:
- en
tags:
- rust
- burn
- gpt
- tinystories
- from-scratch
- decoder-transformer
pipeline_tag: text-generation
inference: false
datasets:
- roneneldan/TinyStories
RitsuGPT
A small, from-scratch GPT in pure Rust — it trains on a single consumer GPU (an NVIDIA GeForce RTX 5060, 8 GB) and runs on your own computer. nanoGPT, in Rust.
Trainer & source code: github.com/NeonixLabs/RitsuGPT · Part of Neonix Labs.
What it is, honestly: a ~16.9M-parameter small language model in the spirit of TinyStories (Eldan & Li, 2023). It learns to write simple, coherent short English stories. It is not a production assistant — no world knowledge, no reasoning, no instruction following. Its value is a clean, hackable, from-scratch stack you can train and verify yourself.
Files
| File | What |
|---|---|
| ritsu-step25000.mpk | Weights at 25,000 steps (recommended), burn CompactRecorder format |
| ritsu-step12000.mpk | Weights at 12,000 steps (earlier checkpoint) |
| tokenizer.json | Byte-level BPE tokenizer (vocab 8192), HuggingFace tokenizers format |
Results
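Evaluation reports bits-per-byte (BPB) on the TinyStories validation set. BPB normalizes the loss by the number of raw text bytes rather than by the number of tokens, so it is comparable across tokenizers. Lower is better.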
| Checkpoint | Steps | BPB |
|---|---|---|
| ritsu-step12000.mpk | 12,000 | 0.695 |
| ritsu-step25000.mpk | 25,000 | 0.6843 |
| byte-level baseline | — | 0.805 |
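For readers who want to check a BPB figure themselves, the conversion from a cross-entropy loss is short. The following is a minimal sketch, assuming the evaluation accumulates per-token negative log-likelihood in nats (natural log) and counts the UTF-8 bytes of the text being scored. The function names are hypothetical and are not taken from the RitsuGPT source.

```rust
use std::f64::consts::LN_2;

/// Convert a summed cross-entropy loss into bits per byte.
///
/// `total_nats`  : sum of per-token negative log-likelihoods, in nats.
/// `total_bytes` : number of UTF-8 bytes in the text those tokens encode.
///
/// Dividing by ln(2) converts nats to bits; dividing by the byte count
/// removes the dependence on how the tokenizer split the text.
pub fn bits_per_byte(total_nats: f64, total_bytes: usize) -> f64 {
    assert!(total_bytes > 0, "need at least one byte of text");
    total_nats / LN_2 / total_bytes as f64
}

/// Same conversion when only the mean per-token loss is available
/// (the form most training loops log).
pub fn bits_per_byte_from_mean(
    mean_nats_per_token: f64,
    num_tokens: usize,
    total_bytes: usize,
) -> f64 {
    bits_per_byte(mean_nats_per_token * num_tokens as f64, total_bytes)
}
```

Because the denominator is bytes, a tokenizer that packs more bytes into each token has a higher per-token loss for the same BPB, which is why per-token losses from models with different vocabularies cannot be compared directly.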
How to run
This is a Rust / burn model — not a transformers model — so there is no hosted inference widget. Run it locally with the trainer:
git clone https://github.com/NeonixLabs/RitsuGPT
cd RitsuGPT
# put ritsu-step25000.mpk and tokenizer.json in this folder (download them from this repo)
cargo run --release --bin neonix-train -- sample ./ritsu-step25000 ./tokenizer.json "Once upon a time" 200 0.8 40
Pass the checkpoint path without the .mpk suffix — the loader appends it. Inference runs on CPU.
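If you script the call, it helps to normalize the checkpoint argument first. The following is a minimal sketch of the suffix convention described above. Both helpers are hypothetical and are not part of the RitsuGPT source, and `checkpoint_file` assumes the loader appends `.mpk` to whatever stem it receives.

```rust
/// Hypothetical helper: turn a checkpoint file name into the suffix-free
/// stem that the `sample` command is documented to take.
/// A path that already lacks the suffix is returned unchanged.
pub fn checkpoint_stem(path: &str) -> &str {
    path.strip_suffix(".mpk").unwrap_or(path)
}

/// Hypothetical helper: the file name expected on disk for a given stem,
/// assuming the loader appends ".mpk" to the stem.
pub fn checkpoint_file(stem: &str) -> String {
    format!("{stem}.mpk")
}
```

For example, `checkpoint_stem("./ritsu-step25000.mpk")` yields `./ritsu-step25000`, which is the form used in the command above.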
Architecture
A standard decoder-only Transformer, optimized in Rust.
License
MIT. Trained on the public TinyStories dataset.