| --- |
| license: mit |
| language: |
| - en |
| tags: |
| - rust |
| - burn |
| - gpt |
| - tinystories |
| - from-scratch |
| - decoder-transformer |
| pipeline_tag: text-generation |
| inference: false |
| datasets: |
| - roneneldan/TinyStories |
| --- |
| |
| # RitsuGPT |
|
|
| A small, from-scratch GPT in **pure Rust**. It trains on a single consumer GPU (an NVIDIA GeForce RTX 5060, 8 GB) and runs on your own computer. *nanoGPT, in Rust.* |
|
|
| Trainer & source code: **[github.com/NeonixLabs/RitsuGPT](https://github.com/NeonixLabs/RitsuGPT)** · Part of [Neonix Labs](https://labs.neonix.ai). |
|
|
| > **What it is, honestly:** a ~16.9M-parameter small language model in the spirit of [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) (Eldan & Li, 2023). It learns to write simple, coherent short English stories. It is **not** a production assistant: it has no world knowledge, no reasoning, and no instruction following. Its value is a clean, hackable, from-scratch stack you can train and verify yourself. |
|
|
| ## Files |
|
|
| | File | What | |
| |---|---| |
| | `ritsu-step25000.mpk` | Weights at 25,000 steps (recommended), `burn` `CompactRecorder` format | |
| | `ritsu-step12000.mpk` | Weights at 12,000 steps (earlier checkpoint) | |
| | `tokenizer.json` | Byte-level BPE tokenizer (vocab 8192), HuggingFace `tokenizers` format | |
|
|
| ## Results |
|
|
| Evaluation reports **bits-per-byte (BPB)** on the TinyStories validation set. BPB is tokenizer-invariant, and lower is better. |
|
|
| | Checkpoint | Steps | BPB | |
| |---|---|---| |
| | `ritsu-step12000.mpk` | 12,000 | 0.695 | |
| | `ritsu-step25000.mpk` | 25,000 | **0.6843** | |
| | byte-level baseline | n/a | 0.805 | |
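
For readers who want to reproduce the metric, here is a minimal sketch of how bits-per-byte can be derived from a summed token-level cross-entropy. It assumes the loss is measured in nats (natural-log cross-entropy, the usual convention) and that the byte count is the UTF-8 length of the evaluated text. The function name `bits_per_byte` and the numbers in `main` are hypothetical illustrations, not taken from the RitsuGPT trainer or its results.

```rust
/// Convert a summed negative log-likelihood (in nats) over some text
/// into bits per byte. Returns `None` when there are no bytes to
/// normalize by.
fn bits_per_byte(total_nll_nats: f64, total_bytes: usize) -> Option<f64> {
    if total_bytes == 0 {
        return None;
    }
    // nats -> bits: divide by ln(2); then normalize by raw byte count.
    Some(total_nll_nats / std::f64::consts::LN_2 / total_bytes as f64)
}

fn main() {
    // Hypothetical values for illustration only (not RitsuGPT measurements).
    let mean_loss_nats_per_token = 1.9_f64;
    let tokens = 1_000_usize;
    let bytes = 4_000_usize;

    let total_nll = mean_loss_nats_per_token * tokens as f64;
    match bits_per_byte(total_nll, bytes) {
        Some(bpb) => println!("BPB = {bpb:.4}"),
        None => println!("no bytes evaluated"),
    }
}
```

Because the denominator is the number of bytes rather than the number of tokens, two models with different tokenizers can be compared on the same text.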
|
|
| ## How to run |
|
|
| This is a Rust / `burn` model, not a `transformers` model, so there is no hosted inference widget. Run it locally with the trainer: |
|
|
| ```bash |
| git clone https://github.com/NeonixLabs/RitsuGPT |
| cd RitsuGPT |
| # put ritsu-step25000.mpk and tokenizer.json in this folder (download them from this repo) |
| cargo run --release --bin neonix-train -- sample ./ritsu-step25000 ./tokenizer.json "Once upon a time" 200 0.8 40 |
| ``` |
|
|
| Pass the checkpoint path **without** the `.mpk` suffix; the loader appends it. Inference runs on CPU. |
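
If you wrap the command in your own tooling, a small helper can guard against passing the suffix by accident. The sketch below is a hypothetical helper, not part of the RitsuGPT code base. It uses only the Rust standard library and strips a trailing `.mpk` extension, leaving any other path untouched.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: return the checkpoint path without a trailing
/// `.mpk` extension. Paths with another extension (or none) are
/// returned unchanged.
fn checkpoint_stem(path: &Path) -> PathBuf {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some(ext) if ext.eq_ignore_ascii_case("mpk") => path.with_extension(""),
        _ => path.to_path_buf(),
    }
}

fn main() {
    // Both spellings resolve to the same argument for the sampler.
    for raw in ["./ritsu-step25000.mpk", "./ritsu-step25000"] {
        let stem = checkpoint_stem(Path::new(raw));
        println!("{} -> {}", raw, stem.display());
    }
}
```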
|
|
| ## Architecture |
|
|
| A standard decoder-only Transformer, implemented in Rust with `burn`. |
|
|
| ## License |
|
|
| [MIT](https://github.com/NeonixLabs/RitsuGPT). Trained on the public [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset. |
|
|