---
{}
---

(v2 train scripts: RoPE positional encoding)
# SpiceeChat: Train From Scratch via Flash3 Kernels

A custom TinyGPT model trained from scratch with a BPE tokenizer and a causal language modeling (LM) pipeline.

## Model Details

| Property | Value |
|---|---|
| Architecture | TinyGPT (custom GPT-style) |
| Layers | 4 |
| Heads | 4 |
| Hidden size | 384 |
| Context length | 128 tokens |
| Vocab size | 32,768 |
| Attention | Torch (T4 compatible) |

## Files

- `checkpoint_step_*.pt`: model weights
- `tokenizer/`: BPE tokenizer trained on the same data
- `config.json`: model hyperparameters

## Load

```python
import torch
from tokenizers import Tokenizer

# Requires train.py in the same directory
from train import TinyGPT, GPTConfig

# Load tokenizer
tok = Tokenizer.from_file("tokenizer/tokenizer.json")

# Build the model with the hyperparameters from the Model Details table
cfg = GPTConfig(vocab_size=32768, ctx_len=128, n_layer=4, n_head=4, n_embd=384, attention_backend="torch")
model = TinyGPT(cfg)

# Load weights; point this at the checkpoint file you downloaded
ckpt = torch.load("latest.pt", map_location="cpu")
model.load_state_dict(ckpt["model"])
model.eval()  # switch to evaluation mode
```
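
Once the model is loaded, text generation has to respect the 128-token context length. The following is a minimal, framework-agnostic sketch of greedy decoding with a sliding context window. It is not taken from `train.py`: the names `greedy_generate`, `next_token_logits`, and `toy_logits` are hypothetical, and the toy scoring function only stands in for the real model.

```python
from typing import Callable, List, Sequence


def greedy_generate(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int,
    ctx_len: int = 128,
) -> List[int]:
    """Greedy decoding with a sliding context window.

    `next_token_logits` is a hypothetical callable: it receives at most
    `ctx_len` token ids and returns one score per vocabulary entry for
    the next token.
    """
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        window = ids[-ctx_len:]  # keep only the most recent ctx_len tokens
        logits = next_token_logits(window)
        # Greedy step: pick the index with the highest score (argmax).
        next_id = max(range(len(logits)), key=lambda i: logits[i])
        ids.append(next_id)
    return ids


# Toy stand-in for the model (assumption): a 5-token vocabulary where the
# best next token is always (last token + 1) mod 5.
def toy_logits(window: List[int]) -> List[float]:
    target = (window[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


print(greedy_generate(toy_logits, [0], max_new_tokens=4))  # [0, 1, 2, 3, 4]
```

To use this with the loaded model, one would wrap `model` in a function that returns the last position's logits as a list. That wrapper depends on what `TinyGPT` returns from its forward pass in `train.py`, which this README does not specify. Prompt ids can come from `tok.encode(text).ids`, and `tok.decode(ids)` turns the generated ids back into text.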