---
{}
---

(v2 train scripts: RoPE Positional Encoding)

# SpiceeChat: Train From Scratch via Flash3 Kernels

A custom TinyGPT model trained from scratch with a BPE tokenizer and a causal language modeling pipeline.

## Model Details

| Property | Value |
|---|---|
| Architecture | TinyGPT (custom GPT-style) |
| Layers | 4 |
| Heads | 4 |
| Hidden size | 384 |
| Context length | 128 |
| Vocab size | 32,768 |
| Attention | Torch (T4 compatible) |

## Files

- `checkpoint_step_*.pt`: model weights
- `tokenizer/`: BPE tokenizer trained on the same data
- `config.json`: model hyperparameters

## Load

```python
import torch
from tokenizers import Tokenizer

# Load the BPE tokenizer shipped in tokenizer/
tok = Tokenizer.from_file("tokenizer/tokenizer.json")

# Load the model classes (requires train.py in the same directory)
from train import TinyGPT, GPTConfig

# Hyperparameters match the Model Details table above
cfg = GPTConfig(
    vocab_size=32768,
    ctx_len=128,
    n_layer=4,
    n_head=4,
    n_embd=384,
    attention_backend="torch",
)
model = TinyGPT(cfg)

# "latest.pt" is the filename used here; substitute a checkpoint_step_*.pt
# file if that is what you downloaded.
ckpt = torch.load("latest.pt", map_location="cpu")
model.load_state_dict(ckpt["model"])  # weights are stored under the "model" key
model.eval()  # inference mode
```
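
## Approximate Parameter Count

The hyperparameters in the Model Details table are enough for a rough size estimate. The sketch below is not taken from `train.py`. It assumes a standard GPT-style block (four attention projections plus an MLP with 4x expansion), an output head tied to the token embedding, and no learned position table, since RoPE adds no weights. Biases and normalization parameters are ignored. The function name `estimate_params` and the `tied_embeddings` flag are hypothetical.

```python
def estimate_params(vocab_size, n_layer, n_embd, tied_embeddings=True):
    """Approximate weight count for a GPT-style decoder.

    Assumptions (not confirmed by train.py): 4x MLP expansion,
    no learned positional table (RoPE), biases and norms ignored.
    """
    embed = vocab_size * n_embd          # token embedding table
    attn = 4 * n_embd * n_embd           # Q, K, V and output projections
    mlp = 2 * n_embd * (4 * n_embd)      # up and down projections
    blocks = n_layer * (attn + mlp)
    # An untied output head would add a second vocab_size x n_embd matrix.
    head = 0 if tied_embeddings else vocab_size * n_embd
    return embed + blocks + head


total = estimate_params(vocab_size=32768, n_layer=4, n_embd=384)
print(f"{total:,} parameters (approx.)")
```

Under these assumptions the estimate is about 19.7M weights, of which roughly 12.6M sit in the embedding table. The exact figure depends on how `TinyGPT` is actually defined, so treat this as an order-of-magnitude check rather than a specification.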