tinystories-gpt
Tiny GPT (~1.31M params, byte-level BPE vocab 4096) trained on TinyStories for an LLM workshop (chapter 1: run a finished model, then rebuild it).
- Architecture: nanoGPT (Andrej Karpathy, MIT)
- Data: TinyStories (Eldan & Li, 2023)
- Files: ckpt.pt (model_args + weights), tokenizer.json (BPE 4096)
Generates simple children's stories ("Once upon a time, there was a little girl named Lily..."). See the workshop's chapter 1 notebook for loading/usage.
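The chapter 1 notebook is the reference for loading. As a rough orientation only, the sketch below shows one way the two files might be opened. It assumes ckpt.pt follows the usual nanoGPT checkpoint layout (weights under a "model" key, possibly with a "_orig_mod." prefix left by torch.compile) and that tokenizer.json is in the Hugging Face tokenizers format; neither assumption is confirmed here. The helper names are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def clean_state_dict(state_dict: Dict[str, Any], prefix: str = "_orig_mod.") -> Dict[str, Any]:
    """Strip the prefix that torch.compile adds to parameter names, if present."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def load_checkpoint(path: str = "ckpt.pt") -> Tuple[Dict[str, Any], Dict[str, Any], int]:
    """Return (model_args, weights, parameter count) from the checkpoint file."""
    import torch  # imported lazily so the helper above works without PyTorch

    ckpt = torch.load(path, map_location="cpu")
    model_args = ckpt["model_args"]            # listed in the file description above
    weights = clean_state_dict(ckpt["model"])  # ASSUMPTION: nanoGPT-style "model" key
    n_params = sum(t.numel() for t in weights.values())
    return model_args, weights, n_params


def encode_prompt(text: str, path: str = "tokenizer.json") -> List[int]:
    """Encode a prompt to token ids (ASSUMPTION: Hugging Face tokenizers format)."""
    from tokenizers import Tokenizer

    return Tokenizer.from_file(path).encode(text).ids
```

With both files in the working directory, `load_checkpoint()` followed by `encode_prompt("Once upon a time")` gives the pieces that nanoGPT's `GPT` class and its `generate` method would need. The raw count may differ from the ~1.31M figure depending on whether tied embeddings and position embeddings are counted.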