tinystories-gpt

A tiny GPT (~1.31M parameters, byte-level BPE tokenizer with a 4096-token vocabulary) trained on TinyStories for an LLM workshop (chapter 1: run a finished model, then rebuild it).

  • Architecture: nanoGPT (Andrej Karpathy, MIT)
  • Data: TinyStories (Eldan & Li, 2023)
  • Files: ckpt.pt (model_args + weights), tokenizer.json (BPE 4096)
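
As a rough sanity check on the ~1.31M figure, the sketch below counts parameters for a nanoGPT-style model with tied input/output embeddings and no bias terms. The card does not state the layer count or embedding width, so `n_layer=4` and `n_embd=128` are guesses picked only because they land near the stated size; the real values live in the checkpoint's model_args. Position embeddings are excluded, which follows nanoGPT's reporting convention.

```python
def gpt_param_count(vocab_size: int, n_embd: int, n_layer: int) -> int:
    """Parameter count for a nanoGPT-style model (bias-free, tied embeddings).

    Position embeddings are left out, as in nanoGPT's reported count.
    """
    # Token embedding matrix, shared with the output head (weight tying).
    tok_emb = vocab_size * n_embd
    # Per block: attention QKV (3*d*d) + attention output projection (d*d)
    # + MLP up-projection (d*4d) + MLP down-projection (4d*d) = 12*d*d.
    block_weights = 12 * n_embd * n_embd
    # Per block: two LayerNorm weight vectors.
    block_norms = 2 * n_embd
    # One final LayerNorm after the last block.
    final_norm = n_embd
    return tok_emb + n_layer * (block_weights + block_norms) + final_norm


# Hypothetical configuration (NOT taken from the card): 4 layers, width 128.
total = gpt_param_count(vocab_size=4096, n_embd=128, n_layer=4)
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
```

With a vocabulary of 4096, the embedding table alone accounts for about 0.52M of the total under this guess, so the tokenizer size is a large share of a model this small.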

Generates simple children's stories ("Once upon a time, there was a little girl named Lily..."). See the workshop's chapter 1 notebook for loading/usage.
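
For a quick look at the two files outside the notebook, a minimal sketch follows. It assumes that ckpt.pt is a PyTorch checkpoint dict with a `model_args` key and that tokenizer.json is in the Hugging Face `tokenizers` format; neither is confirmed by this card. The function name `inspect_release` is made up for illustration. Building the model itself additionally requires nanoGPT's model definition, which the workshop notebook covers.

```python
from pathlib import Path


def inspect_release(ckpt_path: str = "ckpt.pt", tok_path: str = "tokenizer.json"):
    """Return (model_args, vocab_size, sample_ids) for the released files."""
    # Fail early with a clear error if either file is missing.
    for path in (ckpt_path, tok_path):
        if not Path(path).is_file():
            raise FileNotFoundError(path)

    # Heavy dependencies are imported lazily, after the file check.
    import torch
    from tokenizers import Tokenizer

    # Assumption: the checkpoint is a dict that holds a "model_args" entry.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # Assumption: tokenizer.json was saved by the Hugging Face tokenizers library.
    tok = Tokenizer.from_file(tok_path)

    sample_ids = tok.encode("Once upon a time").ids
    return ckpt["model_args"], tok.get_vocab_size(), sample_ids


if __name__ == "__main__":
    model_args, vocab_size, sample_ids = inspect_release()
    print("model_args:", model_args)
    print("vocab size:", vocab_size)
    print("sample ids:", sample_ids)
```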
