---
language: en
license: mit
tags:
- nanogpt
- text-generation
- character-level
- tinystories
- pytorch
pipeline_tag: text-generation
---

# 🧠 nanoGPT: TinyStories Character-Level Model

A compact character-level GPT model trained on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset using [Karpathy's nanoGPT](https://github.com/karpathy/nanoGPT).

## Model Details

| Parameter | Value |
|-----------|-------|
| Parameters | 2,723,712 |
| Layers | 6 |
| Heads | 6 |
| Embedding Dim | 192 |
| Context Length | 256 |
| Vocab Size | 93 (character-level) |
| Training Iters | 2000 |
| dtype | float16 |

## Usage

```python
import json

import torch

from char_tokenizer import encode, decode
from model import GPTConfig, GPT

# Load config
with open('config.json') as f:
    cfg = json.load(f)

# Build model
conf = GPTConfig(
    vocab_size=cfg['vocab_size'],
    block_size=cfg['block_size'],
    n_layer=cfg['n_layer'],
    n_head=cfg['n_head'],
    n_embd=cfg['n_embd'],
    dropout=cfg['dropout'],
    bias=cfg['bias'],
)
model = GPT(conf)
model.load_state_dict(torch.load('pytorch_model.bin', map_location='cpu'))
model.eval()

# Tokenize & generate
prompt = "Once upon a time"
ids = torch.tensor([encode(prompt)], dtype=torch.long)
out = model.generate(ids, max_new_tokens=200, temperature=0.8, top_k=40)
print(decode(out[0].tolist()))
```

## Training

Trained on Google Colab (T4 GPU) for ~10 minutes.

Dataset: First ~20 MB of TinyStories-V2-GPT4-train.

## License

MIT
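
## Parameter Count Check

The parameter count in the Model Details table can be reproduced from the other table values. The sketch below is not part of the released code: `nanogpt_param_count` is a hypothetical helper, and it assumes the standard nanoGPT layout (token embedding shared with the output head, learned position embeddings, two LayerNorms per block plus a final one) with `bias=False`. The actual `bias` setting lives in `config.json` and is not stated in this card.

```python
def nanogpt_param_count(n_layer, n_embd, vocab_size, block_size, bias=False):
    """Count parameters of a nanoGPT-style model (hypothetical helper).

    Assumes the token embedding is tied to the output head, so it is
    counted once, and that position embeddings are included in the total.
    """
    tok_emb = vocab_size * n_embd           # token embedding (shared with lm_head)
    pos_emb = block_size * n_embd           # learned position embedding
    ln = n_embd * (2 if bias else 1)        # LayerNorm weight (+ optional bias)

    attn = n_embd * 3 * n_embd + n_embd * n_embd      # qkv projection + output projection
    mlp = n_embd * 4 * n_embd + 4 * n_embd * n_embd   # up projection + down projection
    if bias:
        attn += 3 * n_embd + n_embd
        mlp += 4 * n_embd + n_embd

    block = 2 * ln + attn + mlp             # one transformer block
    return tok_emb + pos_emb + n_layer * block + ln   # trailing ln is the final LayerNorm


total = nanogpt_param_count(n_layer=6, n_embd=192, vocab_size=93, block_size=256)
print(f"{total:,}")
```

Under these assumptions the result matches the 2,723,712 figure in the table. Note that nanoGPT's own `get_num_params()` excludes position embeddings by default, so it would report a smaller number for the same model.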