---
language: en
license: mit
tags:
- nanogpt
- text-generation
- character-level
- tinystories
- pytorch
pipeline_tag: text-generation
---

# 🧠 nanoGPT: TinyStories Character-Level Model

A compact character-level GPT model trained on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset using [Karpathy's nanoGPT](https://github.com/karpathy/nanoGPT).

## Model Details

| Property | Value |
|----------|-------|
| Parameters | 2,723,712 |
| Layers | 6 |
| Heads | 6 |
| Embedding dimension | 192 |
| Context length | 256 characters |
| Vocab size | 93 (character-level) |
| Training iterations | 2,000 |
| dtype | float16 |

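As a sanity check, the parameter count in the table can be reproduced from the other rows. The sketch below assumes nanoGPT's layer layout with `bias=False`, an output head tied to the token embedding, and the learned position embeddings included in the total. These settings are assumptions, since the actual values live in `config.json`, and `count_params` is a hypothetical helper rather than part of nanoGPT.

```python
def count_params(n_layer, n_embd, vocab_size, block_size):
    """Count parameters of a nanoGPT-style model (assumes bias=False and a tied output head)."""
    attn = 4 * n_embd * n_embd       # fused q/k/v projection (3*d*d) plus output projection (d*d)
    mlp = 8 * n_embd * n_embd        # d -> 4d and 4d -> d linear layers
    norms = 2 * n_embd               # two LayerNorm weight vectors per block
    per_block = attn + mlp + norms
    token_emb = vocab_size * n_embd  # shared with the output head (weight tying)
    pos_emb = block_size * n_embd    # learned position embeddings
    final_norm = n_embd              # LayerNorm before the output head
    return n_layer * per_block + token_emb + pos_emb + final_norm


total = count_params(n_layer=6, n_embd=192, vocab_size=93, block_size=256)
print(f"{total:,}")  # 2,723,712
```

Under these assumptions the result matches the table. The number of heads does not appear in the formula because it only changes how the 192-dimensional projections are split, not their size.
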
## Usage

The snippet expects nanoGPT's `model.py`, along with `config.json`, `pytorch_model.bin`, and `char_tokenizer.py`, in the working directory.

```python
import json

import torch

from char_tokenizer import decode, encode  # character-level encode/decode helpers
from model import GPT, GPTConfig  # model.py from nanoGPT

# Load the hyperparameters stored alongside the weights
with open('config.json') as f:
    cfg = json.load(f)

# Rebuild the architecture from the config
conf = GPTConfig(
    vocab_size=cfg['vocab_size'], block_size=cfg['block_size'],
    n_layer=cfg['n_layer'], n_head=cfg['n_head'], n_embd=cfg['n_embd'],
    dropout=cfg['dropout'], bias=cfg['bias'],
)
model = GPT(conf)
model.load_state_dict(torch.load('pytorch_model.bin', map_location='cpu'))
model.eval()  # disable dropout for inference

# Encode the prompt, sample a continuation, and decode it back to text
prompt = "Once upon a time"
ids = torch.tensor([encode(prompt)], dtype=torch.long)
out = model.generate(ids, max_new_tokens=200, temperature=0.8, top_k=40)
print(decode(out[0].tolist()))
```
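
The `char_tokenizer` module is not shown here. For orientation, a character-level tokenizer typically maps each distinct character to an integer ID and back. The following is a minimal sketch under that assumption; `build_vocab` and the toy corpus are hypothetical, and the real `char_tokenizer.py` (with its 93-character vocabulary) may differ.

```python
def build_vocab(text):
    """Return (stoi, itos) lookup tables over the sorted distinct characters of text."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


# Toy corpus for illustration only; the real vocabulary would come from the training data.
corpus = "Once upon a time there was a tiny model."
stoi, itos = build_vocab(corpus)


def encode(s):
    """Map a string to a list of integer IDs (raises KeyError on unseen characters)."""
    return [stoi[ch] for ch in s]


def decode(ids):
    """Map a list of integer IDs back to a string."""
    return "".join(itos[i] for i in ids)


ids = encode("Once upon a time")
print(decode(ids))
```

With a scheme like this, any character outside the vocabulary cannot be encoded, so prompts should stick to characters that occur in the training text.
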

## Training

Trained on a Google Colab T4 GPU for about 10 minutes.

Dataset: the first ~20 MB of TinyStories-V2-GPT4-train.

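nanoGPT reads training settings from a Python config file whose variables override the defaults in `train.py`. A sketch of what such a file might look like for this run follows. Only the values listed in the Model Details table are taken from this card; the file name and the `dataset` directory are hypothetical, and settings not stated here (batch size, learning rate, dropout, and so on) are deliberately left out.

```python
# config/train_tinystories_char.py  (hypothetical file name)

# Hypothetical data directory holding the character-level train/val splits
dataset = 'tinystories_char'

# Architecture, taken from the Model Details table
n_layer = 6
n_head = 6
n_embd = 192
block_size = 256

# Training length, taken from the Model Details table
max_iters = 2000

# Matches the dtype row in the table (assumed here to be the training dtype);
# the T4 has no native bfloat16 support, so float16 is the usual choice on it
dtype = 'float16'
```

A file like this would be passed to the training script as `python train.py config/train_tinystories_char.py`.
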
## License

MIT