---
language: en
license: mit
tags:
  - nanogpt
  - text-generation
  - character-level
  - tinystories
  - pytorch
pipeline_tag: text-generation
---

# 🧠 nanoGPT — TinyStories Character-Level Model

A compact character-level GPT model trained on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset using [Karpathy's nanoGPT](https://github.com/karpathy/nanoGPT).

## Model Details

| Parameter | Value |
|-----------|-------|
| Parameters | 2,723,712 |
| Layers | 6 |
| Heads | 6 |
| Embedding Dim | 192 |
| Context Length | 256 |
| Vocab Size | 93 (character-level) |
| Training Iters | 2000 |
| dtype | float16 |
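
The parameter count in the table can be reproduced from the other rows. The sketch below assumes the standard nanoGPT layout: a 4x MLP expansion, token embeddings tied to the output head, and `bias=False` (so linear layers have no bias and each LayerNorm has only a weight vector). The helper name `count_gpt_params` is hypothetical and is not part of nanoGPT.

```python
def count_gpt_params(vocab_size, block_size, n_layer, n_embd):
    """Count parameters of a nanoGPT-style model (assumes bias=False, tied embeddings)."""
    wte = vocab_size * n_embd                         # token embeddings (shared with lm_head)
    wpe = block_size * n_embd                         # learned position embeddings
    attn = n_embd * 3 * n_embd + n_embd * n_embd      # fused QKV projection + output projection
    mlp = 2 * n_embd * 4 * n_embd                     # up-projection + down-projection
    norms = 2 * n_embd                                # two LayerNorm weight vectors per block
    final_norm = n_embd                               # LayerNorm before the output head
    return wte + wpe + n_layer * (attn + mlp + norms) + final_norm


total = count_gpt_params(vocab_size=93, block_size=256, n_layer=6, n_embd=192)
print(f"{total:,}")  # 2,723,712
```

Under these assumptions the total includes the position embeddings, which nanoGPT's own startup log excludes by default, so the number printed during training may be slightly lower.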

## Usage

```python
import json

import torch

from char_tokenizer import decode, encode  # character tokenizer used by this model
from model import GPT, GPTConfig  # model.py from the nanoGPT repository

# Load the model hyperparameters
with open('config.json') as f:
    cfg = json.load(f)

# Build the model from the saved config
conf = GPTConfig(
    vocab_size=cfg['vocab_size'], block_size=cfg['block_size'],
    n_layer=cfg['n_layer'], n_head=cfg['n_head'], n_embd=cfg['n_embd'],
    dropout=cfg['dropout'], bias=cfg['bias']
)
model = GPT(conf)

# Load the weights on CPU and switch to inference mode
state_dict = torch.load('pytorch_model.bin', map_location='cpu')
model.load_state_dict(state_dict)
model.eval()

# Encode the prompt, sample 200 new characters, and decode the result
prompt = "Once upon a time"
ids = torch.tensor([encode(prompt)], dtype=torch.long)
out = model.generate(ids, max_new_tokens=200, temperature=0.8, top_k=40)
print(decode(out[0].tolist()))
```
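
The `char_tokenizer` module imported above is not shown in this card. As a rough illustration of what character-level `encode` and `decode` functions usually look like, here is a minimal sketch in the style of nanoGPT's character-level data preparation. The helpers `build_vocab` and `make_codec` and the sample text are hypothetical; the real tokenizer uses the model's own 93-character vocabulary.

```python
def build_vocab(text):
    """Map each distinct character to an integer id, in sorted order."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def make_codec(stoi, itos):
    """Return encode/decode functions bound to a fixed vocabulary."""
    def encode(s):
        # Raises KeyError for characters that are not in the vocabulary
        return [stoi[c] for c in s]

    def decode(ids):
        return "".join(itos[i] for i in ids)

    return encode, decode


# Toy corpus for illustration only
sample = "Once upon a time, there was a tiny model."
stoi, itos = build_vocab(sample)
encode, decode = make_codec(stoi, itos)

ids = encode("Once upon a time")
print(decode(ids))  # Once upon a time
```

With a tokenizer of this kind, a prompt containing a character outside the training vocabulary fails at encode time, so prompts should stick to characters that appear in the training data.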

## Training

Trained for 2,000 iterations on a Google Colab T4 GPU in about 10 minutes.

Dataset: the first ~20 MB of `TinyStories-V2-GPT4-train`.
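
The hyperparameters listed under Model Details map onto nanoGPT's config-file overrides roughly as follows. This is a sketch rather than the exact config used for this model: the file name is hypothetical, `bias = False` is an assumption consistent with the reported parameter count, and reading the `dtype` row as the training precision is also an assumption. Settings this card does not state (batch size, learning rate, dropout) are left out so that nanoGPT's defaults apply.

```python
# Hypothetical nanoGPT config file, e.g. config/train_tinystories_char.py

# Model size (from the Model Details table)
n_layer = 6
n_head = 6
n_embd = 192
block_size = 256   # context length in characters

# Training length (from the Model Details table)
max_iters = 2000

# Assumptions, not stated explicitly in this card
bias = False       # consistent with the reported parameter count
dtype = 'float16'  # the T4 has no native bfloat16 support
```

A file like this would typically be passed to nanoGPT as `python train.py config/train_tinystories_char.py`, again using the hypothetical file name.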

## License

MIT