A minimal GPT language model trained from scratch using only Python's standard library (no PyTorch/TensorFlow).
This model implements the core GPT architecture:
| Parameter | Value |
|---|---|
| Layers | 6 |
| Embedding Dimension | 192 |
| Attention Heads | 6 |
| Context Length | 256 |
| Vocabulary Size | 77 |
| Total Parameters | 112,256 |
This model uses a custom pure-Python implementation. See the repository for the complete code.
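To give a flavor of what a standard-library-only GPT component looks like, here is a minimal sketch of single-head scaled dot-product attention over plain Python lists. The function names and structure are illustrative assumptions, not the repository's actual code.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Single-head scaled dot-product attention; Q, K, V are lists of vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of the query with every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

The real model additionally needs causal masking, multiple heads, and learned projection matrices, but the inner loop above is the core computation.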
```python
# Load and generate
from model import gpt, generate

text = generate(prompt="Once upon a time", temperature=0.7)
print(text)
```
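The `temperature` argument scales the logits before the softmax: values below 1.0 sharpen the distribution toward the most likely characters, values above 1.0 flatten it. A minimal sketch of how such sampling typically works (the `sample` function and example logits are illustrative assumptions, not the repository's code):

```python
import math
import random

def sample(logits, temperature=1.0):
    """Sample a token index from temperature-scaled logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index, weighted by the softmax probabilities.
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

# Lower temperature concentrates probability mass on the largest logit.
idx = sample([2.0, 1.0, 0.1], temperature=0.7)
```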
## Limitations

- Small model size (educational purposes)
- Character-level tokenization (not BPE)
- Limited training data and compute
- Pure-Python inference is slow
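Character-level tokenization assigns each distinct character in the training text its own integer id, which is why the vocabulary has only 77 entries. A minimal sketch of how such a tokenizer is typically built (variable names are illustrative, not from the repository):

```python
# Build a character-level vocabulary from a corpus string.
corpus = "hello world"
chars = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> id
itos = {i: ch for ch, i in stoi.items()}      # id -> char

def encode(text):
    return [stoi[ch] for ch in text]

def decode(ids):
    return "".join(itos[i] for i in ids)
```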
## License

Apache 2.0

## Acknowledgments

Based on Andrej Karpathy's educational implementations (micrograd, makemore, nanoGPT).