TinyGPT StoryGPT
TinyGPT StoryGPT is a small GPT-style decoder-only transformer trained from scratch for children's story generation, with an instruction-tuned checkpoint for simple prompt-following.
This is a custom PyTorch model, not a Hugging Face Transformers GPT-2 checkpoint. Use the included model.py, tokenizer.py, and storyGPT.py files to load and run it.
Model Details
- Architecture: decoder-only transformer
- Vocabulary size: 8000
- Context window: 512 tokens
- Embedding size: 512
- Attention heads: 8
- Layers: 8
- Parameters: 29,541,376
- Tokenizer: custom BPE tokenizer
- Base model file: story_model.pth
- Instruction model file: instruct_checkpoints/instruct_model.pth
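As a sanity check on the figures above, the parameter count can be reproduced with simple arithmetic. This is a sketch under assumptions that the list does not state and that model.py may not match: learned positional embeddings, bias-free linear layers, a 4x MLP expansion, two LayerNorms per block plus a final one (each with weight and bias), and an output head that shares weights with the token embedding. The function name `count_parameters` is hypothetical.

```python
def count_parameters(vocab_size=8000, context=512, d_model=512,
                     n_layers=8, mlp_ratio=4):
    """Estimate the parameter count of a GPT-style decoder.

    Assumptions (not confirmed by the repository): learned positional
    embeddings, no biases in linear layers, LayerNorm with weight and
    bias, and an output head tied to the token embedding.
    """
    token_embedding = vocab_size * d_model
    position_embedding = context * d_model

    # Q, K, V, and output projections, each d_model x d_model.
    attention = 4 * d_model * d_model
    # Up-projection and down-projection of the feed-forward block.
    mlp = 2 * d_model * (mlp_ratio * d_model)
    # Two LayerNorms per block, each with a weight and a bias vector.
    block_norms = 2 * 2 * d_model

    per_block = attention + mlp + block_norms
    final_norm = 2 * d_model

    return (token_embedding + position_embedding
            + n_layers * per_block + final_norm)


print(f"{count_parameters():,}")
```

Under these assumptions the total matches the 29,541,376 parameters listed above. The number of attention heads does not change the count, because the heads split the same projection matrices.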
The instruction checkpoint metadata reports epoch 8, loss 1.1102, vocabulary size 8000, and context window 512.
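One way to read such metadata yourself is to load the checkpoint on the CPU and list its scalar entries. The sketch below assumes the checkpoint is a Python dictionary that stores metadata as top-level scalar values next to the weights; the actual layout and key names are not documented here, so `summarize_metadata` and `load_metadata` are hypothetical helpers, not part of the repository.

```python
def summarize_metadata(checkpoint):
    """Return the top-level scalar entries of a checkpoint dictionary.

    Tensors, state dicts, and other nested objects are skipped.
    """
    scalar_types = (int, float, str, bool)
    return {key: value for key, value in checkpoint.items()
            if isinstance(value, scalar_types)}


def load_metadata(path="instruct_checkpoints/instruct_model.pth"):
    """Load a checkpoint on the CPU and summarize its scalar metadata.

    Assumes the file holds a dictionary; returns an empty dict otherwise.
    """
    import torch  # imported lazily so summarize_metadata works without PyTorch

    checkpoint = torch.load(path, map_location="cpu")
    if not isinstance(checkpoint, dict):
        return {}
    return summarize_metadata(checkpoint)


# Example (requires PyTorch and the checkpoint file):
# print(load_metadata())
```

If the metadata is nested inside another entry, such as a configuration dictionary, this helper will not show it and the checkpoint needs to be inspected by hand.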
Usage
pip install -r requirements.txt
python storyGPT.py
For API serving:
pip install -r requirements.txt
python api.py
Example request:
curl -X POST http://localhost:8000/generate \
-H "Content-Type: application/json" \
-d '{"prompt":"Write a bedtime story about a brave little star.","max_tokens":120}'
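The same request can be sent from Python using only the standard library. This is a minimal sketch: the URL and the `prompt` and `max_tokens` fields come from the curl example above, while the helper names `build_request` and `generate` are hypothetical, and the assumption that the server replies with JSON is not confirmed by this document.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/generate"


def build_request(prompt, max_tokens=120, url=API_URL):
    """Build a POST request with the same JSON body as the curl example."""
    payload = json.dumps({"prompt": prompt, "max_tokens": max_tokens})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, max_tokens=120, url=API_URL, timeout=60):
    """Send the request and decode the reply, assuming a JSON response."""
    request = build_request(prompt, max_tokens, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires the server started with `python api.py`):
# print(generate("Write a bedtime story about a brave little star."))
```

Calling `generate` only works while the API server is running. The response fields are not listed here, so print the full reply first to see its structure.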
Files
- model.py - transformer model definition
- tokenizer.py - custom BPE tokenizer
- tokenizer.json - trained tokenizer vocabulary and merges
- storyGPT.py - interactive CLI generation
- api.py - Flask API server
- story_model.pth - base story model checkpoint
- instruct_checkpoints/instruct_model.pth - instruction-tuned checkpoint
Limitations
This is an experimental small model. It can produce repetition, factual errors, malformed formatting, or unsafe/unwanted text. Review outputs before using them with children or public audiences.
Training Data
The repository contains code for base training and instruction fine-tuning. The public upload excludes large/local training corpora and intermediate checkpoints by default.
License
No final license has been selected in this scaffold. Choose a license only after confirming that your training data and assets are compatible with public release.