---
license: other
library_name: pytorch
pipeline_tag: text-generation
tags:
- text-generation
- story-generation
- gpt
- pytorch
- custom-code
language:
- en
---

# TinyGPT StoryGPT

TinyGPT StoryGPT is a small GPT-style decoder-only transformer trained from scratch for children's story generation, with an instruction-tuned checkpoint for simple prompt-following.

This is a custom PyTorch model, not a Hugging Face Transformers GPT-2 checkpoint. Use the included `model.py`, `tokenizer.py`, and `storyGPT.py` files to load and run it.

## Model Details

- Architecture: decoder-only transformer
- Vocabulary size: 8000
- Context window: 512 tokens
- Embedding size: 512
- Attention heads: 8
- Layers: 8
- Parameters: 29,541,376
- Tokenizer: custom BPE tokenizer
- Base model file: `story_model.pth`
- Instruction model file: `instruct_checkpoints/instruct_model.pth`

The instruction checkpoint metadata reports epoch 8, loss `1.1102`, vocabulary size `8000`, and context window `512`.
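
The parameter count listed above can be reproduced from the other hyperparameters with some simple arithmetic. The sketch below is one breakdown that happens to match the reported total; it assumes learned positional embeddings, bias-free linear layers, a 4x MLP expansion, LayerNorm with weight and bias, and an output head tied to the token embedding. None of these details are stated in this card, so treat them as assumptions and check `model.py` for the actual layout.

```python
def count_parameters(vocab_size: int = 8000,
                     context_window: int = 512,
                     d_model: int = 512,
                     n_layers: int = 8,
                     mlp_ratio: int = 4) -> int:
    """Count parameters for an ASSUMED GPT-style layout (not confirmed by model.py)."""
    token_embedding = vocab_size * d_model
    position_embedding = context_window * d_model   # assumed: learned positions
    attention = 4 * d_model * d_model               # Q, K, V, output projections; assumed no biases
    mlp = 2 * d_model * (mlp_ratio * d_model)       # up and down projections; assumed no biases
    layer_norms = 2 * (2 * d_model)                 # two LayerNorms per block, weight + bias each
    per_block = attention + mlp + layer_norms
    final_layer_norm = 2 * d_model
    # Assumed: the output head shares weights with the token embedding,
    # so it contributes no additional parameters.
    return (token_embedding + position_embedding
            + n_layers * per_block + final_layer_norm)


print(f"{count_parameters():,}")  # 29,541,376
```

The number of attention heads does not appear in the calculation because splitting the embedding across heads does not change the size of the projection matrices.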

## Usage

Install the dependencies and start the interactive CLI:

```bash
pip install -r requirements.txt
python storyGPT.py
```

To serve the model over HTTP instead, start the API server:

```bash
pip install -r requirements.txt
python api.py
```

Example request:

```bash
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Write a bedtime story about a brave little star.","max_tokens":120}'
```
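
The same request can be sent from Python using only the standard library. This is a minimal sketch: the helper names `build_generate_request` and `generate` are hypothetical, the URL and JSON fields are copied from the curl example, and the response is assumed to be JSON because this card does not document the response schema.

```python
import json
import urllib.request
from typing import Any


def build_generate_request(prompt: str, max_tokens: int = 120,
                           base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST request for the /generate endpoint shown in the curl example."""
    payload = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, max_tokens: int = 120,
             base_url: str = "http://localhost:8000", timeout: float = 60.0) -> Any:
    """Send the request and decode the body as JSON (assumed response format)."""
    request = build_generate_request(prompt, max_tokens, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from `python api.py` running, calling `generate("Write a bedtime story about a brave little star.")` returns whatever JSON the server sends back; inspect it once to see which field holds the generated text.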

## Files

- `model.py` - transformer model definition
- `tokenizer.py` - custom BPE tokenizer
- `tokenizer.json` - trained tokenizer vocabulary and merges
- `storyGPT.py` - interactive CLI generation
- `api.py` - Flask API server
- `story_model.pth` - base story model checkpoint
- `instruct_checkpoints/instruct_model.pth` - instruction-tuned checkpoint

## Limitations

This is an experimental small model. It can produce repetition, factual errors, malformed formatting, or unsafe/unwanted text. Review outputs before using them with children or public audiences.
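
As a rough aid when reviewing outputs for repetition, a word n-gram count can flag obviously looping text. The helper below is a hypothetical sketch and not part of this repository; it only catches verbatim repeats and is no substitute for reading the output or for a safety filter.

```python
from collections import Counter
from typing import List, Tuple


def repeated_ngrams(text: str, n: int = 4, min_count: int = 2) -> List[Tuple[str, int]]:
    """Return word n-grams that occur at least `min_count` times, most frequent first."""
    words = text.lower().split()
    # Texts shorter than n words produce an empty range and therefore no n-grams.
    grams = Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return [(gram, count) for gram, count in grams.most_common() if count >= min_count]
```

For example, `repeated_ngrams("the star went up and the star went up and away")` reports that "the star went up" and "star went up and" each appear twice. The values of `n` and `min_count` are arbitrary starting points.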

## Training Data

The repository contains code for base training and instruction fine-tuning. The public upload excludes large/local training corpora and intermediate checkpoints by default.

## License

No final license has been selected for this repository, which is why the metadata lists `license: other`. A license should be chosen only after confirming that the training data and assets are compatible with public release.