TinyGPT StoryGPT

TinyGPT StoryGPT is a small GPT-style decoder-only transformer trained from scratch for children's story generation, with an instruction-tuned checkpoint for simple prompt-following.

This is a custom PyTorch model, not a Hugging Face Transformers GPT-2 checkpoint. Use the included model.py, tokenizer.py, and storyGPT.py files to load and run it.

Model Details

  • Architecture: decoder-only transformer
  • Vocabulary size: 8000
  • Context window: 512 tokens
  • Embedding size: 512
  • Attention heads: 8
  • Layers: 8
  • Parameters: 29,541,376
  • Tokenizer: custom BPE tokenizer
  • Base model file: story_model.pth
  • Instruction model file: instruct_checkpoints/instruct_model.pth
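
As a sanity check on the figures above, the reported parameter count can be reproduced with a short calculation. This is only a sketch under stated assumptions, none of which are confirmed by this card: learned position embeddings, an output head that shares weights with the token embedding, bias-free linear layers, a feed-forward width of 4x the embedding size, and two LayerNorms per block plus a final one. The actual layout in model.py may differ.

```python
# Hypothetical parameter breakdown; the layer layout is assumed, not read from model.py.
VOCAB_SIZE = 8000
CONTEXT = 512
D_MODEL = 512
N_LAYERS = 8
MLP_RATIO = 4  # assumption: feed-forward width is 4 * D_MODEL


def count_parameters() -> int:
    token_emb = VOCAB_SIZE * D_MODEL         # assumed shared with the output head
    pos_emb = CONTEXT * D_MODEL              # assumed learned position embeddings
    attn = 4 * D_MODEL * D_MODEL             # Q, K, V and output projections, no biases
    mlp = 2 * MLP_RATIO * D_MODEL * D_MODEL  # up and down projections, no biases
    norms = 2 * (2 * D_MODEL)                # two LayerNorms per block, scale and shift
    final_norm = 2 * D_MODEL                 # one LayerNorm after the last block
    return token_emb + pos_emb + N_LAYERS * (attn + mlp + norms) + final_norm


print(f"{count_parameters():,}")  # prints 29,541,376
```

Under these assumptions the total matches the listed 29,541,376. The number of attention heads does not appear in the calculation because splitting the projections into 8 heads does not change their size.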

The instruction checkpoint metadata reports epoch 8, loss 1.1102, vocabulary size 8000, and context window 512.
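
To read such metadata back, one option is to keep only the scalar entries of the loaded checkpoint dictionary. The sketch below uses a stand-in dictionary rather than the real file, and the key names (`epoch`, `loss`, `vocab_size`, `block_size`, `model_state_dict`) are hypothetical; the actual keys are whatever the training script saved.

```python
from typing import Any, Dict

SCALAR_TYPES = (int, float, str, bool)


def summarize_metadata(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return only top-level scalar entries, skipping weights and other nested state."""
    return {k: v for k, v in checkpoint.items() if isinstance(v, SCALAR_TYPES)}


# Stand-in for a loaded checkpoint; key names are hypothetical.
example = {
    "epoch": 8,
    "loss": 1.1102,
    "vocab_size": 8000,
    "block_size": 512,
    "model_state_dict": {"placeholder": "weights would live here"},
}
print(summarize_metadata(example))
```

With PyTorch installed, the dictionary would typically come from `torch.load("instruct_checkpoints/instruct_model.pth", map_location="cpu")` in place of `example`.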

Usage

pip install -r requirements.txt
python storyGPT.py

For API serving:

pip install -r requirements.txt
python api.py

Example request:

curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Write a bedtime story about a brave little star.","max_tokens":120}'
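
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above; the helper names `build_request` and `generate` are illustrative, and because the response format of api.py is not documented here, the body is returned as raw text rather than parsed.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/generate"  # same endpoint as the curl example


def build_request(prompt: str, max_tokens: int = 120) -> urllib.request.Request:
    """Build a POST request carrying the same JSON fields as the curl call."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, max_tokens: int = 120, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_request(prompt, max_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server from api.py running, `print(generate("Write a bedtime story about a brave little star."))` should print whatever the endpoint returns.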

Files

  • model.py - transformer model definition
  • tokenizer.py - custom BPE tokenizer
  • tokenizer.json - trained tokenizer vocabulary and merges
  • storyGPT.py - interactive CLI generation
  • api.py - Flask API server
  • story_model.pth - base story model checkpoint
  • instruct_checkpoints/instruct_model.pth - instruction-tuned checkpoint

Limitations

This is a small experimental model. It can produce repetitive text, factual errors, malformed formatting, or unsafe or otherwise unwanted content. Review outputs before using them with children or public audiences.

Training Data

The repository contains the code for base training and instruction fine-tuning. By default, the public upload excludes large or local-only training corpora and intermediate checkpoints.

License

No final license has been selected in this scaffold. Choose a license only after confirming that your training data and assets are compatible with public release.
