	---
	license: other
	library_name: pytorch
	pipeline_tag: text-generation
	tags:
	- text-generation
	- story-generation
	- gpt
	- pytorch
	- custom-code
	language:
	- en
	---

	# TinyGPT StoryGPT

	TinyGPT StoryGPT is a small GPT-style decoder-only transformer trained from scratch for children's story generation, with an instruction-tuned checkpoint for simple prompt-following.

	This is a custom PyTorch model, not a Hugging Face Transformers GPT-2 checkpoint. Use the included `model.py`, `tokenizer.py`, and `storyGPT.py` files to load and run it.

	## Model Details

	- Architecture: decoder-only transformer
	- Vocabulary size: 8000
	- Context window: 512 tokens
	- Embedding size: 512
	- Attention heads: 8
	- Layers: 8
	- Parameters: 29,541,376
	- Tokenizer: custom BPE tokenizer
	- Base model file: `story_model.pth`
	- Instruction model file: `instruct_checkpoints/instruct_model.pth`

	The instruction checkpoint metadata reports epoch 8, loss `1.1102`, vocabulary size `8000`, and context window `512`.
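
The parameter count listed above can be reproduced from the other listed dimensions. The sketch below is an illustration, not a reading of `model.py`: it assumes learned positional embeddings, bias-free linear layers, a feed-forward hidden size of 4x the embedding size, LayerNorms with weight and bias, and an output head tied to the token embedding. The helper name `count_parameters` is hypothetical. Under those assumptions the arithmetic lands on the listed total.

```python
def count_parameters(vocab_size, context_window, d_model, n_layers, mlp_ratio=4):
    """Estimate parameters of a GPT-style decoder under the assumptions stated above."""
    # Token embedding plus learned positional embedding.
    # Assumption: the output head shares weights with the token embedding.
    embeddings = vocab_size * d_model + context_window * d_model
    # Attention: Q, K, V and output projections (assumed bias-free).
    attention = 4 * d_model * d_model
    # Feed-forward: two linear layers with hidden size mlp_ratio * d_model (assumed bias-free).
    mlp = 2 * mlp_ratio * d_model * d_model
    # Two LayerNorms per block, each with a weight and a bias vector.
    block_norms = 2 * 2 * d_model
    # One final LayerNorm before the output head.
    final_norm = 2 * d_model
    return embeddings + n_layers * (attention + mlp + block_norms) + final_norm


total = count_parameters(vocab_size=8000, context_window=512, d_model=512, n_layers=8)
print(f"{total:,}")  # prints 29,541,376
```

The number of attention heads does not appear in the calculation because splitting the embedding into 8 heads changes how the projections are reshaped, not how many weights they hold.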

	## Usage

	```bash
	pip install -r requirements.txt
	python storyGPT.py
	```

	For API serving:

	```bash
	pip install -r requirements.txt
	python api.py
	```

	Example request:

	```bash
	curl -X POST http://localhost:8000/generate \
	-H "Content-Type: application/json" \
	-d '{"prompt":"Write a bedtime story about a brave little star.","max_tokens":120}'
	```
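
The same request can be sent from Python using only the standard library. This is a minimal sketch: the `/generate` path and the `prompt` and `max_tokens` fields come from the curl example above, while the function names `build_generate_request` and `generate` are hypothetical. The response format is not documented here, so the sketch returns the raw response body as text rather than assuming a JSON schema.

```python
import json
import urllib.request


def build_generate_request(prompt, max_tokens=120, base_url="http://localhost:8000"):
    """Build a POST request mirroring the curl example."""
    payload = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, max_tokens=120, base_url="http://localhost:8000", timeout=60):
    """Send the request and return the raw response body as text."""
    request = build_generate_request(prompt, max_tokens, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With `python api.py` running locally, calling `print(generate("Write a bedtime story about a brave little star."))` should print whatever the server returns.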

	## Files

	- `model.py` - transformer model definition
	- `tokenizer.py` - custom BPE tokenizer
	- `tokenizer.json` - trained tokenizer vocabulary and merges
	- `storyGPT.py` - interactive CLI generation
	- `api.py` - Flask API server
	- `story_model.pth` - base story model checkpoint
	- `instruct_checkpoints/instruct_model.pth` - instruction-tuned checkpoint

	## Limitations

	This is a small, experimental model. It can produce repetitive text, factual errors, malformed formatting, and unsafe or otherwise unwanted content. Review outputs before using them with children or public audiences.
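
As a rough aid for the review step, repetition can be flagged automatically before a human reads the output. The sketch below is an assumption-laden heuristic and not part of this repository: the helper `repeated_ngram_fraction` is hypothetical, and it only measures how many word trigrams duplicate an earlier one. It does not detect factual errors or unsafe content.

```python
from collections import Counter


def repeated_ngram_fraction(text, n=3):
    """Return the fraction of word n-grams that duplicate an earlier n-gram."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


looping = "the star was bright the star was bright the star was bright"
varied = "a brave little star flew home"
print(repeated_ngram_fraction(looping))  # prints 0.6
print(repeated_ngram_fraction(varied))   # prints 0.0
```

A high score suggests the generation got stuck in a loop. Any cut-off value is a judgment call and would need tuning on real outputs.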

	## Training Data

	The repository contains code for base training and instruction fine-tuning. By default, the public upload excludes large or local-only training corpora and intermediate checkpoints.

	## License

	No final license has been selected in this scaffold. Choose a license only after confirming that your training data and assets are compatible with public release.