---
license: mit
language:
- en
tags:
- text-generation
- gpt2
- stories
- children-stories
- tinystories
datasets:
- roneneldan/TinyStories
widget:
- text: "Once upon a time"
  example_title: "Story Beginning"
- text: "The little girl loved to"
  example_title: "Character Story"
- text: "In a magical forest,"
  example_title: "Fantasy Setting"
---
|
|
|
|
|
# RonMicro-LLM-Story (Phase 1)

A small GPT-2-style language model trained on the TinyStories dataset to generate children's stories.
|
|
|
|
|
## Model Details

- **Model Type:** GPT-2 causal language model
- **Parameters:** ~40M
- **Training Data:** TinyStories (5% subset, ~105K stories)
- **Vocabulary Size:** 25,913 tokens
- **Context Length:** 512 tokens
- **Training Epochs:** 3
- **Language:** English
|
|
|
|
|
## Training Details

- **Framework:** Transformers (Hugging Face)
- **Tokenizer:** Custom BPE trained on TinyStories
- **Architecture:**
  - 6 transformer layers
  - 384 embedding dimensions
  - 6 attention heads
  - 1536 FFN dimensions
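Assuming the standard Hugging Face GPT-2 implementation, the hyperparameters above map onto a `GPT2Config` roughly as follows. This is a sketch for orientation, not the exact configuration file used for training:

```python
from transformers import GPT2Config

# Assumed mapping of the listed dimensions onto GPT2Config fields.
config = GPT2Config(
    vocab_size=25_913,  # custom BPE vocabulary
    n_positions=512,    # context length
    n_layer=6,          # transformer layers
    n_embd=384,         # embedding dimensions
    n_head=6,           # attention heads
    n_inner=1536,       # FFN dimensions
)
```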
|
|
|
|
|
## Usage

```python
from transformers import pipeline

# Load the model
generator = pipeline("text-generation", model="endurasolution/ronmicro-llm-story")

# Generate a story
story = generator(
    "Once upon a time",
    max_new_tokens=150,
    temperature=0.7,
    repetition_penalty=1.3,
    no_repeat_ngram_size=3,
    do_sample=True,
)

print(story[0]["generated_text"])
```
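The `no_repeat_ngram_size=3` setting prevents any trigram from appearing twice in the output. The underlying idea can be sketched in plain Python (an illustration of the technique, not the actual `transformers` implementation):

```python
def banned_next_tokens(ids, n=3):
    """Return tokens that would complete an n-gram already present in ids."""
    if len(ids) < n - 1:
        return set()
    prefix = tuple(ids[-(n - 1):])  # the last n-1 generated tokens
    banned = set()
    # Find every earlier occurrence of the prefix and ban the token that followed it.
    for i in range(len(ids) - n + 1):
        if tuple(ids[i:i + n - 1]) == prefix:
            banned.add(ids[i + n - 1])
    return banned

print(banned_next_tokens([5, 7, 9, 5, 7], n=3))  # {9}: emitting 9 would repeat (5, 7, 9)
```

At each decoding step, the sampler sets the logits of the banned tokens to negative infinity before sampling, so the repeated n-gram can never be produced.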
|
|
|
|
|
## Example Outputs

**Prompt:** "Once upon a time"

**Output:** "Once upon a time, there was a little boy named Timmy. He loved to play with his toy cars and trucks all day long..."
|
|
|
|
|
## Limitations

- Trained on only 5% of TinyStories (Phase 1)
- May occasionally generate repetitive text
- Best suited to short children's stories (100-200 words)
- Limited to simple vocabulary and grammar
|
|
|
|
|
## Next Steps

Phase 2 training is in progress, using 20% of the data for 5 epochs to improve quality.
|
|
|
|
|
## Citation

Built using the TinyStories dataset:

```
@article{eldan2023tinystories,
  title={TinyStories: How Small Can Language Models Be and Still Speak Coherent English?},
  author={Eldan, Ronen and Li, Yuanzhi},
  journal={arXiv preprint arXiv:2305.07759},
  year={2023}
}
```
|
|
|
|
|
## License

MIT License