---
license: mit
language:
- en
tags:
- text-generation
- gpt2
- stories
- children-stories
- tinystories
datasets:
- roneneldan/TinyStories
widget:
- text: "Once upon a time"
example_title: "Story Beginning"
- text: "The little girl loved to"
example_title: "Character Story"
- text: "In a magical forest,"
example_title: "Fantasy Setting"
---
# RonMicro-LLM-Story (Phase 1)
A small GPT-2-style language model trained on the TinyStories dataset to generate children's stories.
## Model Details
- **Model Type:** GPT-2 Causal Language Model
- **Parameters:** ~40M
- **Training Data:** TinyStories (5% subset, ~105K stories)
- **Vocabulary Size:** 25,913 tokens
- **Context Length:** 512 tokens
- **Training Epochs:** 3
- **Language:** English
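Because the context window is 512 tokens, the prompt and the generated continuation must fit in that budget together. A minimal sketch of a clamping helper (the function name and constant are illustrative, not part of the model's API):

```python
# Context window of the model, per the table above.
CONTEXT_LENGTH = 512

def clamp_new_tokens(prompt_token_count: int, requested: int) -> int:
    """Hypothetical helper: return the largest max_new_tokens that still
    fits in the context window alongside the prompt."""
    remaining = CONTEXT_LENGTH - prompt_token_count
    return max(0, min(requested, remaining))
```

For example, a 400-token prompt leaves at most 112 new tokens, regardless of how many were requested.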
## Training Details
- **Framework:** Transformers (Hugging Face)
- **Tokenizer:** Custom BPE trained on TinyStories
- **Architecture:**
- 6 transformer layers
- 384 embedding dimensions
- 6 attention heads
- 1536 FFN dimensions
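These numbers follow the usual GPT-2 proportions: the FFN is 4x the embedding width, and each head attends in an equal slice of the embedding. A quick arithmetic check of the figures above:

```python
# Architecture numbers from the list above.
n_layer, n_embd, n_head, n_ffn = 6, 384, 6, 1536

# Each attention head works in an n_embd / n_head = 64-dimensional subspace.
head_dim = n_embd // n_head

# FFN width is the standard GPT-2 4x expansion of the embedding size.
ffn_ratio = n_ffn // n_embd
```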
## Usage
```python
from transformers import pipeline

# Load the model
generator = pipeline("text-generation", model="endurasolution/ronmicro-llm-story")

# Generate a story
story = generator(
    "Once upon a time",
    max_new_tokens=150,
    temperature=0.7,
    repetition_penalty=1.3,
    no_repeat_ngram_size=3,
    do_sample=True,
)
print(story[0]["generated_text"])
```
## Example Outputs
**Prompt:** "Once upon a time"
**Output:** "Once upon a time, there was a little boy named Timmy. He loved to play with his toy cars and trucks all day long..."
## Limitations
- Trained on only 5% of TinyStories (Phase 1)
- May occasionally generate repetitive text
- Works best for short children's stories (100-200 words)
- Limited to simple vocabulary and grammar
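Since Phase 1 output can stop mid-sentence when `max_new_tokens` runs out, a small post-processing step can tidy the result. A sketch of one such helper (the function name is hypothetical, not part of this model or library):

```python
import re

def trim_to_last_sentence(text: str) -> str:
    """Hypothetical post-processing helper: cut generated text at the
    last complete sentence so a story does not end mid-thought."""
    # Find the last sentence-ending punctuation mark.
    matches = list(re.finditer(r"[.!?]", text))
    if not matches:
        return text  # no sentence boundary found; leave unchanged
    return text[: matches[-1].end()]
```

Applied to the pipeline output above, this would drop a trailing fragment such as "He wanted to" while keeping every complete sentence before it.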
## Next Steps
Phase 2 training is in progress, using 20% of the data and 5 epochs for improved quality.
## Citation
Built using the TinyStories dataset:
```
@article{eldan2023tinystories,
  title={TinyStories: How Small Can Language Models Be and Still Speak Coherent English?},
  author={Eldan, Ronen and Li, Yuanzhi},
  journal={arXiv preprint arXiv:2305.07759},
  year={2023}
}
```
## License
MIT License