---
license: mit
language:
- en
tags:
- text-generation
- gpt2
- stories
- children-stories
- tinystories
datasets:
- roneneldan/TinyStories
widget:
- text: "Once upon a time"
  example_title: "Story Beginning"
- text: "The little girl loved to"
  example_title: "Character Story"
- text: "In a magical forest,"
  example_title: "Fantasy Setting"
---

# RonMicro-LLM-Story (Phase 1)

A small GPT-2-style language model trained on the TinyStories dataset for generating children's stories.

## Model Details

- **Model Type:** GPT-2 Causal Language Model
- **Parameters:** ~40M
- **Training Data:** TinyStories (5% subset, ~105K stories)
- **Vocabulary Size:** 25,913 tokens
- **Context Length:** 512 tokens
- **Training Epochs:** 3
- **Language:** English

## Training Details

- **Framework:** Transformers (Hugging Face)
- **Tokenizer:** Custom BPE trained on TinyStories
- **Architecture:**
  - 6 transformer layers
  - 384 embedding dimensions
  - 6 attention heads
  - 1536 FFN dimensions

## Usage

```python
from transformers import pipeline

# Load the model
generator = pipeline("text-generation", model="endurasolution/ronmicro-llm-story")

# Generate a story
story = generator(
    "Once upon a time",
    max_new_tokens=150,
    temperature=0.7,
    repetition_penalty=1.3,
    no_repeat_ngram_size=3,
    do_sample=True,
)

print(story[0]["generated_text"])
```

## Example Outputs

**Prompt:** "Once upon a time"

**Output:** "Once upon a time, there was a little boy named Timmy. He loved to play with his toy cars and trucks all day long..."

## Limitations

- Trained on only 5% of TinyStories (Phase 1)
- May occasionally generate repetitive text
- Best suited to short children's stories (100-200 words)
- Limited to simple vocabulary and grammar

## Next Steps

Phase 2 training is in progress with 20% of the data and 5 epochs for improved quality.
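## Decoding Notes

The `repetition_penalty=1.3` setting in the usage example follows the standard rule used by Transformers for this parameter: the logit of any token already present in the context is divided by the penalty when positive and multiplied by it when negative, so previously seen tokens always become less likely. A minimal pure-Python sketch of that rule (the function name and toy logits below are illustrative, not part of this model):

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.3):
    """Penalize tokens that already appear in the generated sequence.

    Positive logits are divided by the penalty; negative logits are
    multiplied by it, so seen tokens always lose probability mass.
    """
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted

# Toy vocabulary of 4 tokens; token 2 was already generated.
logits = [1.0, -0.5, 2.6, 0.3]
penalized = apply_repetition_penalty(logits, generated_ids=[2], penalty=1.3)
print(penalized)  # token 2's logit drops from 2.6 to 2.0; others unchanged
```

Combined with `no_repeat_ngram_size=3`, which forbids repeating any 3-gram outright, this keeps the model's tendency toward repetitive phrasing in check.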
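## Post-processing Tip

Because generation stops after `max_new_tokens`, stories can end mid-sentence. One simple mitigation is to trim the output at the last sentence-ending punctuation mark; the helper below is an illustrative sketch, not part of the model's API:

```python
def trim_to_last_sentence(text):
    """Cut generated text at the last '.', '!' or '?' so a story
    does not end mid-sentence; return the text unchanged if none found."""
    cut = max(text.rfind(ch) for ch in ".!?")
    return text[: cut + 1] if cut != -1 else text

story = "Once upon a time, there was a cat. The cat saw a"
print(trim_to_last_sentence(story))  # "Once upon a time, there was a cat."
```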
## Citation

Built using the TinyStories dataset:

```
@article{eldan2023tinystories,
  title={TinyStories: How Small Can Language Models Be and Still Speak Coherent English?},
  author={Eldan, Ronen and Li, Yuanzhi},
  journal={arXiv preprint arXiv:2305.07759},
  year={2023}
}
```

## License

MIT License