---
license: mit
language:
  - en
tags:
  - text-generation
  - gpt2
  - stories
  - children-stories
  - tinystories
datasets:
  - roneneldan/TinyStories
widget:
  - text: Once upon a time
    example_title: Story Beginning
  - text: The little girl loved to
    example_title: Character Story
  - text: In a magical forest,
    example_title: Fantasy Setting
---

# RonMicro-LLM-Story (Phase 1)

A small GPT-2-style language model trained on the TinyStories dataset to generate children's stories.

## Model Details

- **Model Type:** GPT-2 Causal Language Model
- **Parameters:** ~40M
- **Training Data:** TinyStories (5% subset, ~105K stories)
- **Vocabulary Size:** 25,913 tokens
- **Context Length:** 512 tokens
- **Training Epochs:** 3
- **Language:** English

## Training Details

- **Framework:** Transformers (Hugging Face)
- **Tokenizer:** Custom BPE trained on TinyStories
- **Architecture:**
  - 6 transformer layers
  - 384 embedding dimensions
  - 6 attention heads
  - 1536 FFN dimensions
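
Assuming Hugging Face `GPT2Config`-style field names (the actual training configuration is not published here), the hyperparameters above map roughly onto the following sketch:

```python
# Hyperparameters from this model card, expressed as a plain dict
# using GPT2Config-style field names. This is a reconstruction for
# illustration, not the published training config.
config = {
    "vocab_size": 25913,  # custom BPE tokenizer
    "n_positions": 512,   # context length
    "n_embd": 384,        # embedding dimension
    "n_layer": 6,         # transformer layers
    "n_head": 6,          # attention heads
    "n_inner": 1536,      # FFN dimension, 4 * n_embd (the usual GPT-2 ratio)
}

# Each attention head operates on 384 / 6 = 64 dimensions.
head_dim = config["n_embd"] // config["n_head"]
```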

## Usage

```python
from transformers import pipeline

# Load the model
generator = pipeline("text-generation", model="endurasolution/ronmicro-llm-story")

# Generate a story
story = generator(
    "Once upon a time",
    max_new_tokens=150,
    temperature=0.7,
    repetition_penalty=1.3,
    no_repeat_ngram_size=3,
    do_sample=True,
)

print(story[0]["generated_text"])
```
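
The `repetition_penalty=1.3` argument discourages the model from reusing tokens it has already produced. A minimal sketch of the idea, following the scheme Transformers uses (divide positive logits by the penalty, multiply negative ones) rather than the library's actual implementation:

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.3):
    """Penalize logits of tokens that already appear in the output:
    positive logits are divided by the penalty, negative ones multiplied,
    so previously generated tokens become less likely either way."""
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

logits = [2.0, -1.0, 0.5, 0.0]
penalized = apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=1.3)
# Token 0's positive logit shrinks; token 1's negative logit grows more negative;
# tokens 2 and 3 were not generated, so they are untouched.
```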

## Example Outputs

**Prompt:** "Once upon a time"

**Output:** "Once upon a time, there was a little boy named Timmy. He loved to play with his toy cars and trucks all day long..."

## Limitations

- Trained on only 5% of TinyStories (Phase 1)
- May occasionally generate repetitive text
- Best for short children's stories (100-200 words)
- Limited to simple vocabulary and grammar
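
Because Phase 1 output can trail off mid-sentence or repeat, it can help to truncate generated text at the last complete sentence. A small hypothetical post-processing helper (the function name is ours, not part of the model or library):

```python
def trim_to_last_sentence(text: str) -> str:
    """Truncate text after the last sentence-ending punctuation mark,
    leaving it unchanged if no such mark is found."""
    cut = max(text.rfind(p) for p in (".", "!", "?"))
    return text[: cut + 1] if cut != -1 else text

print(trim_to_last_sentence("Timmy smiled. He ran home. Then he"))
# → Timmy smiled. He ran home.
```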

## Next Steps

Phase 2 training is in progress, using 20% of the data and 5 epochs for improved quality.

## Citation

Built using the TinyStories dataset:

```bibtex
@article{eldan2023tinystories,
  title={TinyStories: How Small Can Language Models Be and Still Speak Coherent English?},
  author={Eldan, Ronen and Li, Yuanzhi},
  journal={arXiv preprint arXiv:2305.07759},
  year={2023}
}
```

## License

MIT License