RonMicro-LLM-Story (Phase 1)

A small GPT-2 style language model trained on the TinyStories dataset for generating short children's stories.

Model Details

  • Model Type: GPT-2 Causal Language Model
  • Parameters: ~21M (20.8M in the released F32 safetensors checkpoint)
  • Training Data: TinyStories (5% subset, ~105K stories)
  • Vocabulary Size: 25,913 tokens
  • Context Length: 512 tokens
  • Training Epochs: 3
  • Language: English
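
The ~21M figure can be sanity-checked against the architecture listed under Training Details. A back-of-the-envelope estimate in Python, assuming tied input/output embeddings and the standard GPT-2 layout (biases and layer norms omitted as negligible):

# Rough parameter count; assumes tied embeddings and standard GPT-2 layout
vocab, d, layers, ffn, ctx = 25_913, 384, 6, 1536, 512

embeddings  = vocab * d + ctx * d   # token + position embeddings
attention   = 4 * d * d             # Q, K, V and output projections per layer
feedforward = 2 * d * ffn           # up- and down-projection per layer

total = embeddings + layers * (attention + feedforward)
print(f"~{total / 1e6:.1f}M parameters")  # ~20.8M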

Training Details

  • Framework: Transformers (Hugging Face)
  • Tokenizer: Custom BPE trained on TinyStories (see the sketch after this section)
  • Architecture:
    • 6 transformer layers
    • 384 embedding dimensions
    • 6 attention heads
    • 1536 FFN dimensions
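
These hyperparameters map directly onto a GPT-2 configuration in transformers. A minimal sketch; the field names are the standard GPT2Config ones, not copied from the repository:

from transformers import GPT2Config, GPT2LMHeadModel

# Config matching the architecture listed above
config = GPT2Config(
    vocab_size=25_913,
    n_positions=512,   # context length
    n_embd=384,        # embedding dimensions
    n_layer=6,         # transformer layers
    n_head=6,          # attention heads
    n_inner=1536,      # FFN dimensions
)
model = GPT2LMHeadModel(config)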

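The custom BPE tokenizer is not described beyond its vocabulary size. One way to reproduce a similar one with the tokenizers library, assuming a byte-level BPE like GPT-2's; here stories is a hypothetical iterable of raw TinyStories texts, and the end-of-text token is an assumed GPT-2 convention:

from tokenizers import ByteLevelBPETokenizer

# Hypothetical: `stories` yields raw TinyStories training texts
tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(
    stories,
    vocab_size=25_913,
    special_tokens=["<|endoftext|>"],  # assumed GPT-2 style EOS token
)
tokenizer.save_model("ronmicro-tokenizer")
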
Usage

from transformers import pipeline

# Load the model
generator = pipeline("text-generation", model="endurasolution/ronmicro-llm-story")

# Generate a story
story = generator(
    "Once upon a time",
    max_new_tokens=150,
    temperature=0.7,
    repetition_penalty=1.3,
    no_repeat_ngram_size=3,
    do_sample=True
)

print(story[0]["generated_text"])
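
For more control over decoding, the same sampling settings work with the lower-level model and tokenizer classes. A minimal sketch:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("endurasolution/ronmicro-llm-story")
tokenizer = AutoTokenizer.from_pretrained("endurasolution/ronmicro-llm-story")

# Same sampling settings as the pipeline example above
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    temperature=0.7,
    repetition_penalty=1.3,
    no_repeat_ngram_size=3,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))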

Example Outputs

Prompt: "Once upon a time"

Output: "Once upon a time, there was a little boy named Timmy. He loved to play with his toy cars and trucks all day long..."

Limitations

  • Trained on only 5% of TinyStories (Phase 1)
  • May generate repetitive text occasionally
  • Best for short children's stories (100-200 words)
  • Limited to simple vocabulary and grammar

Next Steps

Phase 2 training is in progress, using a 20% subset of TinyStories and 5 epochs for improved quality.

Citation

Built using the TinyStories dataset:

@article{eldan2023tinystories,
  title={TinyStories: How Small Can Language Models Be and Still Speak Coherent English?},
  author={Eldan, Ronen and Li, Yuanzhi},
  journal={arXiv preprint arXiv:2305.07759},
  year={2023}
}

License

MIT License
