# TinyStories 30M - Instruct
A 30M parameter causal language model fine-tuned for instruction-following on the TinyStoriesInstruct dataset.
## Model Details
- Parameters: ~30M
- Architecture: LLaMA-style transformer
- Base Model: TinyStories 30M pre-trained
- Fine-tuning: SFT on TinyStoriesInstruct (~117K examples, ~4 epochs)
- Training Time: ~2 hours on single GPU
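
If you want to verify the parameter count locally, the following snippet is a minimal sketch (it assumes the `transformers` library and the repo id used in the Usage section below):

```python
from transformers import AutoModelForCausalLM

# Quick sanity check of the ~30M parameter count.
model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-instruct")
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```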
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-instruct")
tokenizer = AutoTokenizer.from_pretrained("Raising-an-llm/tinystories-30m-instruct")

# Prompt format: instruction + "\n\nStory:\n"
prompt = "Write a short story using these words: brave, forest, magical."
formatted = prompt + "\n\nStory:\n"

inputs = tokenizer(formatted, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Prompt Formats
The model was trained on three types of prompts:
- Using words: `Write a short story using these words: [word1], [word2], [word3].`
- From summary: `Write a story based on this summary: [summary]`
- With features: `Write a short story with these features: [feature1], [feature2].`
Always append `\n\nStory:\n` after your prompt, as in the sketch below.
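
As a convenience, you can wrap the three templates in a small helper. The `build_prompt` function and its template strings below are illustrative only, not part of the model repo:

```python
# Illustrative helper that builds each of the three prompt styles
# and appends the required "\n\nStory:\n" suffix.
def build_prompt(kind: str, content: str) -> str:
    templates = {
        "words": "Write a short story using these words: {}.",
        "summary": "Write a story based on this summary: {}",
        "features": "Write a short story with these features: {}.",
    }
    return templates[kind].format(content) + "\n\nStory:\n"

print(build_prompt("words", "brave, forest, magical"))
print(build_prompt("summary", "A fox helps a lost bird find its way home"))
```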
## Training Details
- Dataset: TinyStoriesInstruct (~117K unique examples)
- Epochs: ~4
- Batch Size: 4
- Learning Rate: 5e-5
- Method: SFT with response masking (loss is computed only on the story tokens, not the instruction; see the sketch below)
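
The following is a minimal sketch of how response masking is commonly implemented with Hugging Face tokenizers: prompt tokens get a label of -100 so they are ignored by the loss, and only the story tokens contribute gradients. The exact training script and hyperparameters for this model may differ.

```python
import torch

# Illustrative response-masking example builder (not the actual training code).
def build_masked_example(tokenizer, instruction: str, story: str, max_length: int = 512):
    prompt = instruction + "\n\nStory:\n"
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    story_ids = tokenizer(story + tokenizer.eos_token, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + story_ids)[:max_length]
    # -100 masks the instruction tokens from the cross-entropy loss.
    labels = ([-100] * len(prompt_ids) + story_ids)[:max_length]

    return {
        "input_ids": torch.tensor(input_ids),
        "labels": torch.tensor(labels),
    }
```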
## Related Models
- `Raising-an-llm/tinystories-30m-base` - base pre-trained model