TinyStories 30M - Instruct

A 30M-parameter causal language model fine-tuned for instruction following on the TinyStoriesInstruct dataset.

Model Details

  • Parameters: ~30M (26.2M in the released F32 safetensors checkpoint)
  • Architecture: LLaMA-style transformer
  • Base Model: TinyStories 30M pre-trained
  • Fine-tuning: SFT on TinyStoriesInstruct (~117K examples, ~4 epochs)
  • Training Time: ~2 hours on a single GPU
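
The architecture family and parameter count can be checked directly from the published checkpoint. The snippet below is an illustrative sketch (not part of the card) that reuses the repo id from the Usage section:

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Raising-an-llm/tinystories-30m-instruct")
print(config.model_type)  # architecture family as stored in the checkpoint config

model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-instruct")
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")  # count every weight tensor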

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-instruct")
tokenizer = AutoTokenizer.from_pretrained("Raising-an-llm/tinystories-30m-instruct")

# Format: prompt + "\n\nStory:\n"
prompt = "Write a short story using these words: brave, forest, magical."
formatted = prompt + "\n\nStory:\n"

inputs = tokenizer(formatted, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.8,
    top_p=0.9,
    do_sample=True
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
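
The decoded output echoes the prompt before the generated story. A simple post-processing step (a sketch, not part of the model's API) is to keep only what follows the "Story:" marker:

full_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
# Drop the echoed instruction; everything after the marker is the generated story.
story = full_text.split("Story:\n", 1)[-1].strip()
print(story)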

Prompt Formats

The model was trained on three types of prompts:

  1. Using words: Write a short story using these words: [word1], [word2], [word3].
  2. From summary: Write a story based on this summary: [summary]
  3. With features: Write a short story with these features: [feature1], [feature2].

Always append \n\nStory:\n after your prompt.
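
A small helper that builds each of the three prompt types and appends the required suffix might look like this (the build_prompt function is illustrative, not part of the model card):

def build_prompt(kind, items):
    """Build one of the three instruction formats and append the story marker."""
    if kind == "words":
        instruction = "Write a short story using these words: " + ", ".join(items) + "."
    elif kind == "summary":
        instruction = "Write a story based on this summary: " + items
    elif kind == "features":
        instruction = "Write a short story with these features: " + ", ".join(items) + "."
    else:
        raise ValueError(f"unknown prompt kind: {kind}")
    return instruction + "\n\nStory:\n"

# Example:
print(build_prompt("words", ["brave", "forest", "magical"]))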

Training Details

  • Dataset: TinyStoriesInstruct (~117K unique examples)
  • Epochs: ~4
  • Batch Size: 4
  • Learning Rate: 5e-5
  • Method: SFT with response masking (loss computed only on the story tokens, not on the instruction; see the sketch below)
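
Response masking typically means the instruction tokens are excluded from the loss by setting their labels to -100, the ignore index of PyTorch's cross-entropy. The following is a minimal sketch of that idea, assuming the prompt/story format described above; it is not the card's actual training code:

import torch

def build_masked_example(tokenizer, instruction, story, max_length=512):
    """Tokenize instruction + story; mask the instruction so the loss covers only the story."""
    prompt = instruction + "\n\nStory:\n"
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    story_ids = tokenizer(story + tokenizer.eos_token, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + story_ids)[:max_length]
    labels = ([-100] * len(prompt_ids) + story_ids)[:max_length]  # -100 tokens are ignored by the loss

    return {
        "input_ids": torch.tensor(input_ids),
        "labels": torch.tensor(labels),
    }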
