---
language:
- en
license: mit
tags:
- tinystories
- causal-lm
- small-lm
- instruction-tuned
- sft
datasets:
- ltg/babylm-2024-baby-cosmo-fine-10m
- roneneldan/TinyStoriesInstruct
pipeline_tag: text-generation
---
# TinyStories 30M - Instruct

A 30M-parameter causal language model fine-tuned for instruction-following on the TinyStoriesInstruct dataset.
## Model Details

- **Parameters**: ~30M
- **Architecture**: LLaMA-style transformer
- **Base Model**: TinyStories 30M pre-trained
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-instruct")
tokenizer = AutoTokenizer.from_pretrained("Raising-an-llm/tinystories-30m-instruct")

# Expected input format: prompt + "\n\nStory:\n"
prompt = "Write a short story using these words: brave, forest, magical."
formatted = prompt + "\n\nStory:\n"

inputs = tokenizer(formatted, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Prompt Formats

The model was trained on three types of prompts:

1. **Using words**: `Write a short story using these words: [word1], [word2], [word3].`
2. **From summary**: `Write a story based on this summary: [summary]`
3. **With features**: `Write a short story with these features: [feature1], [feature2].`

Always append `\n\nStory:\n` after your prompt.
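As an illustration, the three templates above can be wrapped in a small helper that also appends the required suffix. `format_prompt` is a hypothetical convenience function for this card, not part of the model or the `transformers` library:

```python
# Hypothetical helper: builds a prompt in one of the three trained
# formats and appends the required "\n\nStory:\n" suffix.
def format_prompt(kind, *, words=None, summary=None, features=None):
    if kind == "words":
        body = f"Write a short story using these words: {', '.join(words)}."
    elif kind == "summary":
        body = f"Write a story based on this summary: {summary}"
    elif kind == "features":
        body = f"Write a short story with these features: {', '.join(features)}."
    else:
        raise ValueError(f"unknown prompt kind: {kind}")
    return body + "\n\nStory:\n"

# Example: reproduces the prompt used in the Usage section above.
print(format_prompt("words", words=["brave", "forest", "magical"]))
```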
## Training Details

- **Dataset**: TinyStoriesInstruct (~117K unique examples)
- **Epochs**: ~4
- **Batch Size**: 4
- **Learning Rate**: 5e-5
- **Method**: SFT with response masking (loss computed only on the story tokens, not the instruction)
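Response masking means the loss is computed only on the response (story) tokens. A minimal sketch of the idea using plain lists and the conventional `-100` ignore index of cross-entropy loss; the actual training code is not published here, so this is illustrative only:

```python
# Sketch of SFT response masking: labels for the instruction portion are
# set to an ignore index so the loss covers only the story tokens.
IGNORE_INDEX = -100  # conventional ignore label for cross-entropy loss

def mask_instruction(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len
    positions (the instruction) so they contribute no loss."""
    labels = list(input_ids)
    for i in range(min(prompt_len, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example with illustrative token IDs: the first 3 tokens (the
# instruction) are masked, the story tokens are kept as labels.
print(mask_instruction([5, 6, 7, 8, 9], 3))  # → [-100, -100, -100, 8, 9]
```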
## Related Models

- [Raising-an-llm/tinystories-30m-base](https://huggingface.co/Raising-an-llm/tinystories-30m-base) - Base pre-trained model
|