---
language:
- en
license: mit
tags:
- tinystories
- causal-lm
- small-lm
- instruction-tuned
- sft
datasets:
- ltg/babylm-2024-baby-cosmo-fine-10m
- roneneldan/TinyStoriesInstruct
pipeline_tag: text-generation
---
# TinyStories 30M - Instruct

A 30M-parameter causal language model fine-tuned for instruction-following on the TinyStoriesInstruct dataset.
## Model Details

- **Parameters**: ~30M
- **Architecture**: LLaMA-style transformer
- **Base Model**: TinyStories 30M pre-trained
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-instruct")
tokenizer = AutoTokenizer.from_pretrained("Raising-an-llm/tinystories-30m-instruct")

# Expected input format: prompt + "\n\nStory:\n"
prompt = "Write a short story using these words: brave, forest, magical."
formatted = prompt + "\n\nStory:\n"

inputs = tokenizer(formatted, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Prompt Formats

The model was trained on three types of prompts:

1. **Using words**: `Write a short story using these words: [word1], [word2], [word3].`
2. **From summary**: `Write a story based on this summary: [summary]`
3. **With features**: `Write a short story with these features: [feature1], [feature2].`

Always append `\n\nStory:\n` after your prompt.
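As an illustration, the three templates above can be wrapped in a small helper that also appends the required suffix. `format_prompt` is a hypothetical convenience function for this card, not part of the model or the `transformers` library:

```python
# Hypothetical helper: builds a prompt in one of the three trained
# formats and appends the required "\n\nStory:\n" suffix.
def format_prompt(kind, *, words=None, summary=None, features=None):
    if kind == "words":
        body = f"Write a short story using these words: {', '.join(words)}."
    elif kind == "summary":
        body = f"Write a story based on this summary: {summary}"
    elif kind == "features":
        body = f"Write a short story with these features: {', '.join(features)}."
    else:
        raise ValueError(f"unknown prompt kind: {kind}")
    return body + "\n\nStory:\n"

# Example: reproduces the prompt used in the Usage section above.
print(format_prompt("words", words=["brave", "forest", "magical"]))
```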
## Training Details

- **Dataset**: TinyStoriesInstruct (~117K unique examples)
- **Epochs**: ~4
- **Batch Size**: 4
- **Learning Rate**: 5e-5
- **Method**: SFT with response masking (loss computed only on the story tokens, not the instruction)
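Response masking means the loss is computed only on the response (story) tokens. A minimal sketch of the idea using plain lists and the conventional `-100` ignore index of cross-entropy loss; the actual training code is not published here, so this is illustrative only:

```python
# Sketch of SFT response masking: labels for the instruction portion are
# set to an ignore index so the loss covers only the story tokens.
IGNORE_INDEX = -100  # conventional ignore label for cross-entropy loss

def mask_instruction(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len
    positions (the instruction) so they contribute no loss."""
    labels = list(input_ids)
    for i in range(min(prompt_len, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example with illustrative token IDs: the first 3 tokens (the
# instruction) are masked, the story tokens are kept as labels.
print(mask_instruction([5, 6, 7, 8, 9], 3))  # → [-100, -100, -100, 8, 9]
```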
## Related Models

- [Raising-an-llm/tinystories-30m-base](https://huggingface.co/Raising-an-llm/tinystories-30m-base) - Base pre-trained model
|