# TinyStories 30M - Base Model
A 30M parameter causal language model pre-trained on the TinyStories dataset.
## Model Details
- Parameters: ~30M
- Architecture: LLaMA-style transformer
- Training Data: TinyStories dataset
- Training: Pre-trained from scratch using Nanotron
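The approximate parameter count can be verified directly from the loaded checkpoint; a minimal sketch:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-base")

# Sum the element counts of all parameter tensors
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.1f}M parameters")
```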
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-base")
tokenizer = AutoTokenizer.from_pretrained("Raising-an-llm/tinystories-30m-base")

# Generate a story continuation from a prompt
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
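Greedy decoding (the default above) always produces the same continuation for a given prompt. For more varied stories, sampling can be enabled; the settings below are illustrative rather than values tuned by the model authors:

```python
# Sample with moderate temperature and nucleus filtering (illustrative values)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```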
## Training Details
This is the base pre-trained model, before instruction tuning. It was trained from scratch on the TinyStories dataset, a corpus of short, simple children's stories.
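To inspect the training corpus, the public TinyStories dataset can be loaded with the `datasets` library; a minimal sketch, assuming the standard `roneneldan/TinyStories` Hub repo (whether this exact repo was the training source is an assumption):

```python
from datasets import load_dataset

# Load the public TinyStories dataset from the Hugging Face Hub
# (assumed to match the training data of this model)
dataset = load_dataset("roneneldan/TinyStories", split="train")
print(dataset[0]["text"])
```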
## Related Models
- [Raising-an-llm/tinystories-30m-instruct](https://huggingface.co/Raising-an-llm/tinystories-30m-instruct) - Instruction-tuned version