TinyStories 30M - Base Model

A 30M parameter causal language model pre-trained on the TinyStories dataset.

Model Details

  • Parameters: ~30M (26.2M in the safetensors checkpoint)
  • Architecture: LLaMA-style transformer (see the config sketch below)
  • Precision: FP32
  • Training Data: TinyStories dataset
  • Training: Pre-trained from scratch using Nanotron
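
The architecture hyperparameters can be inspected from the Hub config without downloading the weights. A minimal sketch; the field names assume a standard LlamaConfig and can be verified against the model's config.json:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("Raising-an-llm/tinystories-30m-base")

# Standard LlamaConfig fields (assumed; check config.json for the actual keys)
print(config.model_type)           # expected: "llama"
print(config.hidden_size)          # embedding width
print(config.num_hidden_layers)    # number of transformer blocks
print(config.num_attention_heads)  # attention heads per layer
print(config.vocab_size)           # tokenizer vocabulary size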

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-base")
tokenizer = AutoTokenizer.from_pretrained("Raising-an-llm/tinystories-30m-base")

# Tokenize a story opening and generate a continuation
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
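
By default, generate uses greedy decoding, which tends to produce repetitive stories. Passing the standard transformers sampling arguments gives more varied output; a minimal sketch:

# Sampled generation for more varied stories (standard generate() arguments)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.8,  # soften the next-token distribution
    top_p=0.95,       # nucleus sampling cutoff
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))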

Training Details

This is the base pre-trained model, before instruction tuning. It was trained on the TinyStories dataset, which contains short, synthetically generated children's stories written with a simple vocabulary. As a base model it continues text rather than following instructions, so prompts work best as story openings (e.g., "Once upon a time").
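
The training corpus can be loaded for inspection with the datasets library. A sketch, assuming the public roneneldan/TinyStories dataset on the Hub (the card only says "TinyStories"):

from datasets import load_dataset

# Assumed dataset ID; the card does not name the exact Hub repository
dataset = load_dataset("roneneldan/TinyStories", split="train")
print(dataset[0]["text"])  # one short, simple children's story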
