---
language:
  - en
license: mit
tags:
  - tinystories
  - causal-lm
  - small-lm
datasets:
  - ltg/babylm-2024-baby-cosmo-fine-10m
pipeline_tag: text-generation
---

# TinyStories 30M - Base Model

A 30M-parameter causal language model pre-trained on the ltg/babylm-2024-baby-cosmo-fine-10m dataset.

## Model Details

- **Parameters:** ~30M
- **Architecture:** LLaMA-style transformer
- **Training Data:** BabyLM (ltg/babylm-2024-baby-cosmo-fine-10m)
- **Training:** Pre-trained from scratch using Nanotron

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Raising-an-llm/tinystories-30m-base")
tokenizer = AutoTokenizer.from_pretrained("Raising-an-llm/tinystories-30m-base")

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```
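Greedy decoding (the default above) can get repetitive on small models. Below is a minimal sketch of sampled generation using standard `generate` arguments; the temperature and top-p values are illustrative, not tuned for this model:

```python
# Sampled generation; these hyperparameters are illustrative, not tuned.
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```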

## Training Details

This is the base pre-trained model, before instruction tuning. It was trained on the ltg/babylm-2024-baby-cosmo-fine-10m dataset, which contains simple children's phrases.
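To inspect the training corpus, here is a minimal sketch using the `datasets` library; it assumes the dataset's default configuration on the Hugging Face Hub:

```python
from datasets import load_dataset

# Load the pre-training corpus; assumes the default configuration
# (print the DatasetDict to see which splits are actually available).
ds = load_dataset("ltg/babylm-2024-baby-cosmo-fine-10m")
print(ds)
```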

## Related Models