PicoLM
The PicoLM family made by Tralalabs
A 19M-parameter GPT-2-style causal language model pretrained from scratch on a mix of TinyStories and FineWeb web text. Trained in ~45 minutes on a single NVIDIA T4 GPU.
| Property | Value |
|---|---|
| Architecture | GPT-2 (decoder-only transformer) |
| Parameters | ~19M |
| Context length | 512 tokens |
| Vocabulary size | 49,152 |
| Layers | 8 |
| Attention heads | 8 |
| Hidden size | 256 |
| FFN size | 1024 |
| Tokenizer | SmolLM2-135M (HuggingFaceTB) |
| Training steps | 8,000 |
| Final loss | ~3.6–4.2 |
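As a sanity check, the dimensions in the table do account for the stated ~19M parameters, assuming standard GPT-2 weight shapes (tied input/output embeddings, biases on all linear layers, learned positional embeddings):

```python
# Back-of-the-envelope GPT-2 parameter count from the table's dimensions.
# Assumes tied embeddings, so the LM head adds no extra weights.
V, T, L, H, F = 49_152, 512, 8, 256, 1024  # vocab, context, layers, hidden, FFN

tok_emb = V * H              # token embedding (shared with LM head)
pos_emb = T * H              # learned positional embedding
per_block = (
    2 * (2 * H)              # two LayerNorms (scale + bias each)
    + H * 3 * H + 3 * H      # fused q/k/v projection
    + H * H + H              # attention output projection
    + H * F + F              # MLP up-projection
    + F * H + H              # MLP down-projection
)
final_ln = 2 * H
total = tok_emb + pos_emb + L * per_block + final_ln
print(f"{total:,}")  # 19,032,576 ≈ 19M
```

Most of the budget (~12.6M of 19M) sits in the token embedding, which is typical at this scale with a 49K vocabulary.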
Hardware: Google Colab, NVIDIA T4 (15GB VRAM)
Dataset mix:
- TinyStories
- FineWeb (sample-10BT), deduplicated Common Crawl web text

Usage:
```python
from transformers import AutoTokenizer, GPT2LMHeadModel, pipeline

tokenizer = AutoTokenizer.from_pretrained("Tralalabs/PicoLM-15M")
model = GPT2LMHeadModel.from_pretrained("Tralalabs/PicoLM-15M")

# Sampled generation; lower the temperature for more conservative output.
gen = pipeline("text-generation", model=model, tokenizer=tokenizer)
output = gen("Once upon a time", max_new_tokens=100, do_sample=True, temperature=0.8)
print(output[0]["generated_text"])
```
Prompt: Once upon a time
Once upon a time, there was a little girl named Lily. She loved to play outside and play with her ball. One day, she's friend Lily came to play outside...
Prompt: The history of the internet
The history of the internet. And the new world we have found in the last year of 110 in the world. The group of the people from the American leaders...
Prompt: Artificial intelligence is
Artificial intelligence is not good, but not even not yet in order to bring on the world of the world...
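The rough quality of these samples is consistent with the reported final loss: a cross-entropy of ~3.6–4.2 nats per token corresponds to a perplexity of roughly 37–67, i.e. the model is effectively choosing among several dozen plausible next tokens at each step:

```python
import math

# Cross-entropy in nats/token converts to perplexity via exp().
for loss in (3.6, 4.2):
    print(f"loss {loss:.1f} -> perplexity {math.exp(loss):.1f}")
```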
```bibtex
@misc{picolm2026,
  author = {Tralalabs},
  title = {PicoLM-15M: A Small GPT-style Language Model},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Tralalabs/PicoLM-15M}
}
```