# PicoLM-0.5M

Part of the PicoLM family by Tralalabs.
A ~0.48M-parameter, GPT-2-style causal language model pretrained from scratch with a custom 4,096-token BPE tokenizer. It is the smallest model in the PicoLM family and was trained in about 25 minutes on a single NVIDIA T4 GPU.
| Property | Value |
|---|---|
| Architecture | GPT-2 (decoder-only transformer) |
| Parameters | ~0.48M |
| Context length | 256 tokens |
| Vocabulary size | 4,096 (custom BPE) |
| Layers | 4 |
| Attention heads | 4 |
| Hidden size | 64 |
| FFN size | 256 |
| Tokenizer | Custom BPE trained on TinyStories |
| Training steps | 5,000 |
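The hyperparameters above fit together in the usual GPT-2 proportions; a minimal arithmetic check (values taken from the table, variable names are illustrative):

```python
# Sanity-check the hyperparameters listed in the table above.
hidden_size = 64
num_heads = 4
ffn_size = 256

# The hidden size must split evenly across attention heads.
assert hidden_size % num_heads == 0
head_dim = hidden_size // num_heads   # 64 / 4 = 16 dims per head

# The FFN inner size follows the standard GPT-2 expansion ratio of 4x.
assert ffn_size == 4 * hidden_size

print(head_dim)  # 16
```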
Hardware: Google Colab, NVIDIA T4 (15GB VRAM)
Dataset mix: TinyStories, WikiText-2, and Project Gutenberg.
Usage:
```python
from transformers import GPT2LMHeadModel, PreTrainedTokenizerFast, GenerationConfig

# Load the tokenizer and model from the Hub
tokenizer = PreTrainedTokenizerFast.from_pretrained("Tralalabs/PicoLM-0.5M")
model = GPT2LMHeadModel.from_pretrained("Tralalabs/PicoLM-0.5M")

inputs = tokenizer("Once upon a time", return_tensors="pt")

# Sample up to 60 new tokens with mild temperature
gen_config = GenerationConfig(
    max_new_tokens=60,
    do_sample=True,
    temperature=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

out = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
Sample generations:

Prompt: Once upon a time
Output: Once upon a time the Itfore, or more as he must not of the place. As he could ween her, "I said, and all his head...

Prompt: In the beginning
Output: In the beginning of the "I What did I think you, so the devan the I would have done...
| Component | Params |
|---|---|
| Token embedding (4096 x 64) | 262,144 |
| Position embedding (256 x 64) | 16,384 |
| 4 transformer layers | 198,656 |
| Final LayerNorm | 128 |
| LM head (tied to embedding) | 0 |
| Total | ~477,312 |
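The totals in the breakdown above can be reproduced with a few lines of arithmetic (the per-layer figure is taken directly from the table; the tied LM head contributes no extra parameters):

```python
# Reproduce the parameter-count table above.
vocab_size, hidden, ctx_len = 4096, 64, 256

tok_emb = vocab_size * hidden   # 262,144: token embedding (4096 x 64)
pos_emb = ctx_len * hidden      # 16,384: position embedding (256 x 64)
layers = 198_656                # 4 transformer layers, from the table
final_ln = 2 * hidden           # 128: final LayerNorm weight + bias
lm_head = 0                     # weight-tied to the token embedding

total = tok_emb + pos_emb + layers + final_ln + lm_head
print(f"{total:,}")  # 477,312
```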
English only. All training datasets are English; the model has no meaningful multilingual capability.
| Property | Value |
|---|---|
| Training data cutoff | 2023 (TinyStories generation date) |
| Knowledge cutoff | ~2016 (WikiText-2 Wikipedia snapshot) |
| Real-world knowledge | Effectively none |
| Oldest data | Pre-1928 (Project Gutenberg) |
| Model | Params | Status |
|---|---|---|
| PicoLM-0.5M | 0.48M | Released |
| PicoLM-15M | 19M | Released |
| PicoLM-15M-IT | 19M | Released |
| PicoLM-60M | 60M | Planned |
Citation:

```bibtex
@misc{picolm2026,
  author    = {Tralalabs},
  title     = {PicoLM-0.5M: A Minimum Viable Language Model},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Tralalabs/PicoLM-0.5M}
}
```