Tiny-ML Leaderboard

Sub-100M parameter language models, same eval harness, transparent methodology.

Why this exists. The community deserves a single place to compare tiny LMs fairly. We include every model with verifiable benchmarks: ours, our competitors', and yours. Submit a model via PR.

Detailed Results

Model	Org	Params	WikiText-2 ↓	BLiMP ↑	ARC-Easy ↑	Training Tokens	Links

Benchmark Overview

Legend (by org): CompactAI, SupraLabs, Axiomic Labs, Mihai Popa

BLiMP ↑ (higher is better)

ARC-Easy ↑ (higher is better)

WikiText-2 ↓ (lower is better)

Model Efficiency

Parameters vs Avg Score — high efficiency zone (≥1σ above trend)

Faint dashed line: average trend
Bold dashed line: high-efficiency threshold (trend + 1σ)
Yellow shaded area: models outperforming expectations for their size
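The "trend + 1σ" threshold can be sketched as follows. This is a minimal illustration, not the leaderboard's actual code. It assumes the trend is a least-squares line fitted to the average score against log10(parameter count), and that σ is the population standard deviation of the residuals around that line. The function names `fit_trend` and `high_efficiency` are hypothetical, and the page does not say how "Avg Score" is computed, so the sketch takes it as an input.

```python
import math
from statistics import pstdev


def fit_trend(params, scores):
    """Least-squares line: score ~ slope * log10(params) + intercept.

    Assumption: the trend is linear in log10(params); the page does not say.
    """
    xs = [math.log10(p) for p in params]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct parameter counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def high_efficiency(params, scores):
    """Flag models whose score is at least 1 sigma above the fitted trend."""
    slope, intercept = fit_trend(params, scores)
    residuals = [
        y - (slope * math.log10(p) + intercept)
        for p, y in zip(params, scores)
    ]
    sigma = pstdev(residuals)  # spread of the residuals around the trend
    return [r >= sigma for r in residuals]


if __name__ == "__main__":
    # Made-up numbers for illustration only; not leaderboard results.
    sizes = [10**6, 10**7, 10**8, 10**6, 10**7, 10**8]
    avg_scores = [50, 55, 60, 50, 65, 60]
    print(high_efficiency(sizes, avg_scores))
```

With these made-up inputs, only the fifth model sits in the shaded zone: its score is well above what the fitted line predicts for its size, while the others fall slightly below the trend.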

Add your model

Open a PR on this Space with your model's benchmark results and reproduction steps. We require: params, training data provenance, eval harness used, and scores for at least 2 of the 3 benchmarks.
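Before opening a PR, the requirements above can be checked with a short script. This is a sketch under stated assumptions: the leaderboard does not publish a submission schema here, so the field names (`params`, `data_provenance`, `eval_harness`, `scores`) and benchmark keys are hypothetical, and the size limit is taken from the "sub-100M" tagline.

```python
# Hypothetical field and benchmark names; adapt them to the real PR format.
BENCHMARKS = ("wikitext2", "blimp", "arc_easy")
REQUIRED_FIELDS = ("params", "data_provenance", "eval_harness")
MAX_PARAMS = 100_000_000  # "sub-100M" means strictly fewer than this


def validate_submission(submission):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not submission.get(field):
            problems.append(f"missing field: {field}")

    params = submission.get("params")
    if isinstance(params, int) and params >= MAX_PARAMS:
        problems.append("model is not under 100M parameters")

    scores = submission.get("scores") or {}
    reported = [b for b in BENCHMARKS if scores.get(b) is not None]
    if len(reported) < 2:
        problems.append("scores needed for at least 2 of the 3 benchmarks")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {
        "params": 45_000_000,
        "data_provenance": "describe your training data sources here",
        "eval_harness": "name and version of the harness you ran",
        "scores": {"blimp": 70.0, "arc_easy": 40.0},
    }
    print(validate_submission(example))
```

An empty list means the submission meets the stated minimum; reproduction steps still need to be included in the PR itself.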