I Built a Tiny AI That Explains the Universe

Community Article Published March 9, 2026

A fine-tuned 0.8B model that runs on your laptop and explains anything better than most textbooks.


"A towel is about the most massively useful thing an interstellar hitchhiker can have." — Douglas Adams

Douglas Adams was right about towels. He was also, accidentally, right about knowledge.

The best tool you can carry isn't a search engine. It's the ability to understand anything, quickly, without drowning in jargon or waiting for a slow web page to load. That's what I wanted to build. Something small. Something fast. Something that actually makes things click.

I called it Pocket Atlas.


The idea

Most language models are trained to do everything. They write code, draft emails, roleplay as pirates, and occasionally explain things. That last part — the explaining — is what I wanted to isolate and amplify.

Pocket Atlas does one thing:

Take any concept. Make it click.

It answers in a fixed 5-part structure:

What it is       — the honest one-sentence definition
Why it matters   — why any human being should care
How it works     — the mechanism, without the jargon
Simple example   — something you can picture
Key takeaway     — the thing worth remembering

The format is opinionated. That's the point. Good explanations have a shape, and training a model on thousands of well-shaped explanations teaches it that shape.
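To make the shape concrete, here's a sketch of what a single training record might look like in chat format. The field names and the answer text are illustrative, not the actual dataset schema:

```python
# A hypothetical Atlas Pages training record in chat format
# (field names and content are illustrative, not the real schema).
SYSTEM = (
    "You explain ideas clearly, warmly, and precisely. Use this format: "
    "What it is, Why it matters, How it works, Simple example, Key takeaway."
)

SECTIONS = ["What It Is", "Why It Matters", "How It Works",
            "Simple Example", "Key Takeaway"]

example = {
    "messages": [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Explain entropy."},
        {"role": "assistant", "content": (
            "What It Is: Entropy is a measure of how many ways a system "
            "can be arranged while looking the same from the outside.\n"
            "Why It Matters: It explains why heat flows one way and why "
            "time seems to have a direction.\n"
            "How It Works: Disordered arrangements vastly outnumber "
            "ordered ones, so systems drift toward disorder.\n"
            "Simple Example: Shuffle a sorted deck of cards; it never "
            "shuffles itself back into order.\n"
            "Key Takeaway: Disorder wins by sheer force of numbers."
        )},
    ]
}
```

Every example follows this shape, so after a few thousand of them the five sections become the model's default way of answering.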


The base model: Qwen3.5-0.8B

I built on Qwen3.5-0.8B — a 0.8-billion-parameter model from Alibaba that punches well above its weight class. It's small enough to run on a MacBook, a Raspberry Pi, or an iPhone via Ollama, and it's genuinely capable — a product of the recent wave of small models that have quietly become very good.


I disabled thinking mode (enable_thinking=False). Pocket Atlas gives direct answers. No internal chain-of-thought leakage. Just clean explanations.


The dataset: Atlas Pages

The dataset is called Atlas Pages. It's synthetic — generated by Claude — and lives at cetusian/atlas-pages on HuggingFace.

~18,000 examples across three complementary sources:

Source                               Count     What it teaches
Atlas Pages (5-part explanations)    ~6,400    The format: structured, warm, precise
arXiv abstracts                      ~8,000    Technical compression — dense → plain
XSum news summaries                  ~3,000    Radical brevity — one sentence, complete

Each source teaches a different skill. The synthetic data teaches the house format — the 5-part structure I want the model to adopt. arXiv teaches the model to compress dense technical content without losing the core idea. XSum teaches radical economy of words.

Together, they produce a model that can do all three: structure, compress, simplify.
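A minimal sketch of the mixing step, with toy stand-ins for the three sources (scaled down ~100×; in practice you'd load each from the Hub and use `datasets.concatenate_datasets`):

```python
import random

# Toy stand-ins for the three sources, scaled down ~100x
# (real counts: ~6,400 / ~8,000 / ~3,000).
atlas = [{"text": f"atlas page {i}", "source": "atlas"} for i in range(64)]
arxiv = [{"text": f"arxiv abstract {i}", "source": "arxiv"} for i in range(80)]
xsum  = [{"text": f"xsum summary {i}", "source": "xsum"} for i in range(30)]

# One shuffled pool, so every batch mixes all three skills
# instead of the model seeing them in blocks.
random.seed(42)
mixed = atlas + arxiv + xsum
random.shuffle(mixed)

# Hold out ~5% for validation, mirroring the ~17,000 / 900 split.
n_val = len(mixed) // 20
val, train = mixed[:n_val], mixed[n_val:]
```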

Generating Atlas Pages

I used the Anthropic Batch API to generate the core dataset. The Batch API has no rate limits and costs 50% less than standard API calls. I submitted ~6,600 topics at once and collected the results an hour later.

Topics spanned 13 categories: Science & Math, Physics & Cosmology, Biology, Psychology, Philosophy, Economics, Computing & AI, Everyday Concepts, History, Medicine, Law, Engineering, Culture & Society.

Total generation cost for the dataset: ~$7.
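The batch workflow boils down to building one request per topic and submitting them all at once. Here's a hedged sketch — the topic list, prompt wording, and model name are placeholders, not the exact ones used:

```python
# Sketch of batch generation with the Anthropic Batch API.
# Topics, prompt wording, and model name are illustrative;
# the real run submitted ~6,600 topics in one batch.

TOPICS = ["entropy", "the Fermi paradox", "compound interest"]

PROMPT = (
    "Explain {topic} using exactly these sections: What it is, "
    "Why it matters, How it works, Simple example, Key takeaway."
)

def build_requests(topics):
    """One batch request per topic, keyed by a stable custom_id."""
    return [
        {
            "custom_id": f"atlas-{i:05d}",
            "params": {
                "model": "claude-sonnet-4-5",  # placeholder model name
                "max_tokens": 1024,
                "messages": [{"role": "user",
                              "content": PROMPT.format(topic=t)}],
            },
        }
        for i, t in enumerate(topics)
    ]

# Submitting looks like this (requires the `anthropic` package and an API key):
#   import anthropic
#   client = anthropic.Anthropic()
#   batch = client.messages.batches.create(requests=build_requests(TOPICS))
#   # Poll batch.processing_status, then fetch results once it's "ended".
```

The `custom_id` is what lets you match each result back to its topic when the batch completes out of order.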


Training: LoRA on 2× A100s

Training used Unsloth — a LoRA fine-tuning library that's genuinely fast — on Modal's cloud infrastructure.

Method:      LoRA (r=16, alpha=16)
Base model:  unsloth/Qwen3.5-0.8B
Dataset:     ~17,000 train / 900 val
Batch size:  64 (2× A100 80GB, torchrun DDP)
Epochs:      1
Steps:       269 (with sequence packing)
LR:          2e-4 cosine decay
Optimizer:   AdamW 8-bit
Precision:   bf16
Train loss:  2.147

The big optimization was sequence packing (packing=True). Instead of padding each sequence to the maximum length, packing fills each batch slot with multiple shorter sequences laid end to end. This eliminated padding waste entirely and cut the step count from ~2,145 to 269 — an 8× reduction.
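The idea behind packing can be sketched in a few lines. Unsloth and TRL handle this internally; this toy version just shows why the step count collapses, using made-up token counts:

```python
# Minimal sketch of sequence packing: greedily concatenate tokenized
# examples into fixed-length slots instead of padding each one.
# (Unsloth/TRL do this for you via packing=True; lengths here are toys.)
MAX_LEN = 2048

def pack(lengths, max_len=MAX_LEN):
    """Greedy in-order packing of example lengths into fixed-size slots."""
    slots, current = [], 0
    for n in lengths:
        if current + n > max_len and current > 0:
            slots.append(current)   # slot is full; start a new one
            current = 0
        current += min(n, max_len)  # truncate anything over the context
    if current:
        slots.append(current)
    return slots

lengths = [300, 500, 220, 1900, 410, 128, 640]  # toy token counts
print(len(lengths), "padded slots ->", len(pack(lengths)), "packed slots")
```

Seven padded slots collapse to three packed ones here; across thousands of short explanations the savings compound into the ~8× step reduction above.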

The second optimization was moving from a notebook to a Modal Function with torchrun DDP across both A100s. Effective batch size went from 2 to 64. Training time: ~68 minutes.

Total compute cost: ~$2.50.


The results

Running on Apple Silicon (MLX):

Generation speed: ~52 tokens/second
Peak memory: 1.6 GB

Ask it anything:

Explain the Fermi paradox.

What It Is: The Fermi paradox is the puzzling observation that, despite the high probability of extraterrestrial civilizations existing, we have found no evidence of them.

Why It Matters: It forces us to confront deep questions: Is life rare? Do civilizations destroy themselves? Are we simply not listening hard enough?

How It Works: Physicist Enrico Fermi reasoned: the universe is 13.8 billion years old, there are hundreds of billions of stars like our Sun, many with Earth-like planets — so where is everyone? The math says they should be everywhere. The silence says otherwise.

Simple Example: Imagine a huge party where statistically thousands of people should be in the kitchen, but you walk in and it's completely empty. You start to wonder: did everyone leave early? Are you the only one who showed up?

Key Takeaway: The Fermi paradox isn't a proof that we're alone — it's a proof that something in our assumptions is wrong. Which assumption? That's the question that keeps astronomers up at night.


Total cost

Item                               Cost
Dataset generation (Batch API)     ~$6.86
arXiv + XSum (HuggingFace)         $0
Modal training (2× A100, 68 min)   ~$2.50
Total                              ~$9.36

A specialized edge AI explainer, fine-tuned end to end, for less than the price of a lunch.


Running it yourself

MLX (Apple Silicon):

from mlx_lm import load, generate

model, tokenizer = load("cetusian/pocket-atlas-0.8b")
messages = [
    {"role": "system", "content": "You explain ideas clearly, warmly, and precisely. Use this format: What it is, Why it matters, How it works, Simple example, Key takeaway."},
    {"role": "user", "content": "Explain entropy."}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
response = generate(model, tokenizer, prompt=prompt, max_tokens=600, verbose=True)

Transformers:

from transformers import pipeline
pipe = pipeline("text-generation", model="cetusian/pocket-atlas-0.8b")
out = pipe([{"role": "user", "content": "Explain entropy."}], max_new_tokens=600)

GGUF (Ollama / llama.cpp): Download pocket-atlas-q4_k_m.gguf from the model page.
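If you'd rather point Ollama at the GGUF directly, a minimal Modelfile works. The system prompt mirrors the one used above; treat the filename as whatever you downloaded:

```
FROM ./pocket-atlas-q4_k_m.gguf
SYSTEM """You explain ideas clearly, warmly, and precisely. Use this format: What it is, Why it matters, How it works, Simple example, Key takeaway."""
```

Then `ollama create pocket-atlas -f Modelfile` followed by `ollama run pocket-atlas`.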


What's next

The current model was trained on 18k examples. The next version will add:

  • Angle variations — ELI5, analogy-only, misconceptions, applications (~4,000 examples)
  • Multi-turn dialogues — follow-up questions, going deeper (~3,000 examples)
  • More topic categories — Neuroscience, Linguistics, Game Theory, Cryptography, and more (~2,700 topics)

Total dataset target: ~28,000 examples. Training the 2B variant is also on the roadmap.


Links


Don't panic.
