I Built a Tiny AI That Explains the Universe

Community Article Published March 9, 2026

A fine-tuned 0.8B model that runs on your laptop and explains anything better than most textbooks.


"A towel is about the most massively useful thing an interstellar hitchhiker can have." — Douglas Adams

Douglas Adams was right about towels. He was also, accidentally, right about knowledge.

The best tool you can carry isn't a search engine. It's the ability to understand anything, quickly, without drowning in jargon or waiting for a slow web page to load. That's what I wanted to build. Something small. Something fast. Something that actually makes things click.

I called it Pocket Atlas.


The idea

Most language models are trained to do everything. They write code, draft emails, roleplay as pirates, and occasionally explain things. That last part — the explaining — is what I wanted to isolate and amplify.

Pocket Atlas does one thing:

Take any concept. Make it click.

It answers in a fixed 5-part structure:

What it is       — the honest one-sentence definition
Why it matters   — why any human being should care
How it works     — the mechanism, without the jargon
Simple example   — something you can picture
Key takeaway     — the thing worth remembering

The format is opinionated. That's the point. Good explanations have a shape, and training a model on thousands of well-shaped explanations teaches it that shape.
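To make the shape concrete, here's a sketch of what a single training record might look like in chat format. The field names and the answer text are illustrative, not the actual dataset schema:

```python
# A hypothetical Atlas Pages training record in chat format
# (field names and content are illustrative, not the real schema).
SYSTEM = (
    "You explain ideas clearly, warmly, and precisely. Use this format: "
    "What it is, Why it matters, How it works, Simple example, Key takeaway."
)

SECTIONS = ["What It Is", "Why It Matters", "How It Works",
            "Simple Example", "Key Takeaway"]

example = {
    "messages": [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Explain entropy."},
        {"role": "assistant", "content": (
            "What It Is: Entropy is a measure of how many ways a system "
            "can be arranged while looking the same from the outside.\n"
            "Why It Matters: It explains why heat flows one way and why "
            "time seems to have a direction.\n"
            "How It Works: Disordered arrangements vastly outnumber "
            "ordered ones, so systems drift toward disorder.\n"
            "Simple Example: Shuffle a sorted deck of cards; it never "
            "shuffles itself back into order.\n"
            "Key Takeaway: Disorder wins by sheer force of numbers."
        )},
    ]
}
```

Every example follows this shape, so after a few thousand of them the five sections become the model's default way of answering.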


The base model: Qwen3.5-0.8B

I built on Qwen3.5-0.8B — a 0.8-billion-parameter model from Alibaba that punches well above its weight class. It's small enough to run on a MacBook, a Raspberry Pi, or an iPhone via Ollama, and it's genuinely capable — a product of the recent wave of small models that have quietly become very good.


I disabled thinking mode (enable_thinking=False). Pocket Atlas gives direct answers. No internal chain-of-thought leakage. Just clean explanations.


The dataset: Atlas Pages

The dataset is called Atlas Pages. It's synthetic — generated by Claude — and lives at cetusian/atlas-pages on HuggingFace.

~18,000 examples across three complementary sources:

Source                               Count     What it teaches
Atlas Pages (5-part explanations)    ~6,400    The format: structured, warm, precise
arXiv abstracts                      ~8,000    Technical compression — dense → plain
XSum news summaries                  ~3,000    Radical brevity — one sentence, complete

Each source teaches a different skill. The synthetic data teaches the house format — the 5-part structure I want the model to adopt. arXiv teaches the model to compress dense technical content without losing the core idea. XSum teaches radical economy of words.

Together, they produce a model that can do all three: structure, compress, simplify.
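A minimal sketch of the mixing step, with toy stand-ins for the three sources (scaled down ~100×; in practice you'd load each from the Hub and use `datasets.concatenate_datasets`):

```python
import random

# Toy stand-ins for the three sources, scaled down ~100x
# (real counts: ~6,400 / ~8,000 / ~3,000).
atlas = [{"text": f"atlas page {i}", "source": "atlas"} for i in range(64)]
arxiv = [{"text": f"arxiv abstract {i}", "source": "arxiv"} for i in range(80)]
xsum  = [{"text": f"xsum summary {i}", "source": "xsum"} for i in range(30)]

# One shuffled pool, so every batch mixes all three skills
# instead of the model seeing them in blocks.
random.seed(42)
mixed = atlas + arxiv + xsum
random.shuffle(mixed)

# Hold out ~5% for validation, mirroring the ~17,000 / 900 split.
n_val = len(mixed) // 20
val, train = mixed[:n_val], mixed[n_val:]
```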

Generating Atlas Pages

I used the Anthropic Batch API to generate the core dataset. The Batch API has no rate limits and costs 50% less than standard API calls. I submitted ~6,600 topics at once and collected the results an hour later.

Topics spanned 13 categories: Science & Math, Physics & Cosmology, Biology, Psychology, Philosophy, Economics, Computing & AI, Everyday Concepts, History, Medicine, Law, Engineering, Culture & Society.

Total generation cost for the dataset: ~$7.
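The batch workflow boils down to building one request per topic and submitting them all at once. Here's a hedged sketch — the topic list, prompt wording, and model name are placeholders, not the exact ones used:

```python
# Sketch of batch generation with the Anthropic Batch API.
# Topics, prompt wording, and model name are illustrative;
# the real run submitted ~6,600 topics in one batch.

TOPICS = ["entropy", "the Fermi paradox", "compound interest"]

PROMPT = (
    "Explain {topic} using exactly these sections: What it is, "
    "Why it matters, How it works, Simple example, Key takeaway."
)

def build_requests(topics):
    """One batch request per topic, keyed by a stable custom_id."""
    return [
        {
            "custom_id": f"atlas-{i:05d}",
            "params": {
                "model": "claude-sonnet-4-5",  # placeholder model name
                "max_tokens": 1024,
                "messages": [{"role": "user",
                              "content": PROMPT.format(topic=t)}],
            },
        }
        for i, t in enumerate(topics)
    ]

# Submitting looks like this (requires the `anthropic` package and an API key):
#   import anthropic
#   client = anthropic.Anthropic()
#   batch = client.messages.batches.create(requests=build_requests(TOPICS))
#   # Poll batch.processing_status, then fetch results once it's "ended".
```

The `custom_id` is what lets you match each result back to its topic when the batch completes out of order.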


Training: LoRA on 2× A100s

Training used Unsloth — a LoRA fine-tuning library that's genuinely fast — on Modal's cloud infrastructure.

Method:      LoRA (r=16, alpha=16)
Base model:  unsloth/Qwen3.5-0.8B
Dataset:     ~17,000 train / 900 val
Batch size:  64 (2× A100 80GB, torchrun DDP)
Epochs:      1
Steps:       269 (with sequence packing)
LR:          2e-4 cosine decay
Optimizer:   AdamW 8-bit
Precision:   bf16
Train loss:  2.147

The big optimization was sequence packing (packing=True). Instead of padding each sequence to the maximum length, packing fills each batch slot with multiple shorter sequences laid end to end. This eliminated padding waste entirely and cut the step count from ~2,145 to 269 — an 8× reduction.
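The idea behind packing can be sketched in a few lines. Unsloth and TRL handle this internally; this toy version just shows why the step count collapses, using made-up token counts:

```python
# Minimal sketch of sequence packing: greedily concatenate tokenized
# examples into fixed-length slots instead of padding each one.
# (Unsloth/TRL do this for you via packing=True; lengths here are toys.)
MAX_LEN = 2048

def pack(lengths, max_len=MAX_LEN):
    """Greedy in-order packing of example lengths into fixed-size slots."""
    slots, current = [], 0
    for n in lengths:
        if current + n > max_len and current > 0:
            slots.append(current)   # slot is full; start a new one
            current = 0
        current += min(n, max_len)  # truncate anything over the context
    if current:
        slots.append(current)
    return slots

lengths = [300, 500, 220, 1900, 410, 128, 640]  # toy token counts
print(len(lengths), "padded slots ->", len(pack(lengths)), "packed slots")
```

Seven padded slots collapse to three packed ones here; across thousands of short explanations the savings compound into the ~8× step reduction above.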

The second optimization was moving from a notebook to a Modal Function with torchrun DDP across both A100s. Effective batch size went from 2 to 64. Training time: ~68 minutes.

Total compute cost: ~$2.50.


The results

Running on Apple Silicon (MLX):

Generation speed: ~52 tokens/second
Peak memory: 1.6 GB

Ask it anything:

Explain the Fermi paradox.

What It Is: The Fermi paradox is the puzzling observation that, despite the high probability of extraterrestrial civilizations existing, we have found no evidence of them.

Why It Matters: It forces us to confront deep questions: Is life rare? Do civilizations destroy themselves? Are we simply not listening hard enough?

How It Works: Physicist Enrico Fermi reasoned: the universe is 13.8 billion years old, there are hundreds of billions of stars like our Sun, many with Earth-like planets — so where is everyone? The math says they should be everywhere. The silence says otherwise.

Simple Example: Imagine a huge party where statistically thousands of people should be in the kitchen, but you walk in and it's completely empty. You start to wonder: did everyone leave early? Are you the only one who showed up?

Key Takeaway: The Fermi paradox isn't a proof that we're alone — it's a proof that something in our assumptions is wrong. Which assumption? That's the question that keeps astronomers up at night.


Total cost

Item                               Cost
Dataset generation (Batch API)     ~$6.86
arXiv + XSum (HuggingFace)         $0
Modal training (2× A100, 68 min)   ~$2.50
Total                              ~$9.36

A specialized edge AI explainer, fine-tuned end to end, for less than the price of a lunch.


Running it yourself

MLX (Apple Silicon):

from mlx_lm import load, generate

model, tokenizer = load("cetusian/pocket-atlas-0.8b")
messages = [
    {"role": "system", "content": "You explain ideas clearly, warmly, and precisely. Use this format: What it is, Why it matters, How it works, Simple example, Key takeaway."},
    {"role": "user", "content": "Explain entropy."}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
response = generate(model, tokenizer, prompt=prompt, max_tokens=600, verbose=True)

Transformers:

from transformers import pipeline
pipe = pipeline("text-generation", model="cetusian/pocket-atlas-0.8b")
out = pipe([{"role": "user", "content": "Explain entropy."}], max_new_tokens=600)

GGUF (Ollama / llama.cpp): Download pocket-atlas-q4_k_m.gguf from the model page.
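If you'd rather point Ollama at the GGUF directly, a minimal Modelfile works. The system prompt mirrors the one used above; treat the filename as whatever you downloaded:

```
FROM ./pocket-atlas-q4_k_m.gguf
SYSTEM """You explain ideas clearly, warmly, and precisely. Use this format: What it is, Why it matters, How it works, Simple example, Key takeaway."""
```

Then `ollama create pocket-atlas -f Modelfile` followed by `ollama run pocket-atlas`.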


What's next

The current model was trained on 18k examples. The next version will add:

  • Angle variations — ELI5, analogy-only, misconceptions, applications (~4,000 examples)
  • Multi-turn dialogues — follow-up questions, going deeper (~3,000 examples)
  • More topic categories — Neuroscience, Linguistics, Game Theory, Cryptography, and more (~2,700 topics)

Total dataset target: ~28,000 examples. Training the 2B variant is also on the roadmap.


Links


Don't panic.
