Atome LM

A reference implementation of a routed-ternary tiny language model with a bit-exact Python ↔ C99 inference engine, sized for microcontroller-class RAM budgets.

The contribution is integration, not a new architecture: a complete train → ternary export → base-3 packing → C99 inference path, with bit-exact Python ↔ C parity enforced by tests. It combines three known ideas — ternary weights (BitNet b1.58), a per-token-routed 3-pathway block (Hymba, MossNet), and a byte tokenizer at super-tiny scale (Guertler 2024).
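
The "ternary export → base-3 packing" steps can be illustrated with a minimal sketch. It assumes absmean ternarization in the style of BitNet b1.58 and a dense layout of five ternary digits per byte (3^5 = 243 fits in one byte). The function names and the digit order are illustrative assumptions; the actual `ATOME01` blob layout and the repo's exporter are not described here and may differ.

```python
def absmean_ternarize(weights):
    """Quantize floats to {-1, 0, +1} using the mean absolute value as scale.

    Assumption: BitNet b1.58-style absmean scaling; not taken from the Atome repo.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    trits = [max(-1, min(1, round(w / scale))) for w in weights]
    return trits, scale


def pack_trits(trits):
    """Pack ternary values 5 per byte; the first trit is the least significant digit."""
    out = bytearray()
    for i in range(0, len(trits), 5):
        value = 0
        for t in reversed(trits[i:i + 5]):
            if t not in (-1, 0, 1):
                raise ValueError(f"not a ternary value: {t!r}")
            value = value * 3 + (t + 1)  # map -1/0/+1 to base-3 digits 0/1/2
        out.append(value)  # at most 242, so it always fits in one byte
    return bytes(out)


def unpack_trits(data, count):
    """Inverse of pack_trits; `count` drops the padding in the last byte."""
    trits = []
    for byte in data:
        for _ in range(5):
            trits.append(byte % 3 - 1)
            byte //= 3
    return trits[:count]


trits, scale = absmean_ternarize([0.9, -0.8, 0.05, 0.0, 1.2, -0.1, 0.4])
packed = pack_trits(trits)
assert unpack_trits(packed, len(trits)) == trits
```

Because packing and unpacking use only integer arithmetic, a Python and a C implementation of this step can agree byte for byte, which is the property a parity test would rely on.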

Code: https://github.com/TilelliLab/atome-lm
Project home / live in-browser demo: https://atomelm.com
License: Apache-2.0 (code, weights, everything)

⚠️ This is a research artifact, not a product or a general chatbot. Read the "Honest results" section below before citing any number. The honesty dossier lives in HONEST_RESULTS.md in the source repo.

Files in this repo

File	What it is
`atome_944k.bin` (272 KB)	Packed `ATOME01` C-engine blob, ternary, loadable directly by the Atome C99 engine
`atome_1m_v1.pt` (3.7 MB)	PyTorch source checkpoint (944,640 params) that produced the blob; use to fine-tune or re-export
`vanilla_1m_v1.pt` (3.7 MB)	FP32 vanilla-GPT baseline (950,608 params) — shipped so you can reproduce the 944K reversal A/B
`*.train.json`	Every-1000-step training logs for both checkpoints (every reported number is auditable)
`config.json`	Architecture hyperparameters + provenance for all three checkpoints
`SHA256SUMS`	Checksums for the three weight files
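
Before loading any weight file it is worth checking it against `SHA256SUMS`. The sketch below assumes the file uses the common `sha256sum` text format (one `<hex digest>  <filename>` pair per line); that format is an assumption, and `verify_sums` is a hypothetical helper name, not something shipped in the repo.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sums(sums_file):
    """Return {filename: True/False} for each entry listed in a sha256sum-style file."""
    base = Path(sums_file).parent
    results = {}
    for line in Path(sums_file).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum prefixes '*' in binary mode
        results[name] = sha256_of(base / name) == expected.lower()
    return results

# Usage (run from the directory holding the downloaded files):
#   verify_sums("SHA256SUMS")
```

Any entry that maps to `False` means the local file differs from the published one and should be downloaded again before use.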

Honest results — read this before citing anything

All numbers are single-seed, from the training logs shipped alongside.

Regime	Atome ternary	Vanilla FP32 (param-fair)	Verdict
60K (MCU target)	6.31 ppl	8.12 ppl	Atome wins −22% ppl (−52% at flash-fair budget)
944K (these checkpoints)	val 1.0545 / 2.87 ppl	val 0.9337 / 2.54 ppl	Vanilla wins by ~11%

At 944K parameters the result reverses: the FP32 vanilla baseline beats Atome by ~11% in both val loss and perplexity, with the same recipe, the same val slice, and the same seed. Atome's bet is the sub-1M, MCU-class regime: the 3-pathway inductive bias substitutes for capacity at small scale and constrains it above ~1M. This is the most important honest finding in the kit: the takeaway is not "tiny ternary beats everything."
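
The table's figures can be cross-checked with a few lines of arithmetic. The sketch assumes the reported val loss is a mean cross-entropy in nats, so that perplexity is `exp(loss)`; this is inferred from the loss/ppl pairs in the table rather than stated by the training logs. The gap is measured relative to the worse model in each row.

```python
import math


def perplexity(val_loss_nats):
    """Perplexity from a mean cross-entropy loss in nats (assumed unit)."""
    return math.exp(val_loss_nats)


def relative_gap(worse, better):
    """Fraction by which `better` undercuts `worse` (lower is better)."""
    return (worse - better) / worse


# 944K regime: vanilla FP32 is the better model.
atome_loss, vanilla_loss = 1.0545, 0.9337
print(round(perplexity(atome_loss), 2))    # 2.87
print(round(perplexity(vanilla_loss), 2))  # 2.54
print(f"loss gap: {relative_gap(atome_loss, vanilla_loss):.1%}")
print(f"ppl gap:  {relative_gap(perplexity(atome_loss), perplexity(vanilla_loss)):.1%}")

# 60K regime: Atome is the better model.
print(f"60K ppl gap: {relative_gap(8.12, 6.31):.1%}")
```

Both 944K gaps come out at roughly 11%, and the 60K gap at roughly 22%, matching the verdict column.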

The bundled 944K checkpoint is here to make the architecture runnable, not to set a quality bar. It is narrow, single-corpus (TinyStories), and sometimes incoherent.

What is NOT measured / NOT claimed

Single seed only. No multi-seed variance yet.
MCU parity is checked under QEMU (ARM Cortex-M3, MPS2-AN385) to within FP32 epsilon. There is also a reproducible real-silicon demo: the 944K checkpoint runs on a physical ESP32-WROOM-32, fully offline, at ~1 tok/s (see hardware/esp32-wroom32 for a prebuilt binary, serial log, and one-command flash). That demo is a bare proof of execution, not a benchmark win and not productized bring-up; on a no-PSRAM ESP32 the 944K checkpoint fits only with a short (seq=24) context window. The MCU claim stays regime-dependent (it holds at the ~60K engine-default config, not at 944K), and no same-chip head-to-head against another MCU LM has been run.
Router entropy is exposed for free as a per-token uncertainty signal, but its calibration is unmeasured at this scale.
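
To make the router-entropy signal concrete, here is a minimal sketch of one plausible definition: the Shannon entropy of the softmax over a token's pathway logits. It assumes the router emits one logit per pathway (three here); how `AtomeLM` actually computes its entropies is not specified in this card, so treat `router_entropy` as a hypothetical stand-in.

```python
import math


def router_entropy(logits, normalize=False):
    """Shannon entropy (nats) of softmax(logits) for one token's routing decision.

    With normalize=True the result is divided by log(n), giving a value in [0, 1].
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    if normalize:
        entropy /= math.log(len(logits))
    return entropy


uniform = router_entropy([0.0, 0.0, 0.0])     # router undecided: log(3) nats
confident = router_entropy([50.0, 0.0, 0.0])  # one pathway dominates: near 0
assert confident < uniform
```

High entropy means the router spreads weight across pathways, low entropy means it commits to one. Whether that tracks prediction error is exactly the calibration question that remains unmeasured.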

Usage

This is a custom architecture, not a transformers AutoModel. Get the code from the source repo, then load the PyTorch checkpoint:

git clone https://github.com/TilelliLab/atome-lm
cd atome-lm && pip install -e .      # Python >=3.10, PyTorch >=2.0

import torch
from atome_llm.core.atome_lm import AtomeLM

ckpt = torch.load("atome_1m_v1.pt", map_location="cpu", weights_only=False)
model = AtomeLM(**ckpt["config"])    # vocab=256, d_model=256, n_layers=8, d_head=64, top_k=4
model.load_state_dict(ckpt["state_dict"])
model.eval()

ids = torch.randint(0, 256, (1, 32))          # byte-level: ids are raw bytes 0-255
logits = model(ids)                            # (1, 32, 256)
ent_per_layer = model.router_entropies(ids)    # free per-token uncertainty signal
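
Since the ids are raw bytes, text goes in and out through UTF-8 with no tokenizer files. The sketch below shows that round trip plus a greedy decoding loop. It is written against a generic `next_byte_logits` callable so that it runs without the checkpoint; the helper names and the `max_context=32` default are illustrative assumptions, not part of the Atome API.

```python
def encode(text):
    """Text to byte ids (0-255) via UTF-8."""
    return list(text.encode("utf-8"))


def decode(ids):
    """Byte ids back to text; invalid UTF-8 from the model is replaced, not raised."""
    return bytes(ids).decode("utf-8", errors="replace")


def greedy_generate(next_byte_logits, prompt, max_new_bytes=16, max_context=32):
    """Append the argmax byte repeatedly.

    next_byte_logits: callable taking a list of byte ids and returning
    256 scores for the next byte (hypothetical interface).
    """
    ids = encode(prompt)
    for _ in range(max_new_bytes):
        logits = next_byte_logits(ids[-max_context:])
        ids.append(max(range(256), key=logits.__getitem__))
    return decode(ids)


# With the model loaded as above, an adapter could look like this
# (assumes model(ids) returns logits of shape (1, T, 256), as shown earlier):
#   with torch.no_grad():
#       text = greedy_generate(
#           lambda ctx: model(torch.tensor([ctx]))[0, -1].tolist(),
#           "Once upon a time",
#       )
```

Greedy decoding is the simplest choice and tends to loop on small models; sampling with a temperature is a common alternative.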

For microcontroller deployment, load atome_944k.bin directly with the Atome C99 engine (atome_load(...)) shipped in the source repo's c_engine/.

Citation

@software{atome_llm_2026,
  title  = {Atome LM: a tiny ternary language model for microcontroller deployment},
  author = {Atome LM contributors},
  year   = {2026},
  note   = {Apache 2.0, https://atomelm.com},
  url    = {https://github.com/TilelliLab/atome-lm}
}
