---
license: apache-2.0
library_name: pytorch
pipeline_tag: text-generation
tags:
  - ternary
  - bitnet
  - microcontroller
  - edge-ai
  - tinyml
  - byte-level
  - language-model
  - routed-architecture
---

# Atome LM

A reference implementation of a **routed-ternary tiny language model** with a bit-exact
Python ↔ C99 inference engine, sized for **microcontroller-class RAM budgets**.

The contribution is **integration, not a new architecture**: a complete
train → ternary export → base-3 packing → C99 inference path, with bit-exact Python ↔ C
parity enforced by tests. It combines three known ideas — ternary weights
([BitNet b1.58](https://arxiv.org/abs/2402.17764)), a per-token-routed 3-pathway block
([Hymba](https://arxiv.org/abs/2411.13676), [MossNet](https://arxiv.org/abs/2510.26182)),
and a byte tokenizer at super-tiny scale ([Guertler 2024](https://arxiv.org/abs/2405.14159)).
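
To make the base-3 packing step concrete, here is a minimal sketch. It assumes five ternary weights per byte (3^5 = 243 fits in 8 bits) with the first weight as the least significant trit. That layout and the names `pack_trits` / `unpack_trits` are illustrative assumptions, not the documented `ATOME01` blob format.

```python
# Sketch only: 5 trits per byte is an assumed layout, not the documented ATOME01 format.
TRITS_PER_BYTE = 5  # 3**5 = 243 <= 256, so five ternary digits fit in one byte


def pack_trits(weights):
    """Pack a list of ternary weights (-1, 0, +1) into bytes, 5 per byte."""
    out = bytearray()
    for i in range(0, len(weights), TRITS_PER_BYTE):
        chunk = weights[i:i + TRITS_PER_BYTE]
        value = 0
        # Little-endian in base 3: the first weight is the least significant trit.
        for pos, w in enumerate(chunk):
            if w not in (-1, 0, 1):
                raise ValueError(f"not a ternary weight: {w!r}")
            value += (w + 1) * 3 ** pos  # map {-1, 0, 1} -> {0, 1, 2}
        out.append(value)
    return bytes(out)


def unpack_trits(blob, count):
    """Inverse of pack_trits; `count` is the number of weights to recover."""
    weights = []
    for byte in blob:
        for _ in range(TRITS_PER_BYTE):
            weights.append(byte % 3 - 1)  # map {0, 1, 2} -> {-1, 0, 1}
            byte //= 3
    return weights[:count]  # drop padding trits from the final byte


weights = [-1, 0, 1, 1, -1, 0, 0, 1]
packed = pack_trits(weights)
assert unpack_trits(packed, len(weights)) == weights
print(len(weights), "trits ->", len(packed), "bytes")
```

A round trip like this is the kind of property a Python ↔ C parity test can pin down, since both sides must agree on digit order and padding.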

- **Code:** https://github.com/TilelliLab/atome-lm
- **Project home / live in-browser demo:** https://atomelm.com
- **License:** Apache-2.0 (code, weights, everything)

> ⚠️ This is a **research artifact, not a product or a general chatbot.** Read the
> "Honest results" section below before citing any number. The honesty dossier lives in
> [`HONEST_RESULTS.md`](https://github.com/TilelliLab/atome-lm/blob/main/HONEST_RESULTS.md)
> in the source repo.

## Files in this repo

| File | What it is |
|---|---|
| `atome_944k.bin` (272 KB) | Packed `ATOME01` C-engine blob, ternary, loadable directly by the Atome C99 engine |
| `atome_1m_v1.pt` (3.7 MB) | PyTorch source checkpoint (944,640 params) that produced the blob; use to fine-tune or re-export |
| `vanilla_1m_v1.pt` (3.7 MB) | FP32 vanilla-GPT baseline (950,608 params) — shipped so you can reproduce the 944K reversal A/B |
| `*.train.json` | Every-1000-step training logs for both checkpoints (every reported number is auditable) |
| `config.json` | Architecture hyperparameters + provenance for all three weight files |
| `SHA256SUMS` | Checksums for the three weight files |
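
A small sketch for checking downloaded files against `SHA256SUMS`. It assumes the file uses the usual `sha256sum` layout (`<hex digest>  <filename>` per line) and sits next to the weight files. The helper names are hypothetical and not part of the Atome repo.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so it never has to fit in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sums(sums_file="SHA256SUMS"):
    """Return {filename: True/False} for every entry in a sha256sum-style file."""
    sums_path = Path(sums_file)
    results = {}
    for line in sums_path.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # sha256sum marks binary mode with '*'
        results[name] = sha256_of(sums_path.parent / name) == expected.lower()
    return results


# Only runs when a SHA256SUMS file is present in the working directory.
if Path("SHA256SUMS").exists():
    for name, ok in verify_sums().items():
        print("OK      " if ok else "MISMATCH", name)
```

On systems with GNU coreutils, `sha256sum -c SHA256SUMS` should do the same job, provided the file follows that layout.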

## Honest results — read this before citing anything

All numbers are **single-seed**, from the training logs shipped alongside.

| Regime | Atome ternary | Vanilla FP32 (param-fair) | Verdict |
|---|---|---|---|
| **60K (MCU target)** | 6.31 ppl | 8.12 ppl | **Atome wins −22% ppl** (−52% at flash-fair budget) |
| **944K (these checkpoints)** | val 1.0545 / 2.87 ppl | val 0.9337 / 2.54 ppl | **Vanilla wins by ~11%** |

**The 944K result reverses.** At 944K parameters the FP32 vanilla baseline *beats* Atome by
~11% in both val loss and perplexity, with the same recipe, val slice, and seed. Atome's bet is the
**sub-1M, MCU-class regime** (the ~60K row above): the 3-pathway inductive bias substitutes for
capacity at small scale and *constrains* it as the model approaches ~1M. This is the most
important honest finding in the kit: it is **not** "tiny ternary beats everything."
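
The table's figures can be cross-checked with a few lines of arithmetic. This sketch assumes the reported perplexity is `exp(val loss)` with the loss in nats, which matches the 944K row. It also assumes the "~11%" and "−22%" gaps are measured relative to the worse of the two numbers. The helper `relative_gap` is illustrative only.

```python
import math

# Validation losses from the 944K row of the table above.
atome_val, vanilla_val = 1.0545, 0.9337

# Assumption: perplexity = exp(loss), with loss in nats.
atome_ppl = math.exp(atome_val)      # about 2.87
vanilla_ppl = math.exp(vanilla_val)  # about 2.54


def relative_gap(worse, better):
    """Fraction by which `better` undercuts `worse` (denominator is the worse value)."""
    return (worse - better) / worse


print(f"944K ppl: Atome {atome_ppl:.2f}, vanilla {vanilla_ppl:.2f}")
print(f"944K gap in val loss: {relative_gap(atome_val, vanilla_val):.1%}")
print(f"944K gap in ppl:      {relative_gap(atome_ppl, vanilla_ppl):.1%}")
print(f"60K gap in ppl:       {relative_gap(8.12, 6.31):.1%}")
```

Using the better model as the denominator instead gives a slightly larger 944K gap (close to 13%), so the choice of baseline is worth stating when quoting the number.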

The bundled 944K checkpoint is here to make the architecture **runnable**, not to set a
quality bar. It is narrow, single-corpus (TinyStories), and sometimes incoherent.

### What is NOT measured / NOT claimed
- **Single seed only.** No multi-seed variance yet.
- **MCU parity is QEMU only** (ARM Cortex-M3, MPS2-AN385), to FP32 epsilon. **No silicon
  bring-up** is done in this repository. The RP2040 demo exceeds 264 KB SRAM at 944K — the
  MCU claim is regime-dependent (it holds at the ~60K engine-default config, not at 944K).
- **Router entropy** is exposed for free as a per-token uncertainty signal, but its
  **calibration is unmeasured at this scale**.
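
For readers unfamiliar with the router-entropy signal, the sketch below shows one plausible definition: the Shannon entropy of a softmax over per-pathway router logits. This card does not spell out the exact formula, so this definition, the three-logit input, and the function names are assumptions for illustration rather than the repo's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def router_entropy(logits):
    """Shannon entropy (nats) of the routing distribution for one token."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


confident = router_entropy([8.0, 0.0, 0.0])  # mass on one pathway: entropy near 0
undecided = router_entropy([0.0, 0.0, 0.0])  # uniform over 3 pathways: ln(3)

print(f"confident token: {confident:.4f} nats")
print(f"undecided token: {undecided:.4f} nats (max is ln 3 = {math.log(3):.4f})")
```

Under this definition, high entropy only says the router is split between pathways. Whether that tracks actual prediction error is the unmeasured calibration question noted above.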

## Usage

This is a **custom architecture**, not a `transformers` AutoModel. Get the code from the
source repo, then load the PyTorch checkpoint:

```bash
git clone https://github.com/TilelliLab/atome-lm
cd atome-lm && pip install -e .      # Python >=3.10, PyTorch >=2.0
```

```python
import torch
from atome_llm.core.atome_lm import AtomeLM

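# weights_only=False unpickles arbitrary Python objects, so only load files you
# trust (check them against SHA256SUMS first).
ckpt = torch.load("atome_1m_v1.pt", map_location="cpu", weights_only=False)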
model = AtomeLM(**ckpt["config"])    # vocab=256, d_model=256, n_layers=8, d_head=64, top_k=4
model.load_state_dict(ckpt["state_dict"])
model.eval()

ids = torch.randint(0, 256, (1, 32))          # byte-level: ids are raw bytes 0-255
logits = model(ids)                            # (1, 32, 256)
ent_per_layer = model.router_entropies(ids)    # free per-token uncertainty signal
```
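
Because ids are raw bytes, no tokenizer file is needed to feed real text. The sketch below assumes UTF-8 as the text encoding (the card only says ids are bytes 0-255). The helpers `text_to_ids`, `ids_to_text`, and `greedy_generate` are hypothetical and not part of the `atome_llm` package. The decoding loop takes any callable that returns 256 next-byte scores, so it runs without PyTorch.

```python
def text_to_ids(text):
    """UTF-8 encode text; each byte (0-255) is one token id."""
    return list(text.encode("utf-8"))


def ids_to_text(ids):
    """Decode byte ids back to text, replacing any invalid UTF-8 sequences."""
    return bytes(ids).decode("utf-8", errors="replace")


def greedy_generate(next_logits, prompt_ids, max_new_bytes):
    """Greedy decoding; `next_logits(ids)` must return 256 scores for the next byte."""
    ids = list(prompt_ids)
    for _ in range(max_new_bytes):
        scores = next_logits(ids)
        ids.append(max(range(256), key=scores.__getitem__))  # argmax over bytes
    return ids


# Stand-in scorer so the sketch runs on its own: always prefers the byte for "a".
def toy_scorer(ids):
    return [1.0 if b == ord("a") else 0.0 for b in range(256)]


prompt_ids = text_to_ids("Once upon ")
print(ids_to_text(greedy_generate(toy_scorer, prompt_ids, 3)))
```

With the checkpoint loaded as above, a scorer along the lines of `lambda ids: model(torch.tensor([ids]))[0, -1].tolist()` (inside `torch.no_grad()`) should slot in, assuming `model(ids)` returns logits of shape `(batch, length, 256)` as the comment in the loading example indicates.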

For microcontroller deployment, load `atome_944k.bin` directly with the Atome C99 engine
(`atome_load(...)`) shipped in the source repo's `c_engine/`.

## Citation

```bibtex
@software{atome_llm_2026,
  title  = {Atome LM: a tiny ternary language model for microcontroller deployment},
  author = {Atome LM contributors},
  year   = {2026},
  note   = {Apache-2.0, https://atomelm.com},
  url    = {https://github.com/TilelliLab/atome-lm}
}
```