---
title: Cartogemma
emoji: 🗺️
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
python_version: 3.11
pinned: false
license: apache-2.0
short_description: Mechanistic probe for Gemma-3 and other decoder LMs
---
# Cartogemma
A faithful Gradio port of `cartographer3.py` / `cartographer_tui.py`.
Default: **`google/gemma-3-270m-it`** (tiny — full head×layer scans are
near-instant). The architecture is **auto-discovered**, so the Model ID box
also accepts other decoder LMs: `google/gemma-3-1b-it`, multimodal
`google/gemma-3-4b-it`, the Gemma-4 family (`google/gemma-4-E2B-it`, …),
`Qwen/Qwen3-0.6B`, Llama, etc.
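
One way such auto-discovery could work is to probe a loaded model for a handful of well-known attribute paths. The sketch below is illustrative only: the candidate path lists and the `discover` helper are assumptions made for this example, not the logic used in `app.py`, and the stand-in objects merely mimic the attribute layout of common Hugging Face decoder models.

```python
from types import SimpleNamespace

# Hypothetical candidate paths; real checkpoints may need others.
LAYER_PATHS = (
    "model.layers",                 # Llama / Qwen / text-only Gemma style
    "model.language_model.layers",  # some multimodal wrappers
    "language_model.model.layers",  # other multimodal wrappers
    "transformer.h",                # GPT-2 style
)
NORM_PATHS = (
    "model.norm",
    "model.language_model.norm",
    "language_model.model.norm",
    "transformer.ln_f",
)


def resolve(root, dotted):
    """Follow a dotted attribute path; return None if any hop is missing."""
    obj = root
    for name in dotted.split("."):
        obj = getattr(obj, name, None)
        if obj is None:
            return None
    return obj


def first_match(root, paths):
    """Return (path, object) for the first path that resolves, else (None, None)."""
    for path in paths:
        found = resolve(root, path)
        if found is not None:
            return path, found
    return None, None


def discover(model):
    """Locate the decoder layer stack and the final norm on an unknown model."""
    layer_path, layers = first_match(model, LAYER_PATHS)
    norm_path, _ = first_match(model, NORM_PATHS)
    if layers is None or norm_path is None:
        raise ValueError("could not locate decoder layers or final norm")
    return {"layers": layer_path, "final_norm": norm_path, "num_layers": len(layers)}


# Stand-ins that only mimic attribute layout (no real weights involved).
llama_like = SimpleNamespace(
    model=SimpleNamespace(layers=[object()] * 4, norm=object()),
    lm_head=object(),
)
gpt2_like = SimpleNamespace(
    transformer=SimpleNamespace(h=[object()] * 2, ln_f=object()),
)

print(discover(llama_like))
print(discover(gpt2_like))
```
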
Four panes:
- **Context** — running token tail.
- **Head Map** — for each layer: per-head pre-projection · logit-lens xray ·
per-head Δ-residual · full-layer Δ-residual (`L_full`).
- **Branches** — top-k next-token continuations (deterministic rollouts).
- **Token Rank Trace** — rank of a chosen token across (head × layer).
REPL-style command bar: `1-N`, `i`, `h * | h L | h L H`, `top`, `rew`, `spark`,
`w`, `l`, `mute`/`unmute`, `muted`, `r`, `s`.
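
As a rough illustration of what the logit-lens and Token Rank Trace views compute, the sketch below decodes intermediate residual vectors through a final norm and an unembedding matrix, then reports the rank of one token at each depth. It is a toy in plain Python with made-up numbers; the function names are hypothetical, and a plain RMSNorm is assumed (real architectures can differ in normalization details).

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Plain RMSNorm: rescale x to unit root-mean-square, then apply a gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms * w for v, w in zip(x, weight)]


def logit_lens(hidden, norm_weight, unembed):
    """Decode an intermediate residual vector as if it were the final one.

    `unembed` holds one row per vocabulary token (vocab x d_model).
    """
    normed = rms_norm(hidden, norm_weight)
    return [sum(h * w for h, w in zip(normed, row)) for row in unembed]


def token_rank(logits, token_id):
    """0-based rank of token_id: the number of tokens scoring strictly higher."""
    target = logits[token_id]
    return sum(1 for v in logits if v > target)


# Toy setup: d_model = 2, vocab = 3. All numbers are invented.
unembed = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
norm_weight = [1.0, 1.0]
hidden_by_layer = [[-0.5, 2.0], [0.5, 1.0], [3.0, 0.2]]

# Rank of token 0 at each depth; it climbs to the top by the last layer.
trace = [token_rank(logit_lens(h, norm_weight, unembed), 0) for h in hidden_by_layer]
print(trace)  # [2, 1, 0]
```
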
## Two tiers of capability
- **Tier 1 (any HF decoder LM):** logit-lens, per-layer Δ-residual, branches,
rank-pick, inject, rewind. Needs only `output_hidden_states` + a final norm +
an lm_head (or tied embeddings).
- **Tier 2 (standard MHA/GQA attention):** per-head pre-projection, per-head
Δ-residual, head muting, token rank trace. Needs an attention `o_proj` whose
input decomposes as `num_heads × head_dim`. When a model doesn't satisfy this
(fused QKV, MLA, exotic attention), the UI **degrades to Tier 1** rather than
crashing.
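
The Tier 2 requirement can be made concrete with a small sketch. Because the output projection is linear, its result equals the sum of one contribution per head, where head `h` owns columns `h*head_dim` to `(h+1)*head_dim` of the weight matrix. The helper names below are hypothetical, the numbers are invented, and a bias-free `o_proj` is assumed; returning `None` stands in for falling back to Tier 1.

```python
def matvec(W, x):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def supports_per_head(o_proj_in_features, num_heads, head_dim):
    """Tier 2 check: does the o_proj input split evenly into heads?"""
    return o_proj_in_features == num_heads * head_dim


def per_head_contributions(W_o, attn_out, num_heads, head_dim):
    """Split o_proj(attn_out) into one d_model-sized vector per head.

    Returns None when the shapes do not decompose (caller uses Tier 1).
    """
    if not supports_per_head(len(W_o[0]), num_heads, head_dim):
        return None
    contribs = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        W_h = [row[lo:hi] for row in W_o]  # columns owned by head h
        contribs.append(matvec(W_h, attn_out[lo:hi]))
    return contribs


# Toy shapes: d_model = 2, num_heads = 2, head_dim = 2, so W_o is 2 x 4.
W_o = [[1.0, 0.0, 2.0, 0.0],
       [0.0, 1.0, 0.0, -1.0]]
attn_out = [1.0, 2.0, 3.0, 4.0]  # concatenated per-head outputs

full = matvec(W_o, attn_out)
contribs = per_head_contributions(W_o, attn_out, num_heads=2, head_dim=2)
summed = [sum(c[i] for c in contribs) for i in range(len(full))]

# Muting a head amounts to dropping its contribution from the sum.
muted_head_1 = [f - c for f, c in zip(full, contribs[1])]

print(full, contribs, summed, muted_head_1)
```
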
## Setup
Most Gemma checkpoints are gated. Set a Space secret named `HF_TOKEN` to an
access token from an account that has accepted the relevant model licenses.
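
Space secrets are exposed to the app as environment variables, so the token can be read at startup and passed to the loaders. The following is a minimal sketch, not the code in `app.py`: `resolve_hf_token` and `load_model` are hypothetical helpers, and a text-only causal LM is assumed.

```python
import os


def resolve_hf_token(env=None):
    """Return the HF_TOKEN value, or None when it is unset or blank."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    return token or None


def load_model(model_id, token=None):
    """Load tokenizer and model, forwarding the token for gated checkpoints."""
    # Imported lazily so resolve_hf_token works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, token=token)
    model = AutoModelForCausalLM.from_pretrained(model_id, token=token)
    return tokenizer, model


# Typical use (downloads weights, so it is left commented out here):
# tokenizer, model = load_model("google/gemma-3-270m-it", resolve_hf_token())
```
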
Tiny models (270m / E2B) run comfortably on CPU. ZeroGPU or a dedicated GPU is
much snappier, and in practice is required for 4B+ models and the Gemma-4 family.