---
title: Cartogemma
emoji: 🗺️
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
python_version: 3.11
pinned: false
license: apache-2.0
short_description: Mechanistic probe on Gemma-3-1B-IT
---
# Cartogemma

A faithful Gradio port of `cartographer3.py` / `cartographer_tui.py`.

Default: **`google/gemma-3-270m-it`** (tiny: full head×layer scans are
near-instant). The architecture is **auto-discovered**, so the Model ID box
also accepts other decoder LMs: `google/gemma-3-1b-it`, multimodal
`google/gemma-3-4b-it`, the Gemma-4 family (`google/gemma-4-E2B-it`, …),
`Qwen/Qwen3-0.6B`, Llama, etc.

Four panes:

- **Context**: running token tail.
- **Head Map**: for each layer, per-head pre-projection · logit-lens xray ·
  per-head Δ-residual · full-layer Δ-residual (`L_full`).
- **Branches**: top-k next-token continuations (deterministic rollouts).
- **Token Rank Trace**: rank of a chosen token across (head × layer).
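
The logit-lens and rank-trace ideas behind the panes above can be illustrated with a minimal, dependency-free sketch. It is not taken from `cartographer3.py`: the helper names (`rms_norm`, `logit_lens`, `token_rank`) and the toy weights are assumptions made for illustration, and a plain RMSNorm stands in for whatever final norm a given checkpoint uses.

```python
import math

def rms_norm(hidden, weight, eps=1e-6):
    """Scale a hidden vector to unit RMS, then apply a per-dimension gain."""
    rms = math.sqrt(sum(x * x for x in hidden) / len(hidden) + eps)
    return [w * x / rms for x, w in zip(hidden, weight)]

def logit_lens(hidden, norm_weight, unembed):
    """Project an intermediate hidden state to vocabulary logits.

    unembed holds one row per vocabulary token, each of length len(hidden).
    """
    normed = rms_norm(hidden, norm_weight)
    return [sum(w * x for w, x in zip(row, normed)) for row in unembed]

def token_rank(logits, token_id):
    """1-based rank of token_id; 1 means it is the top prediction."""
    target = logits[token_id]
    return 1 + sum(1 for value in logits if value > target)

# Toy setup (all values made up): 3-token vocabulary, hidden size 2.
unembed = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
norm_weight = [1.0, 1.0]
layer_states = [[0.1, 0.9], [0.6, 0.5], [2.0, 0.1]]  # one vector per layer

# Rank of token 0 when each layer's state is read through the lens.
rank_trace = [
    token_rank(logit_lens(h, norm_weight, unembed), 0) for h in layer_states
]
print(rank_trace)  # [2, 1, 1]
```

In this toy trace the chosen token starts at rank 2 and reaches rank 1 from the second layer on, which is the kind of trajectory a rank trace is meant to expose.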

REPL-style command bar: `1-N`, `i`, `h * | h L | h L H`, `top`, `rew`, `spark`,
`w`, `l`, `mute`/`unmute`, `muted`, `r`, `s`.

## Two tiers of capability

- **Tier 1 (any HF decoder LM):** logit-lens, per-layer Δ-residual, branches,
  rank-pick, inject, rewind. Needs only `output_hidden_states` + a final norm +
  an `lm_head` (or tied embeddings).
- **Tier 2 (standard MHA/GQA attention):** per-head pre-projection, per-head
  Δ-residual, head muting, token rank trace. Needs an attention `o_proj` whose
  input decomposes as `num_heads × head_dim`. When a model doesn't satisfy this
  (fused QKV, MLA, exotic attention), the UI **degrades to Tier 1** rather than
  crashing.
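
To make the Tier 2 requirement concrete, here is a hedged sketch of why an `o_proj` input of size `num_heads × head_dim` allows per-head attribution: the projection of the concatenated head outputs equals the sum of each head's slice projected on its own. The function names are hypothetical, any `o_proj` bias is ignored, and modelling a muted head as a dropped term is an assumption about the approach rather than a description of the app's code.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def supports_per_head(o_proj_in_features, num_heads, head_dim):
    """Tier 2 check: does the o_proj input split cleanly into heads?"""
    return o_proj_in_features == num_heads * head_dim

def per_head_contributions(o_proj_weight, attn_out, num_heads, head_dim):
    """Split o_proj(attn_out) into one residual-stream vector per head.

    o_proj_weight: hidden_size rows, num_heads * head_dim columns.
    attn_out: concatenated head outputs, length num_heads * head_dim.
    """
    contributions = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        head_columns = [row[lo:hi] for row in o_proj_weight]
        contributions.append(matvec(head_columns, attn_out[lo:hi]))
    return contributions

def combine(contributions, muted=()):
    """Sum the head contributions, skipping any muted head indices."""
    size = len(contributions[0])
    return [
        sum(c[i] for h, c in enumerate(contributions) if h not in muted)
        for i in range(size)
    ]

# Toy setup (made-up numbers): 2 heads, head_dim 2, hidden size 3.
num_heads, head_dim = 2, 2
o_proj_weight = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
]
attn_out = [1.0, 2.0, 3.0, 4.0]

assert supports_per_head(len(o_proj_weight[0]), num_heads, head_dim)
heads = per_head_contributions(o_proj_weight, attn_out, num_heads, head_dim)
full = matvec(o_proj_weight, attn_out)

print(heads)                      # [[1.0, 2.0, 3.0], [6.0, 8.0, 7.0]]
print(combine(heads) == full)     # True
print(combine(heads, muted={1}))  # [1.0, 2.0, 3.0]
```

When the check fails, for example because the layout does not factor into `num_heads × head_dim`, no such per-head slicing is available, which is the case in which falling back to Tier 1 makes sense.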

## Setup

Most Gemma checkpoints are gated. Set a Space secret named `HF_TOKEN` to an
access token from an account that has accepted the relevant model licenses.
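
As a rough illustration of how an app might pick up that secret (Space secrets are exposed to the app as environment variables), the sketch below reads `HF_TOKEN` and treats a blank value as missing. The helper name `resolve_hf_token` is hypothetical and not part of this project.

```python
import os

def resolve_hf_token(env=None):
    """Return the HF_TOKEN value, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    return token or None

token = resolve_hf_token()
if token is None:
    print("HF_TOKEN is not set; gated checkpoints will fail to download.")

# With transformers installed, the token can then be passed to the loader,
# for example:
#   AutoModelForCausalLM.from_pretrained("google/gemma-3-270m-it", token=token)
```

Passing the mapping in as a parameter keeps the helper easy to exercise without touching the real environment.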

Tiny models (270m / E2B) are comfortable on CPU. A GPU (ZeroGPU or otherwise)
is much snappier, and in practice is required for 4B+ models and the larger
Gemma-4 checkpoints.