# Puck backlog

Ideas not yet built. Each entry: what, why, and where it slots into the architecture (engine = pure policy/data, ui = presentation, modes/sim = orchestration).

## Gestures — sprite dances/reactions keyed to event type

**What:** Puck performs a short gesture matched to *what* he's reporting, not just flying + speaking. A build passing → a little hop/spin of pride. Tests failing or a permission block → an urgent shake/alarm wiggle. Discord noise he ignored → a dismissive shrug. Claude finished → a satisfied stretch. Stale tab → a curious head-tilt. Sleep → a yawn. The gesture plays during the flight-to + bubble beat.

**Why:** On a busy real desktop the sprite is glanceable; a gesture lets you read the *kind* of news pre-attentively, before the words. It's the single highest charm-per-line addition left, and it makes the overlay feel like a creature rather than a notifier.

**Where it slots in:**
- `engine/` — a pure `gestureFor(event | decision)` map: `EventDef.id`/`source` → gesture name. Data, not logic; pin a couple of mappings with a test. Could also key off the existing tier/mischief so high-mischief gestures are more theatrical.
- `ui/Sprite.tsx` — a `gesture?: GestureName` prop adding a CSS class (`gesture-hop`, `gesture-shake`, `gesture-shrug`, …); animations on `.puck-bob`/`.puck-face` only (compositor-friendly transforms, like the existing flap/bob). Clear the class on animationend so it re-triggers.
- `modes/sim/SimApp.tsx` — set the gesture alongside the existing `flyTo`/`setBubble` in `fireEvent`; clears when the surface resolves.

**Notes:** keep gestures to transform/opacity so they stay GPU-composited (the whole sprite layer is). Reuse the per-form structure in `SpriteBody` — gestures should read on all four forms (mossling/wisp/gremlin/moth), so animate the wrapper, not form-specific parts. Pairs naturally with the existing mood tint + alert ring.

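The `gestureFor` map described above could be as small as the sketch below. The event ids and the `GestureName` members are assumptions made for illustration (the real `EventDef.id` values may differ), and `gestureClass` is a hypothetical helper showing how the result might turn into the `gesture-*` CSS class.

```typescript
// Hypothetical gesture vocabulary, one member per reaction named in the entry.
type GestureName = "hop" | "shake" | "shrug" | "stretch" | "tilt" | "yawn";

// Assumed event ids, NOT taken from the real engine. Data, not logic.
const GESTURES: Record<string, GestureName> = {
  "build-passed": "hop",
  "tests-failed": "shake",
  "permission-blocked": "shake",
  "discord-ignored": "shrug",
  "claude-finished": "stretch",
  "stale-tab": "tilt",
  "sleep": "yawn",
};

// Pure lookup. Unknown events get no gesture, so the sprite just flies and speaks.
function gestureFor(eventId: string): GestureName | undefined {
  return Object.prototype.hasOwnProperty.call(GESTURES, eventId)
    ? GESTURES[eventId]
    : undefined;
}

// Hypothetical helper: the CSS class the sprite wrapper would receive.
function gestureClass(gesture: GestureName | undefined): string {
  return gesture ? `gesture-${gesture}` : "";
}
```

Keeping the table as plain data means the "pin a couple of mappings with a test" step is a handful of equality checks.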
**Color pulse (same feature, cheapest channel):** transient-tint the alert ring + glow to a per-event-type hue while a surface is pending — red/urgent for failures & permission blocks, gold/success for completions, grey/dim for ignored noise. The ring and `.puck-glow` already read CSS vars (`--accent`, `--glow`), so this is a single `--alert-hue` override set from the same `engine` map that picks the gesture — one `{ gesture, hue }` lookup feeds both. Even shipping the color pulse *alone* (before gesture animation work) is a real readability win. Don't fight the existing mood tint (`--puck-body` etc.): pulse the ring/glow, leave the body color to mood, so "what kind of news" (hue) and "how Puck feels lately" (mood) stay separate signals.

## Ambient quips — Puck reacts to what you're doing

**What:** Occasionally Puck mutters a quip or non-sequitur riffing on your current context — the active app, a window title, the shape of what you're typing. Not help, not a summary: flavor. "Still in the auth thicket, I see." / "That's a lot of tabs for one small human." / a non-sequitur about the dock breathing. Low frequency, skippable, never twice about the same thing.

**Why:** This is the difference between a notifier that lives in the corner and a familiar that *shares the room*. It's the most "alive" feature on the list — and the most dangerous, so it ships last and most carefully.

**Privacy is the feature, not a caveat** (design doc §17.3–17.4 are the law here):
- **Local inference ONLY — hard gate.** Context never leaves the machine. If the brain is the cloud path (Modal/ZeroGPU), this feature is *disabled*, full stop, not degraded. Enforce in code: the quip path checks the resolved brain is localhost and refuses otherwise — a loud guard, not a setting.
- **Opt-in, off by default.** A distinct toggle from notifications; the permission copy says plainly what's read and that it stays local.
- **Ephemeral, never persisted, never a trace, never training data.** Context for a quip is read, used for one generation, dropped. It must not touch the memory garden, the trace export, or localStorage.
- **Redact before the model sees it.** Strip obvious secrets (password fields, token-shaped strings, anything in a field marked sensitive). Prefer *abstractions* over content — "a long terminal command", "a code file", "a messaging app" — over the literal text. Quoting back verbatim is the creepy line; stay on the abstract side of it.
- **Never about people or private content.** App categories and your own activity, yes; the contents of a DM, an email body, a name on screen — no.

**Where it slots in:**
- The desktop watcher (future, §9.2: NSWorkspace frontmost app, optional AX/screen) is the context source — same daemon `/events` path, a new low-priority `context` event kind, or a separate local-only endpoint that never queues.
- `engine` decides *whether* to quip (rare; respects annoyance budget / presence, reuses the interruption-taste machinery so an annoying quip trains Puck quieter).
- The quip generation uses the local brain with a tight "one playful aside, ≤12 words, never quote the user" prompt; surfaces through the existing bubble channel.

**Ship order:** after the desktop watcher exists and after a privacy pass. Until then it's sim-only flavor at most (riffing on the fake desktop, where there's nothing real to leak).

## Phototaxis — a fairy drawn to stimuli (whimsical wander)

**What:** Puck should behave like a moth/fairy — *pulled* toward activity rather than drifting at random. Flits toward motion, lingers near what's lively, chases the occasional shiny thing, then loses interest.

**Why:** The wander is the sprite's resting personality — it's on screen far more than any bubble. Uniform-random reads as a screensaver; attraction reads as *alive and curious*. Highest charm-per-effort of the ambient ideas.

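The contrast between uniform-random and attraction can be made concrete with a weighted pick among attractor points plus a little noise. The following is a minimal sketch of a pure engine function named `pickWanderTarget(attractors, rng)`. The `Attractor` shape, the `noise` parameter and its default of 80 px are assumptions for illustration, not existing engine types.

```typescript
interface Point { x: number; y: number }

// Assumed shape: a salient point and how strongly it pulls.
interface Attractor { at: Point; weight: number }

// Returns a float in [0, 1), e.g. Math.random, or a seeded generator in tests.
type Rng = () => number;

// Weighted-random pick among attractors, then jitter around the chosen point
// so the pull stays a gradient rather than a leash.
// `noise` is the maximum offset per axis in px (an assumed tuning knob).
function pickWanderTarget(attractors: Attractor[], rng: Rng, noise = 80): Point | null {
  const live = attractors.filter((a) => a.weight > 0);
  const total = live.reduce((sum, a) => sum + a.weight, 0);
  if (total === 0) return null; // nothing pulls: caller keeps its uniform-random wander

  // Roulette-wheel selection: walk the weights until the roll is used up.
  let roll = rng() * total;
  let chosen = live[live.length - 1];
  for (const a of live) {
    roll -= a.weight;
    if (roll < 0) { chosen = a; break; }
  }

  const jitter = () => (rng() * 2 - 1) * noise;
  return { x: chosen.at.x + jitter(), y: chosen.at.y + jitter() };
}
```

Passing the generator in keeps the function pure and testable. A trait such as `presence` could scale the weights before the call instead of living inside the picker.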
**Buildable now — zero new permissions (do this part first):** Replace the uniform-random wander target in `SimApp` with a weighted pull toward salient points we already have:
- the **cursor** (occasional gentle follow / curious approach, then retreat — never clingy; respects `presence`),
- the **last event location** (he lingers where something just happened),
- in the overlay, the **focused window** rect (future: NSWorkspace frontmost-app position via the daemon — he hangs near where you're working, patrols where you're not).

Keep it a *gradient*, not a leash: weighted-random pick among attractors + noise, so it stays unpredictable. Lives in the engine as a pure `pickWanderTarget(attractors, rng)`; the loop already exists. Tune against `presence` (low = aloof, high = follows the action).

**Sensor-gated — backlog, same privacy rules as ambient quips (local-only, ephemeral):**
- **Screen color / motion / "flashing lights":** needs ScreenCaptureKit — the transparent overlay can't see what's beneath it. A coarse, downsampled brightness/motion map (NOT readable content) could let him drift toward an area that just changed (a video started, a notification flashed). Local-only, never stored, abstractions not pixels.
- **Audio reactivity:** mic is a hard no by default; "system audio is playing / its level" is lighter but still opt-in + local-only. A bass-thump bob or a turn-toward-the-sound would be delightful but ships last, behind the same gate as quips.

**Smell test for all of it:** attraction must stay *cute*, never *surveillant*. He reacts to the shape of activity (something moved, something's loud), never to its content.

## Take-me-there — Puck knows where the activity is, and ferries you to it

**What:** On a real notification, Puck should point you at the *actual* window that needs you — fly to it if it's on this Space, beckon "follow me" if it's on another — and **clicking him navigates there** (focuses the app, macOS brings its Space forward).

**Current behavior (the gap):** wire events arrive with `target: null` (the sim's targets were fake windows), so in the overlay `fireEvent` flies Puck to a *random* screen point. He has no idea where the source app lives — we never gave him real-window awareness. This is the design doc's Phase 4 ("patrol the desktops you abandoned" presumes knowing where they are).

**Phase 1 — click-to-activate (high value, no Space geometry needed):**
- Event carries a **locator**: source app bundle id / pid / window title. The Claude hook already has `TERM_PROGRAM`, `cwd`, and the calling pid available; `puck-run` knows its terminal. Add an optional `locator` to the wire schema (`{bundleId?, pid?, title?}`).
- Rust command `activate_target(locator)` -> `NSRunningApplication.runningApplications(withBundleIdentifier:)`, then `activate(options:)` on the match (or AX focus by pid/title). macOS switches to that app's Space automatically.
- Frontend: clicking Puck while a located surface is pending calls it.

This alone delivers "Puck lit up -> click -> you're where the thing is," across Spaces, without knowing which.

**Phase 2 — same-Space vs other-Space (the fiddly bit):**
- Determine if the target window is on the *current* Space: `CGWindowListCopyWindowInfo` with `kCGWindowListOptionOnScreenOnly` lists on-screen windows; absent target -> elsewhere. Robust Space identity needs the semi-private CGSSpace APIs — fragile, optional.
- Same Space -> fly to the window's screen rect (window bounds from CGWindowList). Other Space -> a "come hither" beckon toward the screen edge (pairs with the gesture entry), and the click teleports.

**Notes:** locator is metadata, not content — bundle id + window title, never window *contents* (anti-creep). Activation is an explicit user click (navigation, not an autonomous action), so it stays inside the safety tiers. Depends on: gesture vocabulary (beckon) and the future native window watcher (NSWorkspace frontmost / CGWindowList).

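The Phase 2 decision (fly, beckon, or fall back to today's random point) can be kept as a pure function on the frontend side. The sketch below assumes the native layer hands over a list of on-screen windows. `OnScreenWindow`, `Approach`, and `planApproach` are hypothetical names, and only the `Locator` fields come from the proposed wire schema.

```typescript
// Optional locator from the proposed wire schema: metadata only, never contents.
interface Locator { bundleId?: string; pid?: number; title?: string }
interface Rect { x: number; y: number; width: number; height: number }

// Assumed shape of one entry reported by the native side from its
// on-screen window list (not an actual Tauri or CoreGraphics type).
interface OnScreenWindow { bundleId?: string; pid: number; title?: string; bounds: Rect }

type Approach =
  | { kind: "fly"; to: Rect } // target window is on the current Space
  | { kind: "beckon" }        // located, but not on screen: "follow me"
  | { kind: "wander" };       // no usable locator: today's random-point behavior

// A window matches when every field the locator provides agrees with it.
function matches(loc: Locator, w: OnScreenWindow): boolean {
  if (loc.pid !== undefined && loc.pid !== w.pid) return false;
  if (loc.bundleId !== undefined && loc.bundleId !== w.bundleId) return false;
  if (loc.title !== undefined && loc.title !== w.title) return false;
  return true;
}

function planApproach(loc: Locator | null, onScreen: OnScreenWindow[]): Approach {
  if (!loc || (loc.bundleId === undefined && loc.pid === undefined && loc.title === undefined)) {
    return { kind: "wander" };
  }
  const target: Locator = loc;
  const hit = onScreen.find((w) => matches(target, w));
  // Absent from the on-screen list is read as "on another Space".
  return hit ? { kind: "fly", to: hit.bounds } : { kind: "beckon" };
}
```

An empty locator is treated the same as `null`, so an event that carries no identifying field never matches an arbitrary window.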
## Camouflage — Puck adapts to what he's floating over

**What:** When Puck drifts over text or busy content, he reacts to his surroundings like a chameleon/glass-wisp — goes translucent, refracts, or (dream version) "mirrors" the texture behind him onto his own body. Blends, then pops back when he moves to empty space.

**Why:** Sells "he's really *in* your desktop, not pasted on top." A creature that responds to its background reads as inhabiting the space. Pairs with the wisp form (already glass-like).

**Cheap / free now (no sensing):**
- **Shy fade:** lower sprite opacity while stationary over the busy center of the screen, restore when wandering to the margins — pure CSS/opacity on the existing wander state. Reads as "blending in" without literally seeing anything.
- **Refraction (maybe free, needs a WKWebView test):** a `backdrop-filter` glass body on the sprite. In the transparent overlay this *might* sample the real desktop behind the window (same open question as the speech-bubble blur) — if it does, a refractive/distort body gives instant chameleon shimmer with zero screen-capture. Test in the Tauri overlay before committing to it.

**Dream version — literal mirror (screen-capture gated):**
- Sample the screen region directly under the sprite (ScreenCaptureKit), downsample, and paint it onto his body as living camouflage. Stunning, but it's the same hard gate as ambient quips / phototaxis sensors: local-only, ephemeral, opt-in, never stored, never leaves the machine. Texture/color only — never treated as readable content.

**Where it slots in:** `ui/Sprite.tsx` (a `camouflage` intensity prop on `.puck-body`), driven by `modes/sim` from sprite position vs. screen regions. The shy-fade is a 20-minute add; refraction is a test-then-maybe; the literal mirror waits for the screen watcher + privacy pass.

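For the shy fade, the opacity can be derived from position alone, with no sensing. A minimal sketch follows. The function name, the `minOpacity` default of 0.45 and the choice of a linear ramp from center to edge are assumptions, not settled design.

```typescript
interface Point { x: number; y: number }
interface Size { width: number; height: number }

// Returns sprite opacity in [minOpacity, 1].
// Fades only while stationary: strongest at the screen center,
// no fade at the margins, and fully opaque whenever he is moving.
function shyFadeOpacity(
  pos: Point,
  screen: Size,
  stationary: boolean,
  minOpacity = 0.45, // assumed floor so he never vanishes entirely
): number {
  if (!stationary) return 1;
  // Normalized distance from center per axis: 0 at center, 1 at the edge.
  const dx = Math.abs(pos.x - screen.width / 2) / (screen.width / 2);
  const dy = Math.abs(pos.y - screen.height / 2) / (screen.height / 2);
  const edge = Math.min(1, Math.max(dx, dy));
  return minOpacity + (1 - minOpacity) * edge;
}
```

The result could feed the `camouflage` intensity prop directly, or be written to a CSS variable that `.puck-body` reads, so the transition itself stays in CSS.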
## Real app icons — known apps, OS-extracted where possible

**What:** Replace the glyph characters (✳ ◍ ✉ …) in the sim windows/dock and feed source-markers with real app icons — Claude, ChatGPT, Chrome, Gmail, Mail, Discord, Terminal, etc. Looks dramatically more legit, especially in the overlay.

**Pulling the *actual* OS icon (the good version):**
- macOS: `NSWorkspace.shared.icon(forFile: "/Applications/Foo.app")` returns the real icon for any installed `.app` bundle — a Rust/Tauri command can extract → PNG → hand to the webview. So Chrome, Mail, Discord, Slack, the host terminal: real icons, free, always current.
- **The limit you called:** a terminal *binary* (claude, codex) has no bundle and no icon. Fall back to the **host terminal's** icon (iTerm/Terminal/Ghostty/Warp — which the Claude hook can report via `TERM_PROGRAM`), or a curated Puck-styled glyph.
- **Web apps with no native app** (ChatGPT, Gmail as a tab): no bundle to extract from — these need a small **curated bundled icon set** (a dozen SVGs/PNGs).

**So: hybrid.** OS extraction where a bundle exists (Rust command, overlay only), curated bundled set for web-apps + terminal-binary fallbacks (works everywhere incl. the Space/sim).

**Where it slots in:**
- `ui/Desktop.tsx` (`WIN_DEFS` icons, dock glyphs) and the feed `source` marker — currently single glyph chars; swap for an icon element that prefers OS-extracted, falls back to bundled, falls back to glyph.
- Overlay-only Rust command `app_icon(bundleId|path) -> png` for the extraction half.
- Engine `SOURCES`/`EventDef` already carry source identity; add an optional `bundleId` hint.

**Cheap first step (no native):** ship the curated bundled icon set + source→icon map; use glyph only as last resort. The OS-extraction half is an overlay enhancement on top.

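The three-step fallback (OS-extracted, then bundled, then glyph) fits in one small resolver. This is a sketch under assumptions: `IconRef`, `SourceInfo`, and `resolveIcon` are hypothetical names, and the two maps stand in for whatever cache the overlay command and the curated set end up providing.

```typescript
type IconRef =
  | { kind: "os"; pngDataUrl: string } // extracted by the overlay-only native command
  | { kind: "bundled"; path: string }  // from the curated icon set
  | { kind: "glyph"; char: string };   // last resort: today's glyph character

// Assumed minimal view of a source: its id, current glyph, optional bundle hint.
interface SourceInfo { id: string; glyph: string; bundleId?: string }

function resolveIcon(
  source: SourceInfo,
  osIcons: Map<string, string>, // bundleId -> PNG data URL (empty outside the overlay)
  bundled: Map<string, string>, // source id -> asset path in the curated set
): IconRef {
  const os = source.bundleId ? osIcons.get(source.bundleId) : undefined;
  if (os) return { kind: "os", pngDataUrl: os };

  const path = bundled.get(source.id);
  if (path) return { kind: "bundled", path };

  return { kind: "glyph", char: source.glyph };
}
```

In the sim, `osIcons` would simply stay empty, so the cheap first step (bundled set plus glyph fallback) works without any native code.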
## Local vision — the private, free path for continuous perception

**What:** Run Puck's eyes on-device instead of (or alongside) Modal, so real-screen perception is private and continuous vision costs ~nothing. Brain seam already supports it: point PUCK_VISION_URL at a local OpenAI-compatible server.

**Capability is NOT the blocker** (verified 2026-06-07): screen-reading is OCR + light "what's notable" reasoning — small VLMs excel at it. Options:
- **Holotron-12B local** via llama.cpp + mmproj — llama.cpp merged Nemotron-Nano-12B-v2-VL (PR #19547). `convert_hf_to_gguf.py --mmproj` → vision projector → `llama-server`. Same model as cloud, same capability, ~24GB on the 48GB Mac, free. NB: Ollama can't load the mmproj — must use raw llama-server. Keeps the Nemotron Quest tie.
- **Qwen2.5-VL-7B local** — ~6GB, 95.7 DocVQA, fast; ideal for *continuous ambient* (every 45s forever on the M4 Max). Loses the Nemotron tie, plenty for screen-reading.
- MiniCPM-V 2.6 (~5.5GB), Moondream2 (1.9B, CPU) as even-lighter fallbacks.

**Recommendation:** Holotron for the showcase + Quest (cloud now → local llama-server later as the private path); Qwen2.5-VL-7B-local as the cheap continuous engine. The visionMode "Continuous" tier (built, currently same cloud path) should switch to a local PUCK_VISION_URL when this lands.

**Where it slots in:** zero engine/frontend change — it's a runtime: spin up `llama-server --mmproj` (or vLLM/mlx when fixed) and set PUCK_VISION_URL. Plus the real-screen capture (ScreenCaptureKit, overlay) to feed it actual pixels instead of the sim snapshot.
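Since the private path here hinges on PUCK_VISION_URL pointing at the local machine, and the ambient-quips entry asks for a loud localhost guard on the resolved brain, a loopback check along these lines may be useful. It is a minimal sketch: `isLoopbackUrl` and `assertLocalBrain` are hypothetical names, and treating only `localhost`, `127.0.0.0/8` and `::1` as local is an assumption (a LAN address would be refused).

```typescript
// True only when the URL's host is a loopback name or address.
function isLoopbackUrl(raw: string): boolean {
  let host: string;
  try {
    host = new URL(raw).hostname.toLowerCase();
  } catch {
    return false; // unparseable URL: treat as not local
  }
  // IPv6 loopback; both spellings are accepted in case brackets are kept.
  if (host === "localhost" || host === "[::1]" || host === "::1") return true;
  // IPv4 loopback block 127.0.0.0/8.
  return /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host);
}

// Loud guard, not a setting: throws instead of silently degrading.
function assertLocalBrain(raw: string): void {
  if (!isLoopbackUrl(raw)) {
    throw new Error(`refusing: brain URL is not local (${raw})`);
  }
}
```

A caller would run `assertLocalBrain` once on the resolved URL before any context is read, so a cloud endpoint fails before anything leaves the machine.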