Spaces:
Running
Running
| # Puck backlog | |
| Ideas not yet built. Each entry: what, why, and where it slots into the architecture | |
| (engine = pure policy/data, ui = presentation, modes/sim = orchestration). | |
| ## Gestures β sprite dances/reactions keyed to event type | |
| **What:** Puck performs a short gesture matched to *what* he's reporting, not just | |
| flying + speaking. A build passing β a little hop/spin of pride. Tests failing or a | |
| permission block β an urgent shake/alarm wiggle. Discord noise he ignored β a dismissive | |
| shrug. Claude finished β a satisfied stretch. Stale tab β a curious head-tilt. Sleep β | |
| a yawn. The gesture plays during the flight-to + bubble beat. | |
| **Why:** On a busy real desktop the sprite is glanceable; a gesture lets you read the | |
| *kind* of news pre-attentively, before the words. It's the single highest charm-per-line | |
| addition left, and it makes the overlay feel like a creature rather than a notifier. | |
| **Where it slots in:** | |
| - `engine/` β a pure `gestureFor(event | decision)` map: `EventDef.id`/`source` β | |
| gesture name. Data, not logic; pin a couple of mappings with a test. Could also key | |
| off the existing tier/mischief so high-mischief gestures are more theatrical. | |
| - `ui/Sprite.tsx` β a `gesture?: GestureName` prop adding a CSS class (`gesture-hop`, | |
| `gesture-shake`, `gesture-shrug`, β¦); animations on `.puck-bob`/`.puck-face` only | |
| (compositor-friendly transforms, like the existing flap/bob). Clear the class on | |
| animationend so it re-triggers. | |
| - `modes/sim/SimApp.tsx` β set the gesture alongside the existing `flyTo`/`setBubble` | |
| in `fireEvent`; clears when the surface resolves. | |
| **Notes:** keep gestures to transform/opacity so they stay GPU-composited (the whole | |
| sprite layer is). Reuse the per-form structure in `SpriteBody` β gestures should read | |
| on all four forms (mossling/wisp/gremlin/moth), so animate the wrapper, not form-specific | |
| parts. Pairs naturally with the existing mood tint + alert ring. | |
| **Color pulse (same feature, cheapest channel):** transient-tint the alert ring + glow | |
| to a per-event-type hue while a surface is pending β red/urgent for failures & | |
| permission blocks, gold/success for completions, grey/dim for ignored noise. The ring | |
| and `.puck-glow` already read CSS vars (`--accent`, `--glow`), so this is a single | |
| `--alert-hue` override set from the same `engine` map that picks the gesture β one | |
| `{ gesture, hue }` lookup feeds both. Even shipping the color pulse *alone* (before | |
| gesture animation work) is a real readability win. Don't fight the existing mood tint | |
| (`--puck-body` etc.): pulse the ring/glow, leave the body color to mood, so "what kind | |
| of news" (hue) and "how Puck feels lately" (mood) stay separate signals. | |
| ## Ambient quips β Puck reacts to what you're doing | |
| **What:** Occasionally Puck mutters a quip or non-sequitur riffing on your current | |
| context β the active app, a window title, the shape of what you're typing. Not help, | |
| not a summary: flavor. "Still in the auth thicket, I see." / "That's a lot of tabs for | |
| one small human." / a non-sequitur about the dock breathing. Low frequency, skippable, | |
| never twice about the same thing. | |
| **Why:** This is the difference between a notifier that lives in the corner and a | |
| familiar that *shares the room*. It's the most "alive" feature on the list β and the | |
| most dangerous, so it ships last and most carefully. | |
| **Privacy is the feature, not a caveat** (design doc Β§17.3β17.4 are the law here): | |
| - **Local inference ONLY β hard gate.** Context never leaves the machine. If the brain | |
| is the cloud path (Modal/ZeroGPU), this feature is *disabled*, full stop, not degraded. | |
| Enforce in code: the quip path checks the resolved brain is localhost and refuses | |
| otherwise β a loud guard, not a setting. | |
| - **Opt-in, off by default.** A distinct toggle from notifications; the permission copy | |
| says plainly what's read and that it stays local. | |
| - **Ephemeral, never persisted, never a trace, never training data.** Context for a quip | |
| is read, used for one generation, dropped. It must not touch the memory garden, the | |
| trace export, or localStorage. | |
| - **Redact before the model sees it.** Strip obvious secrets (password fields, token-shaped | |
| strings, anything in a field marked sensitive). Prefer *abstractions* over content β | |
| "a long terminal command", "a code file", "a messaging app" β over the literal text. | |
| Quoting back verbatim is the creepy line; stay on the abstract side of it. | |
| - **Never about people or private content.** App categories and your own activity, yes; | |
| the contents of a DM, an email body, a name on screen β no. | |
| **Where it slots in:** | |
| - The desktop watcher (future, Β§9.2: NSWorkspace frontmost app, optional AX/screen) is the | |
| context source β same daemon `/events` path, a new low-priority `context` event kind, or | |
| a separate local-only endpoint that never queues. | |
| - `engine` decides *whether* to quip (rare; respects annoyance budget / presence, reuses | |
| the interruption-taste machinery so an annoying quip trains Puck quieter). | |
| - The quip generation uses the local brain with a tight "one playful aside, β€12 words, | |
| never quote the user" prompt; surfaces through the existing bubble channel. | |
| **Ship order:** after the desktop watcher exists and after a privacy pass. Until then it's | |
| sim-only flavor at most (riffing on the fake desktop, where there's nothing real to leak). | |
| ## Phototaxis β a fairy drawn to stimuli (whimsical wander) | |
| **What:** Puck should behave like a moth/fairy β *pulled* toward activity rather than | |
| drifting at random. Flits toward motion, lingers near what's lively, chases the | |
| occasional shiny thing, then loses interest. | |
| **Why:** The wander is the sprite's resting personality β it's on screen far more than | |
| any bubble. Uniform-random reads as a screensaver; attraction reads as *alive and | |
| curious*. Highest charm-per-effort of the ambient ideas. | |
| **Buildable now β zero new permissions (do this part first):** | |
| Replace the uniform-random wander target in `SimApp` with a weighted pull toward salient | |
| points we already have: | |
| - the **cursor** (occasional gentle follow / curious approach, then retreat β never | |
| clingy; respects `presence`), | |
| - the **last event location** (he lingers where something just happened), | |
| - in the overlay, the **focused window** rect (future: NSWorkspace frontmost-app | |
| position via the daemon β he hangs near where you're working, patrols where you're not). | |
| Keep it a *gradient*, not a leash: weighted-random pick among attractors + noise, so it | |
| stays unpredictable. Lives in the engine as a pure `pickWanderTarget(attractors, rng)`; | |
| the loop already exists. Tune against `presence` (low = aloof, high = follows the action). | |
| **Sensor-gated β backlog, same privacy rules as ambient quips (local-only, ephemeral):** | |
| - **Screen color / motion / "flashing lights":** needs ScreenCaptureKit β the transparent | |
| overlay can't see what's beneath it. A coarse, downsampled brightness/motion map (NOT | |
| readable content) could let him drift toward an area that just changed (a video started, | |
| a notification flashed). Local-only, never stored, abstractions not pixels. | |
| - **Audio reactivity:** mic is a hard no by default; "system audio is playing / its level" | |
| is lighter but still opt-in + local-only. A bass-thump bob or a turn-toward-the-sound | |
| would be delightful but ships last, behind the same gate as quips. | |
| **Smell test for all of it:** attraction must stay *cute*, never *surveillant*. He reacts | |
| to the shape of activity (something moved, something's loud), never to its content. | |
| ## Take-me-there β Puck knows where the activity is, and ferries you to it | |
| **What:** On a real notification, Puck should point you at the *actual* window that needs | |
| you β fly to it if it's on this Space, beckon "follow me" if it's on another β and | |
| **clicking him navigates there** (focuses the app, macOS brings its Space forward). | |
| **Current behavior (the gap):** wire events arrive with `target: null` (the sim's targets | |
| were fake windows), so in the overlay `fireEvent` flies Puck to a *random* screen point. | |
| He has no idea where the source app lives β we never gave him real-window awareness. | |
| This is the design doc's Phase 4 ("patrol the desktops you abandoned" presumes knowing | |
| where they are). | |
| **Phase 1 β click-to-activate (high value, no Space geometry needed):** | |
| - Event carries a **locator**: source app bundle id / pid / window title. The Claude hook | |
| already has `TERM_PROGRAM`, `cwd`, and the calling pid available; `puck-run` knows its | |
| terminal. Add an optional `locator` to the wire schema (`{bundleId?, pid?, title?}`). | |
| - Rust command `activate_target(locator)` -> `NSRunningApplication(bundleIdentifier:).activate` | |
| (or AX focus by pid/title). macOS switches to that app's Space automatically. | |
| - Frontend: clicking Puck while a located surface is pending calls it. This alone delivers | |
| "Puck lit up -> click -> you're where the thing is," across Spaces, without knowing which. | |
| **Phase 2 β same-Space vs other-Space (the fiddly bit):** | |
| - Determine if the target window is on the *current* Space: `CGWindowListCopyWindowInfo` | |
| with `kCGWindowListOptionOnScreenOnly` lists on-screen windows; absent target -> elsewhere. | |
| Robust Space identity needs the semi-private CGSSpace APIs β fragile, optional. | |
| - Same Space -> fly to the window's screen rect (window bounds from CGWindowList). | |
| Other Space -> a "come hither" beckon toward the screen edge (pairs with the gesture | |
| entry), and the click teleports. | |
| **Notes:** locator is metadata, not content β bundle id + window title, never window | |
| *contents* (anti-creep). Activation is an explicit user click (navigation, not an | |
| autonomous action), so it stays inside the safety tiers. Depends on: gesture vocabulary | |
| (beckon) and the future native window watcher (NSWorkspace frontmost / CGWindowList). | |
| ## Camouflage β Puck adapts to what he's floating over | |
| **What:** When Puck drifts over text or busy content, he reacts to his surroundings like | |
| a chameleon/glass-wisp β goes translucent, refracts, or (dream version) "mirrors" the | |
| texture behind him onto his own body. Blends, then pops back when he moves to empty space. | |
| **Why:** Sells "he's really *in* your desktop, not pasted on top." A creature that | |
| responds to its background reads as inhabiting the space. Pairs with the wisp form | |
| (already glass-like). | |
| **Cheap / free now (no sensing):** | |
| - **Shy fade:** lower sprite opacity while stationary over the busy center of the screen, | |
| restore when wandering to the margins β pure CSS/opacity on the existing wander state. | |
| Reads as "blending in" without literally seeing anything. | |
| - **Refraction (maybe free, needs a WKWebView test):** a `backdrop-filter` glass body on | |
| the sprite. In the transparent overlay this *might* sample the real desktop behind the | |
| window (same open question as the speech-bubble blur) β if it does, a refractive/distort | |
| body gives instant chameleon shimmer with zero screen-capture. Test in the Tauri overlay | |
| before committing to it. | |
| **Dream version β literal mirror (screen-capture gated):** | |
| - Sample the screen region directly under the sprite (ScreenCaptureKit), downsample, and | |
| paint it onto his body as living camouflage. Stunning, but it's the same hard gate as | |
| ambient quips / phototaxis sensors: local-only, ephemeral, opt-in, never stored, never | |
| leaves the machine. Texture/color only β never treated as readable content. | |
| **Where it slots in:** `ui/Sprite.tsx` (a `camouflage` intensity prop on `.puck-body`), | |
| driven by `modes/sim` from sprite position vs. screen regions. The shy-fade is a | |
| 20-minute add; refraction is a test-then-maybe; the literal mirror waits for the screen | |
| watcher + privacy pass. | |
| ## Real app icons β known apps, OS-extracted where possible | |
| **What:** Replace the glyph characters (β³ β β β¦) in the sim windows/dock and feed | |
| source-markers with real app icons β Claude, ChatGPT, Chrome, Gmail, Mail, Discord, | |
| Terminal, etc. Looks dramatically more legit, especially in the overlay. | |
| **Pulling the *actual* OS icon (the good version):** | |
| - macOS: `NSWorkspace.shared.icon(forFile: "/Applications/Foo.app")` returns the real | |
| icon for any installed `.app` bundle β a Rust/Tauri command can extract β PNG β hand to | |
| the webview. So Chrome, Mail, Discord, Slack, the host terminal: real icons, free, always | |
| current. | |
| - **The limit you called:** a terminal *binary* (claude, codex) has no bundle and no icon. | |
| Fall back to the **host terminal's** icon (iTerm/Terminal/Ghostty/Warp β which the Claude | |
| hook can report via `TERM_PROGRAM`), or a curated Puck-styled glyph. | |
| - **Web apps with no native app** (ChatGPT, Gmail as a tab): no bundle to extract from β | |
| these need a small **curated bundled icon set** (a dozen SVGs/PNGs). | |
| **So: hybrid.** OS extraction where a bundle exists (Rust command, overlay only), curated | |
| bundled set for web-apps + terminal-binary fallbacks (works everywhere incl. the Space/sim). | |
| **Where it slots in:** | |
| - `ui/Desktop.tsx` (`WIN_DEFS` icons, dock glyphs) and the feed `source` marker β currently | |
| single glyph chars; swap for an `<Icon source=β¦>` that prefers OS-extracted, falls back to | |
| bundled, falls back to glyph. | |
| - Overlay-only Rust command `app_icon(bundleId|path) -> png` for the extraction half. | |
| - Engine `SOURCES`/`EventDef` already carry source identity; add an optional `bundleId` hint. | |
| **Cheap first step (no native):** ship the curated bundled icon set + sourceβicon map; use | |
| glyph only as last resort. The OS-extraction half is an overlay enhancement on top. | |
| ## Local vision β the private, free path for continuous perception | |
| **What:** Run Puck's eyes on-device instead of (or alongside) Modal, so real-screen | |
| perception is private and continuous vision costs ~nothing. Brain seam already supports | |
| it: point PUCK_VISION_URL at a local OpenAI-compatible server. | |
| **Capability is NOT the blocker** (verified 2026-06-07): screen-reading is OCR + light | |
| "what's notable" reasoning β small VLMs excel at it. Options: | |
| - **Holotron-12B local** via llama.cpp + mmproj β llama.cpp merged Nemotron-Nano-12B-v2-VL | |
| (PR #19547). `convert_hf_to_gguf.py --mmproj` β vision projector β `llama-server`. Same | |
| model as cloud, same capability, ~24GB on the 48GB Mac, free. NB: Ollama can't load the | |
| mmproj β must use raw llama-server. Keeps the Nemotron Quest tie. | |
| - **Qwen2.5-VL-7B local** β ~6GB, 95.7 DocVQA, fast; ideal for *continuous ambient* (every | |
| 45s forever on the M4 Max). Loses the Nemotron tie, plenty for screen-reading. | |
| - MiniCPM-V 2.6 (~5.5GB), Moondream2 (1.9B, CPU) as even-lighter fallbacks. | |
| **Recommendation:** Holotron for the showcase + Quest (cloud now β local llama-server | |
| later as the private path); Qwen2.5-VL-7B-local as the cheap continuous engine. The | |
| visionMode "Continuous" tier (built, currently same cloud path) should switch to a local | |
| PUCK_VISION_URL when this lands. | |
| **Where it slots in:** zero engine/frontend change β it's a runtime: spin up | |
| `llama-server --mmproj` (or vLLM/mlx when fixed) and set PUCK_VISION_URL. Plus the | |
| real-screen capture (ScreenCaptureKit, overlay) to feed it actual pixels instead of the | |
| sim snapshot. | |