---
title: Puck
emoji: 🧚
colorFrom: green
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: A mischievous desktop fairy that comments on your work
tags:
- build-small
- thousand-token-wood
- off-brand
- off-the-grid
- field-notes
- best-demo
- track:wood
- sponsor:nvidia
- sponsor:modal
- achievement:offgrid
- achievement:offbrand
- achievement:llama
- achievement:fieldnotes
---
<!-- TRACK + BADGES: the tag slugs above are my best guess. Before submitting, paste this
README into the field-guide validator (https://build-small-hackathon-field-guide.hf.space/submit)
and use the EXACT slugs it expects for the Thousand Token Wood track + each badge. -->
# 🧚 Puck: a desktop fairy familiar

Puck is a small, mischievous creature that lives on your screen. He **roams**, **peeks** at one little patch of whatever you're doing, and **murmurs** a single in-character line about it, then drifts on. He's not an assistant and not a notifier. He's company: *marginally useful, reliably charming.*

> **This Space is the playable demo** (a simulated desktop in your browser). Puck's real home is a transparent always-on-top overlay on your actual Mac desktop (that's the video). Here you can poke him, watch him roam, peek, react, and blend into the desktop.

**▶️ Demo video:** [watch on YouTube](https://youtu.be/Jzzt_UE11jU) · **📣 Social post:** [on X](https://x.com/_vu/status/2066577250137587853) · **📓 Field notes:** [read the writeup](https://huggingface.co/blog/build-small-hackathon/meet-puck)
## What he does
- **Roam → peek → quip.** Every so often he flutters somewhere and looks at the small region under him. A vision-language model *sees that patch and speaks in his voice* in one shot: the AI is doing the fun thing, not narrating from a script.
- **Feels things.** Each peek is classified into an emotion that drives his whole reaction together: a **gesture** (giddy laugh, NANI-confusion, a worried tremble, a wistful droop), an **aura color**, and an **emotional voice** (pitch + pace). Puck's a *learning* creature, so confusion is his honest default, and an endearing one.
- **Reads the room.** On a real terminal he uses on-device OCR to tell *which* coding agent you're running (Claude Code vs Codex vs opencode vs pi) and grounds his quip in the actual text on screen.
- **Camouflages.** Sit still and he cloaks: his skin clears so the desktop reads right through him, with a faint shimmer and two watching eyes. Move, and he snaps back.
- **Dreams.** At night he **blooms** the day's peeks into a small garden of memories.
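
The "Feels things" bullet describes one emotion label fanning out into a gesture, an aura color, and a voice setting. A minimal sketch of what such a mapping could look like follows. The gesture names echo the list above, but the emotion labels, colors, pitch and pace numbers, and the `Reaction` and `react` names are all illustrative assumptions, not Puck's actual contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    gesture: str   # animation to play
    aura: str      # aura color as a hex string
    pitch: float   # voice pitch multiplier (1.0 = neutral)
    pace: float    # voice speed multiplier (1.0 = neutral)


# Hypothetical table: every value here is made up for illustration.
REACTIONS = {
    "joy": Reaction("giddy_laugh", "#ffd54f", 1.15, 1.10),
    "confusion": Reaction("nani_confusion", "#b39ddb", 1.05, 0.95),
    "worry": Reaction("worried_tremble", "#90a4ae", 1.10, 1.20),
    "wistful": Reaction("wistful_droop", "#80cbc4", 0.90, 0.85),
}

# The README calls confusion his honest default, so unknown labels land there.
DEFAULT_EMOTION = "confusion"


def react(emotion: str) -> Reaction:
    """Return the reaction for an emotion label, falling back to confusion."""
    key = emotion.strip().lower()
    return REACTIONS.get(key, REACTIONS[DEFAULT_EMOTION])
```

Keeping gesture, aura, and voice in one record means a single classification can never produce a mismatched reaction, such as a giddy laugh in a worried voice.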
## Built Small
Puck is a constellation of small, laptop-sized models, **nothing over 32B**, and it fits in your pocket:
- 👁️ **Vision + voice-of-the-fairy:** **Holotron-12B**, H Company's computer-use VLM post-trained from NVIDIA's Nemotron-Nano-12B-VL (one model that both *sees* and *speaks* the quip).
- 🔤 **Tool/site recognition:** on-device **Vision OCR** + an **ONNX CLIP ViT-B/32** (~88M) fingerprinter.
- 🗣️ **Neural voice:** **Kokoro-82M**, running in-browser (WebGPU): a fairy voice in 82M params.
The whole companion runs locally and offline ("Off the Grid"). *This hosted Space* uses a cloud (Modal) copy of the 12B so it works in your browser. That copy scales to zero when idle, so **the first peek may take a moment while Puck stretches his wings.** ✨
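
Because the hosted model scales to zero, a client calling it may see a timeout or a dropped connection on the first peek. One common way to ride that out is to retry with exponential backoff. The sketch below is generic Python and not Puck's actual client code; `call_with_backoff` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on timeout or connection errors.

    Waits base_delay, then 2x, then 4x ... between tries, and re-raises
    the last error once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting can be swapped out in tests; in normal use the default `time.sleep` applies.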
## Under the hood
A custom React UI served by a **`gradio.Server`** daemon (Off-Brand), giving two habitats from one app: this browser **sim** and a **Tauri** overlay for the real desktop. Underneath sit region peeks, an emotion contract, and a memory log that blooms into a garden.
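
A region peek means cropping the small patch of screen under Puck before handing it to the vision model. The sketch below shows one way to compute that crop rectangle so it never falls off the screen. The 256-pixel patch size and the `Rect` and `peek_region` names are assumptions for illustration, not values taken from Puck's code.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: int
    top: int
    width: int
    height: int


def peek_region(cx: int, cy: int, screen_w: int, screen_h: int,
                size: int = 256) -> Rect:
    """Return a size x size patch centered on (cx, cy), clamped to the screen."""
    # Shrink the patch if the screen itself is smaller than the requested size.
    width = min(size, screen_w)
    height = min(size, screen_h)
    # Center on Puck, then slide the patch back inside the screen bounds.
    left = min(max(cx - width // 2, 0), screen_w - width)
    top = min(max(cy - height // 2, 0), screen_h - height)
    return Rect(left, top, width, height)
```

Clamping keeps the patch a constant size near the screen edges, so the model always receives an image of the same shape.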
*HF Build Small: Thousand Token Wood (Creative).*