WitGym — Field Notes (Build Small 2026)

What I built

WitGym is a comedy coaching engine for real-life awkward moments. You paste a situation, it produces one sharp line, then lets you iterate with drills (sharpen / different angle / explain).

The core bet: comedy transfers by structure, not by topic. Instead of “RAG on jokes”, WitGym runs CBR-RAG (case-based reasoning combined with retrieval-augmented generation) over comedy mechanics and uses precedent scenes from The Office to ground the response.

The small-model constraint (≤32B) changed the design

Under the Build Small constraint, the goal wasn’t “generate funnier text by scaling” but “get reliable wit by adding structure”:

  • Pass 1 (extraction): extract a compact schema (ComedyMetadata) describing the moment: archetype, tension, violation distance, subtext, behavioral observation, etc.
  • Retrieval (CBR-RAG): retrieve structurally similar precedent scenes from a prebuilt index.
  • Pass 2 (generation): draft 2–3 persona candidates with strict constraints.
  • Pass 3 (ranking): pick a winner with an explicit judging rubric (truth precision + strong ending + domain anchoring).
  • Pass 4 (compression): optionally tighten the winner to one crisp line.
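
The passes above can be sketched as a small orchestration function. This is a minimal illustration and not WitGym's actual code. The `ComedyMetadata` field names follow the schema listed in Pass 1, but their types, the `Stages` container, and `run_pipeline` are all assumptions made for the example. Each stage is injected as a plain callable, so the LLM and retrieval calls can be swapped or stubbed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ComedyMetadata:
    # Field names mirror the schema described in Pass 1; types are assumed.
    archetype: str
    tension: str
    violation_distance: float
    subtext: str
    behavioral_observation: str


@dataclass
class Stages:
    # Hypothetical container: one callable per pass, so each can be stubbed.
    extract: Callable[[str], ComedyMetadata]
    retrieve: Callable[[ComedyMetadata], List[str]]
    generate: Callable[[ComedyMetadata, List[str]], List[str]]
    rank: Callable[[List[str], ComedyMetadata], str]
    compress: Optional[Callable[[str], str]] = None  # Pass 4 is optional


def run_pipeline(situation: str, stages: Stages) -> str:
    """Run extraction, retrieval, generation, ranking, then optional compression."""
    meta = stages.extract(situation)                  # Pass 1
    precedents = stages.retrieve(meta)                # CBR-RAG retrieval
    candidates = stages.generate(meta, precedents)    # Pass 2
    winner = stages.rank(candidates, meta)            # Pass 3
    if stages.compress is not None:                   # Pass 4 (optional)
        winner = stages.compress(winner)
    return winner
```

Keeping each pass behind a callable means the fixed order of passes can be tested with stubs before any model is wired in.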

What was unexpectedly important

  • Behavioral observation > feelings: naming the move (“renamed procrastination as ‘keeping options open’”) is a better generative seed than therapy-language subtext.
  • Ranking beats clever prompting: the biggest quality jumps came from forcing a tournament-style selection rubric, especially “truth precision” and “final clause” quality.
  • Progressive disclosure UX matters: streaming phase updates and an expandable trace make judges trust the system (it’s not “vibes”; you can see what it did).
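
To make the ranking point concrete, here is a minimal sketch of rubric-based selection. It uses a simple weighted sum over the three rubric criteria named earlier (truth precision, strong ending, domain anchoring) as a stand-in for a tournament-style comparison. The weights, the 0 to 1 score range, and the function names are assumptions for illustration. In a real run the per-criterion scores would come from a judge model.

```python
from typing import Dict, List, Tuple

# Hypothetical weights; the source only says truth precision and the
# final clause mattered most, not how they were weighted.
WEIGHTS: Dict[str, float] = {
    "truth_precision": 0.5,
    "strong_ending": 0.3,
    "domain_anchoring": 0.2,
}


def rubric_score(scores: Dict[str, float]) -> float:
    """Weighted sum of per-criterion scores; missing criteria count as 0."""
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())


def pick_winner(candidates: List[Tuple[str, Dict[str, float]]]) -> str:
    """Return the text of the candidate with the highest rubric score."""
    best_text, _ = max(candidates, key=lambda c: rubric_score(c[1]))
    return best_text
```

An explicit rubric also gives the expandable trace something to show: each candidate's per-criterion scores can be displayed next to the winner.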

Models used

  • LLM: Qwen/Qwen3.5-27B (≤32B) via Hugging Face Inference Providers (recommended runtime path).
  • Embedder: BAAI/bge-small-en-v1.5 (33M) for retrieval.
  • Reranker (optional): cross-encoder/ettin-reranker-32m-v1 (CPU) for pool reranking.
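
The embedder and optional reranker fit together roughly as in the sketch below: take the top-k scenes by cosine similarity, then reorder that pool if a reranker is supplied. This is pure Python over precomputed vectors so it runs without any model. The index layout, the function names, and the reranker signature are assumptions, and the toy vectors stand in for real embeddings.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query_vec: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],  # (scene_id, embedding) pairs
    k: int = 3,
    rerank: Optional[Callable[[List[str]], List[str]]] = None,
) -> List[str]:
    """Return up to k scene ids by cosine similarity, optionally reranked."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    pool = [scene_id for scene_id, _ in ranked[:k]]
    # The reranker (a cross-encoder in the real system) only reorders the pool.
    return rerank(pool) if rerank is not None else pool
```

Because the reranker only sees the small pool, it can stay on CPU while the embedding search handles the full index.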

What I’d improve next (post-hackathon)

  • Make “coach mode” differentiate response style, not just “add an explanation panel”.
  • Add a public, privacy-safe trace export that covers both evaluation runs and real usage patterns (with de-identification).
  • Tighten the “small talk” and “low twist” path so it’s still delightful without running the full pipeline.