# ⚖️ WitnessBox — PRD

> **Cross-examine a hostile AI witness.** A courtroom interrogation game where the witness reacts
> to *how you deliver*, the AI is the irreplaceable mechanic, and a **Modal Sandbox executing
> model-written code** is the game's referee.
>
> **Track:** 🍄 Thousand Token Wood · **Primary prize:** Best Use of Modal (1st-caliber, Axis A:
> Sandbox-runs-model-generated-code) · **Status:** built, compiles clean (see existing `hf-hackathon/witnessbox/`).

## 1. Vision & why it wins
Interrogate **Marcus Reid, CFO of Halcyon Dynamics**. He's evasive and reads your **delivery
stance** (vocal confidence) — sound confident and he clams up; sound hesitant and he gets cocky
and overshares. Catch him in **3 contradictions** and his voice **cracks** as he breaks.

Three independent win mechanisms, three judge pools:
1. **Best Use of Modal (#1 target):** the core mechanic IS Modal's documented flagship pattern:
   an LLM writes code, a Sandbox safely executes it. Modal's own GRPO example: the *"Best Use of
   Modal prize showcased the use of sandboxes for securely evaluating model-generated code."* No
   rival seen so far centers on this; most use Modal as plain inference hosting.
2. **OpenBMB Best MiniCPM Build (Wood):** MiniCPM-o is the *character*, VoxCPM2's style-tags are the
   *game state* — "model is the product," which beats "model is a component."
3. **Wood track podium (4 paid slots):** delight + load-bearing AI + originality + polish; a voiced,
   interactive game with a win condition and an audiovisual climax stands out vs watch-only demos.

## 2. Target prizes
Primary: **Best Use of Modal (1st)**. Secondary (awards stack): OpenBMB-Wood · Wood podium ·
Community Choice (Wood) · Nemotron Hardware (ASR) · Best Agent · Best Demo · Off-Brand *(only if a
real `gr.Server` custom UI is built — not earned by CSS alone)*.

## 3. Users & core experience
Player = anyone who wants the fantasy of breaking a witness on the stand. Turn-based push-to-talk:
```
player records a question (mic)
  → Nemotron ASR transcribes  +  librosa reads DELIVERY STANCE (perceived confidence; NOT lie detection)
  → stance steers the witness system prompt (Hesitant → he overshares a thread toward an uncaught lie)
  → ONE MiniCPM-o call returns {in-character reply, contradiction-check Python}
  → modal.Sandbox executes the MODEL-WRITTEN code; its JSON verdict DECIDES the catch
    (keyword matching is only a silent fallback; on Sandbox error, the model self-corrects its code)
  → VoxCPM2 voices the reply; style escalates with pressure
catch #3 → win; the witness's voice cracks (pre-generated best take)
```
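The verdict step in the flow above can be sketched as follows. This is a minimal illustration, not the project's code: the JSON fields (`caught`, `lie_id`), the `PLANTED_CUES` table with its keywords, and the `decide_catch` helper are all hypothetical, and the sketch assumes the fallback matches keywords against the turn's transcript text.

```python
import json

# Hypothetical cue keywords for the three planted lies (invented for illustration).
PLANTED_CUES = {
    "timeline": ["march 3", "board meeting"],
    "authorization": ["signed", "wire transfer"],
    "relationship": ["never met", "consultant"],
}


def keyword_fallback(text, uncaught):
    """Silent fallback: first uncaught lie whose cue keywords all appear in the text."""
    lowered = text.lower()
    for lie_id in uncaught:
        if all(cue in lowered for cue in PLANTED_CUES[lie_id]):
            return lie_id
    return None


def decide_catch(sandbox_stdout, text, uncaught):
    """Prefer the Sandbox's JSON verdict; use keywords only when it is unusable."""
    try:
        verdict = json.loads(sandbox_stdout)
        if isinstance(verdict, dict) and isinstance(verdict.get("caught"), bool):
            lie_id = verdict.get("lie_id")
            if verdict["caught"] and lie_id in uncaught:
                return {"caught": True, "lie_id": lie_id, "source": "sandbox"}
            return {"caught": False, "lie_id": None, "source": "sandbox"}
    except (json.JSONDecodeError, TypeError):
        pass  # malformed or missing output: fall through to the keyword check
    lie_id = keyword_fallback(text, uncaught)
    return {"caught": lie_id is not None, "lie_id": lie_id, "source": "fallback"}
```

Returning a `source` field keeps it visible in the UI panel whether the Sandbox or the fallback made the call, so the fallback never silently passes as a Sandbox verdict.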

## 4. Functional requirements
- **3 planted lies** injected into the system prompt (timeline, authorization, relationship), each
  with a concrete contradiction cue the player must surface. Detection fires against THESE, not on
  emergent model inconsistency (reliable > magical).
- **Delivery stance** from a parallel librosa pass (pause-rate + speaking-rate dominant per the
  prosody literature; pitch minor). Framed as *perceived delivery*, **never** "lie detector."
- **Stance is load-bearing:** Hesitant delivery makes the witness leak a cue toward one uncaught lie.
- **Win at 3 catches**, ≤ ~12 turns; the climactic break line is pre-generated and cached.
- The model-written code + Sandbox verdict are shown **live** in an open panel (the Modal evidence).
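One way to turn the prosody features into a stance tier is a weighted score with pause and rate dominant and pitch minor, as the requirement describes. The sketch below assumes the features have already been extracted from the waveform (for example by the librosa pass); the function name, the weights, the thresholds, the "Neutral" tier, and the 4 onsets/s normalization are illustrative assumptions, not measured values.

```python
def stance_tier(pause_ratio, speech_rate, pitch_var=0.0):
    """Map prosody features to a perceived-delivery tier (NOT lie detection).

    pause_ratio: fraction of the clip that is silence, 0..1
    speech_rate: onsets per second, a rough proxy for speaking rate
    pitch_var:   normalized pitch variability, 0..1 (deliberately low weight)
    """
    def clamp(x):
        return min(max(x, 0.0), 1.0)

    # Normalize each feature so that 1.0 means "sounds confident".
    fluency = 1.0 - clamp(pause_ratio)
    pace = clamp(speech_rate / 4.0)  # assumption: 4 onsets/s counts as brisk
    steadiness = 1.0 - clamp(pitch_var)

    # Pause and rate dominate; pitch is a minor term.
    score = 0.45 * fluency + 0.45 * pace + 0.10 * steadiness

    if score < 0.45:
        return "Hesitant", score
    if score < 0.70:
        return "Neutral", score
    return "Confident", score
```

A "Hesitant" result would then select the system-prompt variant in which the witness leaks a cue toward one uncaught lie.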

## 5. Technical architecture (all ≤32B; ≈12B combined)
| Component | Model / lib | Notes (verified) |
|---|---|---|
| Witness brain | `openbmb/MiniCPM-o-4_5` (9.4B) | `AutoModel`, `trust_remote_code`; `chat(msgs=, use_tts_template=False, enable_thinking=False, generate_audio=False)`; `init_vision=False`, `init_audio=False`, `init_tts=False` (text-only). |
| Witness voice | `openbmb/VoxCPM2` (2B) | `from_pretrained(load_denoiser=False)`; Voice-Design CFO once → Controllable-Clone per line `generate(text="(style)...", reference_wav_path=ref)`; 48kHz; **torch≥2.5.0**. |
| Player ASR | `nvidia/nemotron-speech-streaming-en-0.6b` (or `-3.5-asr-streaming-`) | whisper-small local fallback. |
| Delivery stance | `librosa` | parallel waveform pass; pause/rate → tier. |
| Contradiction engine | MiniCPM-o **generates** networkx code → `modal.Sandbox` | the verdict authority. |

## 6. Best Use of Modal — five load-bearing primitives (the #1-prize section)
The core mechanic is Modal's flagship Sandbox pattern (`docs/examples/agent`, `safe_code_execution`).
1. **⭐ Sandbox executes model-written code** — the game's referee (network-blocked; its JSON decides catches).
2. **🔧 Agentic self-correction** — on Sandbox error, the error feeds back to MiniCPM-o, which repairs its own code and reruns (max 2) — Modal's `devlooper` generate→execute→fix loop.
3. **GPU inference via `@app.cls`, scale-to-zero** — MiniCPM-o (A100) + VoxCPM2 (A10G) + Nemotron ASR (A10G), idle → $0.
4. **Parallel `.map()`** — pre-generates the scripted voice beats (incl. the voice-crack) at load.
5. **Memory snapshot + Volume:** snapshot cuts cold start (measured); a Volume persists the designed CFO voice clip + model cache.

**Measured cost:** quote real container-seconds → "$0.0X / match" (read from the Modal dashboard).
Map this verbatim into the README's "Best Use of Modal" section (REQ-06 requires noting Modal).
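The generate→execute→fix loop from primitive 2 can be sketched independently of Modal by injecting the two side effects as callables. Everything named here is hypothetical (`run_with_self_correction`, the `generate` and `execute` signatures, the repair prompt wording), and the sketch reads "max 2" as two repair attempts after the first run, which is an assumption.

```python
import json


def run_with_self_correction(generate, execute, prompt, max_repairs=2):
    """Generate code, execute it, and feed any error back for repair.

    generate(prompt) -> str         : returns Python source (the LLM call)
    execute(code) -> (rc, out, err) : runs the code in isolation (the Sandbox)
    Returns (verdict dict or None, log of attempts).
    """
    log = []
    code = generate(prompt)
    for attempt in range(max_repairs + 1):
        rc, out, err = execute(code)
        problem = err if rc != 0 else None
        if problem is None:
            try:
                verdict = json.loads(out)
                log.append({"attempt": attempt, "ok": True})
                return verdict, log
            except json.JSONDecodeError as exc:
                # A clean exit with unparseable output still counts as a failure.
                problem = f"stdout was not valid JSON: {exc}"
        log.append({"attempt": attempt, "ok": False, "error": problem})
        if attempt < max_repairs:
            repair_prompt = (
                f"{prompt}\n\nYour previous code failed:\n{problem}\n"
                "Return fixed code only."
            )
            code = generate(repair_prompt)
    return None, log  # caller falls back to keyword matching
```

In the app, `execute` would wrap the network-blocked `modal.Sandbox` run and `generate` the MiniCPM-o call; keeping them injectable lets the loop be tested locally with fakes before any GPU time is spent.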

## 7. UX / UI requirements
Courtroom aesthetic (parchment, serif). CFO portrait. "Delivery Stance" bar (labeled *not a lie
detector*). X/3 contradiction counter. Autoplay witness audio. **Contradiction Engine accordion
defaults OPEN** (the #1-prize evidence must be on camera). Latency (~20–35s warm) masked diegetically
("the witness considers…"). For Off-Brand, a real `gr.Server` custom courtroom UI would be required.

## 8. Demo video (the judged artifact)
60–90s, controlled, ~20 dry runs first: stance steers witness → ask hesitantly, he overshares →
catch #1 → the Sandbox panel shows model-written code + verdict → catch #3 → **voice cracks** →
cost readout. Show the Sandbox executing the model's code as the dramatic beat.

## 9. Success metrics
Five consecutive clean end-to-end turns from the deployed Space · win-at-3 reliable · Sandbox
verdict authoritative (codegen fails on fewer than ~30% of turns, and self-correction recovers those) · voice-crack
lands · measured Modal cost + snapshot seconds captured.
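The per-match cost figure is simple arithmetic once the measurements exist: container-seconds per component multiplied by that component's per-second rate, then summed. A minimal sketch follows; `cost_per_match` and its dictionary shape are hypothetical, and both inputs must come from the Modal dashboard and pricing page rather than being estimated.

```python
def cost_per_match(container_seconds, usd_per_second):
    """Sum seconds * rate over components, given measured inputs.

    container_seconds: {component name: measured seconds for one match}
    usd_per_second:    {component name: billed rate in USD per second}
    """
    missing = set(container_seconds) - set(usd_per_second)
    if missing:
        # Refuse to guess a rate: an unmeasured number must not reach the README.
        raise KeyError(f"no rate supplied for: {sorted(missing)}")
    return sum(
        seconds * usd_per_second[name]
        for name, seconds in container_seconds.items()
    )
```

Raising on a missing rate, rather than defaulting to zero, keeps the README number tied to values that were actually read off the dashboard.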

## 10. Risks & mitigations
- **End-to-end turn never run** (highest risk) → deploy + prove 5 turns before anything downstream.
- **Modal secrets unset** → Space boots (lookup is lazy/try-excepted) but the Sandbox is dead; set `MODAL_TOKEN_ID`/`MODAL_TOKEN_SECRET` as Space secrets.
- **Codegen unreliable** → self-correction loop + a networkx skeleton in the prompt; never show repeated `score=0.00`.
- **Voice-crack variance** → pre-generate ≥30 takes of the win line, cache the best.
- **Nemotron ASR install friction** → bounded attempt, else pivot to parakeet or whisper fallback (never blocks the critical path).
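The "Modal secrets unset" failure is cheap to surface at boot. The sketch below checks only for the two environment variables named above and reports which are missing; the helper names and the status strings are hypothetical, and it does not validate the tokens against Modal.

```python
import os

REQUIRED_SECRETS = ("MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET")


def missing_modal_secrets(env=None):
    """Return the names of required secrets that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


def sandbox_status(env=None):
    """Human-readable status line for the UI or the boot log."""
    missing = missing_modal_secrets(env)
    if missing:
        return "Sandbox disabled: set " + ", ".join(missing) + " as Space secrets"
    return "Sandbox credentials present"
```

Showing this line in the Contradiction Engine panel would make a dead Sandbox obvious during dry runs instead of appearing as a string of fallback verdicts.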

## 11. Build plan (by dependency — no calendar)
1. Set Space secrets · generate CFO portrait · (done in scaffold: lazy lookup, warmup sandbox prebuild, accordion open, torch≥2.5, generate_audio/init_audio).
2. Deploy + smoke-test `run_in_sandbox()` and the voxcpm image standalone.
3. **Five consecutive end-to-end turns** from the deployed Space + measured latencies/cost (the gate).
4. ≥30 win-line takes cached · codegen reliability hardened.
5. Nemotron ASR pivot-gate (stop-loss) · optional real `gr.Server` UI for Off-Brand.
6. Demo video (after dry runs) → README measured numbers → social → submit.

## 12. Integrity rules
Claims follow code — no "only entry that…" claims about a moving field; cost/latency are measured,
never fabricated. Pre-submit grep: `TODO | YOUR_HF_USER | NotImplementedError | <!--`.
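The pre-submit grep can also run as a small script so that it fails loudly. This sketch assumes the spaces around `|` in the pattern above are only formatting and that scanning `.py`, `.md`, and `.txt` files is enough; `scan_text`, `scan_tree`, and the suffix list are hypothetical.

```python
import re
from pathlib import Path

# The four markers from the integrity rule, as one alternation.
FORBIDDEN = re.compile(r"TODO|YOUR_HF_USER|NotImplementedError|<!--")


def scan_text(text, name="<text>"):
    """Return (name, line number, marker) for the first marker on each line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = FORBIDDEN.search(line)
        if match:
            hits.append((name, lineno, match.group(0)))
    return hits


def scan_tree(root, suffixes=(".py", ".md", ".txt")):
    """Scan every matching file under root; an empty list means clean."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            hits.extend(scan_text(path.read_text(errors="ignore"), str(path)))
    return hits
```

Calling `scan_tree(".")` from the repository root and refusing to submit while it returns any hits turns the rule into a gate rather than a reminder.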