Space: build-small-hackathon/JudgeGPT


Update Judge-GPT code and README

by AliIqbal05 - opened 17 days ago

base: refs/heads/main ← from: refs/pr/3


+2021 -470

Files changed (10)

README.md +123 -55
app.py +858 -160
modal_app.py +22 -3
sovereign_bench/cases.py +157 -24
sovereign_bench/engine.py +243 -179
sovereign_bench/llm.py +92 -5
sovereign_bench/models.py +2 -0
tests/test_cases.py +9 -1
tests/test_engine.py +205 -28
tests/test_ui_rendering.py +310 -15

README.md CHANGED

@@ -9,97 +9,165 @@ app_file: app.py
 pinned: false
 license: mit
 short_description: AI-native miniature trials under 32B.
 ---
 # Judge-GPT
-Judge-GPT is a cinematic Gradio Space for the Build Small Hackathon's Thousand Token Wood track. It runs two-minute AI-native miniature trials where small-model agents act as advocates, judge, jurors, clerk, and evidence auditor.
-The app is built to stay under the 32B named-model budget:
-- `openai/gpt-oss-20b` for primary legal reasoning.
-- `openbmb/AgentCPM-Explore` for clerk/stage/verdict style.
-- `nvidia/Nemotron-Orchestrator-8B` for juror and evidence-auditor review.
-Total named budget: 32B parameters.
-## What the app can do
-- Run cached trials for the Socrates and Barnaby demo cases without network search.
-- Run the Live Search Tribunal path, which builds a search packet from a user query and stops if live material is too weak to support a trial.
-- Add a hypothetical sidebar to shift the framing of a trial without editing cached case files.
-- Switch trial pacing between swift, measured, and ceremonial speeds.
-- Stage the courtroom with phase-specific visuals, agent puppets, evidence props, captions, and browser audio cues.
-- Show the Mind Layer as a compact JSON trace of agent turns and phase metadata.
-- Call a Modal streaming endpoint when `MODAL_TRIAL_URL` is configured. Endpoint or model failures stop the trial instead of substituting cached dialogue.
-- Retain decree and agent-trace export helpers in `sovereign_bench/export.py` for future UI restoration.
-## Limitations
-- Judge-GPT is not legal advice and should not be used for real legal decisions.
-- Live search snippets are not independently verified by the app.
-- Output quality depends on Modal GPU availability, token limits, and the configured Hugging Face models.
-- Model, Modal, or live retrieval failures stop the current trial rather than returning substitute courtroom dialogue.
-- Trial results are not persisted across sessions.
-- Export generation remains in the codebase, but the visible download UI is currently hidden.
-## Run locally
 ```powershell
 python -m pip install -r requirements.txt
 python app.py
 ```
-## Modal backend
-The Gradio app works locally without Modal. If `MODAL_TRIAL_URL` is set, the Space calls the Modal streaming endpoint and stops the trial if the endpoint is unavailable.
-The deployed Modal endpoint runs each role prompt through a GPU-backed vLLM class on H100 by default. Traces mark successful GPU calls with `runtime: modal-gpu-vllm`, `provider: modal-gpu-vllm`, and `gpu: H100`. If a GPU/model load fails, the trial stops; the app does not substitute provider or cached dialogue.
 ```powershell
 python -m modal deploy modal_app.py
 ```
-Keep the deployed endpoint URL as a Hugging Face Space variable named `MODAL_TRIAL_URL`.
-## Project targets
-Workspace connected to:
-- GitHub: `https://github.com/aliiqbal24/BuildSmallfinal.git`
-- Modal profile: `ali-j-iqbal24`
-- Hugging Face user: `AliIqbal05`
-## Secrets
-Credentials are not committed to this repo.
-- Local Hugging Face CLI auth is stored in the Hugging Face cache.
-- Modal auth is stored in the local Modal profile.
-- Modal has a secret named `huggingface` with `HF_TOKEN`.
-Use the Modal secret in functions like this:
-```python
-@app.function(secrets=[modal.Secret.from_name("huggingface")])
-def run_model():
-    token = os.getenv("HF_TOKEN")
-```
-## Developer guide
-- `app.py`: Gradio UI, CSS, JavaScript audio hooks, HTML renderers, and Modal/local streaming switch.
-- `sovereign_bench/engine.py`: trial phases, agent orchestration, verdict assembly, and trace construction.
-- `sovereign_bench/llm.py`: Hugging Face calls, strict model error handling, and prompt building.
-- `sovereign_bench/retrieval.py`: live search packet construction.
-- `sovereign_bench/models.py`: Pydantic schemas for cases, evidence, events, turns, votes, and verdicts.
-- `sovereign_bench/cases.py`: cached demo case packets.
-- `sovereign_bench/export.py`: dormant decree and trace writers.
-- `modal_app.py`: Modal deployment and GPU-backed streaming endpoint.
-- `tests/`: engine, case, and rendering regression coverage.
-## Verify Modal to Hugging Face
 ```powershell
-python -m modal run modal_app.py
 ```

 pinned: false
 license: mit
 short_description: AI-native miniature trials under 32B.
+tags:
+  - track:wood
+  - sponsor:openai
+  - sponsor:nvidia
+  - sponsor:modal
+  - achievement:offbrand
+  - achievement:fieldnotes
 ---
 # Judge-GPT
+Judge-GPT is a cinematic Gradio courtroom for the Build Small Hackathon's Thousand Token Wood track. It turns a compact evidence packet into a two-minute AI-native trial: a clerk opens the docket, two lawyers argue opposite sides, Marcus Aurelius presides, six fixed-perspective jurors vote, and the court seals a verdict.
+The point is not legal advice. It is a small-model theater for structured disagreement: evidence is visible, roles are constrained, hidden reasoning is stripped, and every trial leaves a trace of which agent said what.
+## Submission Links
+- Hugging Face Space: https://huggingface.co/spaces/build-small-hackathon/JudgeGPT
+- Demo video: https://drive.google.com/drive/folders/10pWJ7NVCsnVV7wOlqm4MGWg4Kmh4rMY2?usp=sharing
+- Social post: TODO paste final public social post URL
+- GitHub repo: https://github.com/aliiqbal24/BuildSmallfinal
+- Field guide validator: https://build-small-hackathon-field-guide.hf.space/submit
+## What Judges Should Try
+1. Open the Space and keep the default `Trial of Socrates`.
+2. Click `Begin Trial`.
+3. Watch the courtroom progress from intake to verdict.
+4. Hover the judge, clerk, lawyers, and jurors to inspect model/agent threads.
+5. Open the `Evidence Drawer` and `Juror Panel` tabs after the verdict.
+6. Try `Greg Heffley vs Mom` for a lighter family-court case.
+7. Try `Custom` to write a short dispute and up to three pieces of evidence per side directly into the docket book.
+## Why It Fits Build Small
+- **Thousand Token Wood:** the app is whimsical, theatrical, and AI-native rather than a generic chatbot.
+- **Best Use of Codex:** Codex was used throughout implementation, debugging, UI iteration, tests, and commit prep in the connected GitHub repo.
+- **Nemotron Hardware Prize:** Nemotron is a core runtime model for the jury and juror vote generation.
+- **Best Use of Modal:** the Gradio Space delegates live model inference to a Modal GPU streaming endpoint.
+- **Off-Brand:** the UI pushes past stock Gradio with a custom courtroom, animated puppets, docket book, evidence props, audio cues, and verdict staging.
+- **Field Notes:** this README documents the build idea, model choices, runtime architecture, limitations, and submission checklist.
+## Small-Model Budget
+Every named model is under the 32B parameter cap.
+| Role | Model | Budgeted size | Used for |
+| --- | --- | ---: | --- |
+| Presiding advocate | `openai/gpt-oss-20b` | 20B | Judge, claimant lawyer, respondent lawyer, verdict voice |
+| Clerk of style | `openbmb/AgentCPM-Explore` | 4B | Clerk/stage voice |
+| Jury ring | `nvidia/Nemotron-Orchestrator-8B` | 8B | Jury panel and six juror votes |
+Displayed aggregate budget: 32B. The app does not use a model above 32B.
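
The budget arithmetic in the table can be checked mechanically. The following is a minimal sketch, assuming the budgeted sizes listed in the table; `MODEL_BUDGET_B`, `CAP_B`, and `check_budget` are illustrative names that do not appear in the repo, and treating the aggregate as capped is an assumption based on the "Displayed aggregate budget" line.

```python
# Budgeted sizes in billions of parameters, copied from the README table.
MODEL_BUDGET_B = {
    "openai/gpt-oss-20b": 20,
    "openbmb/AgentCPM-Explore": 4,
    "nvidia/Nemotron-Orchestrator-8B": 8,
}
CAP_B = 32  # hackathon cap in billions of parameters


def check_budget(sizes: dict, cap: int = CAP_B) -> int:
    """Return the aggregate size; raise if a model or the total exceeds the cap."""
    for name, size in sizes.items():
        if size > cap:
            raise ValueError(f"{name} is {size}B, above the {cap}B cap")
    total = sum(sizes.values())
    if total > cap:
        raise ValueError(f"aggregate {total}B is above the {cap}B cap")
    return total


print(check_budget(MODEL_BUDGET_B))  # 32
```

The three sizes sum to exactly 32B, so the aggregate sits at the cap rather than below it.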
+## How It Works
+Judge-GPT runs a deterministic courtroom sequence over a `CasePacket`:
+1. Clerk opens the docket.
+2. Judge frames the dispute.
+3. Mike OSS argues for the claimant.
+4. Harvey Vector argues for the respondent.
+5. The evidence record is displayed without adding a third lawyer.
+6. The judge asks a hinge question.
+7. Each lawyer answers from their side.
+8. Nemotron Jury retires the panel.
+9. Six named jurors vote from distinct worldviews.
+10. The judge announces the final verdict.
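
The ten steps above always run in the same order. The following is a minimal sketch of that idea, assuming a plain tuple of step labels walked by a generator; `TRIAL_STEPS` and `walk_trial` are hypothetical names. The real orchestration lives in `sovereign_bench/engine.py` and is not reproduced here.

```python
from collections.abc import Iterator

# Step labels paraphrased from the numbered list in the README.
TRIAL_STEPS = (
    "Clerk opens the docket",
    "Judge frames the dispute",
    "Claimant lawyer argues",
    "Respondent lawyer argues",
    "Evidence record is displayed",
    "Judge asks a hinge question",
    "Each lawyer answers",
    "Jury retires",
    "Six jurors vote",
    "Judge announces the verdict",
)


def walk_trial(steps: tuple = TRIAL_STEPS) -> Iterator:
    """Yield (step number, label) pairs in the fixed courtroom order."""
    for number, label in enumerate(steps, start=1):
        yield number, label


for number, label in walk_trial():
    print(f"{number:>2}. {label}")
```

Because the order is data rather than control flow, a test can assert on the sequence without calling any model.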
+The shipped demo cases are:
+- `The Polis v. Socrates`
+- `Greg Heffley v. Mom`
+- `Custom`, built from the docket-book fields in the UI
+## Runtime Architecture
+- `app.py` renders the Gradio UI, courtroom HTML/CSS, audio hooks, case preview book, and live event stream.
+- `sovereign_bench/engine.py` orchestrates trial phases, model calls, evidence events, jury votes, verdict assembly, and trace metadata.
+- `sovereign_bench/llm.py` builds role prompts, calls Hugging Face-compatible chat models, and rejects hidden reasoning or instruction echoes.
+- `sovereign_bench/cases.py` contains the cached demo case packets.
+- `modal_app.py` hosts the GPU-backed streaming endpoint used by the Space.
+- `tests/` contains engine, case, and rendering regression tests.
+The Gradio app uses `MODAL_TRIAL_URL` when set; otherwise it uses the built-in deployed Modal endpoint. The Modal app owns the Hugging Face token through a Modal secret named `huggingface`; no real credentials are committed.
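
The endpoint fallback described in this paragraph can be sketched as follows. The default URL is the `DEFAULT_MODAL_TRIAL_URL` constant added to `app.py` in this PR; the helper name `resolve_trial_endpoint` and its `env` parameter are assumptions for illustration, since the updated body of `_remote_events` is not visible in this diff.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Default copied from DEFAULT_MODAL_TRIAL_URL in this PR's app.py.
DEFAULT_MODAL_TRIAL_URL = "https://ali-j-iqbal24--trial-stream.modal.run"


def resolve_trial_endpoint(env: Optional[Mapping] = None) -> str:
    """Prefer MODAL_TRIAL_URL when it is set and non-blank, else use the default."""
    env = os.environ if env is None else env
    configured = env.get("MODAL_TRIAL_URL", "").strip()
    return configured or DEFAULT_MODAL_TRIAL_URL


print(resolve_trial_endpoint({}))
print(resolve_trial_endpoint({"MODAL_TRIAL_URL": "https://your-modal-endpoint.example"}))
```

Passing the environment as a mapping keeps the helper testable without mutating `os.environ`.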
+## Run Locally
 ```powershell
 python -m pip install -r requirements.txt
 python app.py
 ```
+Open:
+```text
+http://127.0.0.1:7860
+```
+## Deploy Modal Backend
 ```powershell
 python -m modal deploy modal_app.py
 ```
+After deployment, pre-warm every configured courtroom model in the deployed `sovereign-bench` app so the first trial does not wait for all GPU containers to cold start. Run this after each deploy because deployments reset Modal autoscaler overrides:
+```powershell
+python -m modal run modal_app.py::warm_models
+```
+If the endpoint changes, set the Hugging Face Space variable:
+```text
+MODAL_TRIAL_URL=https://your-modal-endpoint.example
+```
+## Deploy Hugging Face Space
+Create or upload this repo as a Gradio Space inside the official Build Small org:
+```text
+build-small-hackathon/<your-space-name>
+```
+Space settings:
+- SDK: Gradio
+- App file: `app.py`
+- Python requirements: `requirements.txt`
+- Optional variable: `MODAL_TRIAL_URL`
+- No Space secret is required if using the hosted Modal endpoint.
+## Verification
+```powershell
+python -m pytest
+```
+Focused checks used during final prep:
 ```powershell
+python -m pytest tests/test_engine.py tests/test_ui_rendering.py
 ```
+## Limitations
+- Judge-GPT is not legal advice and should not be used for real legal decisions.
+- The demo packets are compact, staged evidence packets, not exhaustive source research.
+- Model, Modal, or retrieval failures stop the current trial instead of substituting fake dialogue.
+- Trial results are not persisted across sessions.
+- Custom trials require a short case context and evidence from both sides.
+## Final Submission Checklist
+- [ ] Push the repo to the Build Small Hugging Face org as a Gradio Space.
+- [ ] Confirm the Space launches and can complete `Trial of Socrates`.
+- [ ] Record a short demo video showing the trial flow and verdict.
+- [ ] Replace the `Demo video` TODO above with the final public URL.
+- [ ] Publish one social post about the app.
+- [ ] Replace the `Social post` TODO above with the final public URL.
+- [ ] Run the README through the Build Small validator.

app.py CHANGED

@@ -2,13 +2,18 @@ from __future__ import annotations
 import json
 import os
 from collections.abc import Iterable
 import gradio as gr
 import httpx
 from sovereign_bench.engine import JUDGE_NAME, JUROR_PERSONAS, stream_trial
-from sovereign_bench.models import TrialEvent, TrialRequest
 def _load_env_file() -> None:
@@ -28,10 +33,16 @@ _load_env_file()
 CASE_OPTIONS = {
     "Trial of Socrates": "socrates",
-    "The People v. Barnaby Buttons": "barnaby",
-    "Live Search Tribunal": "live",
 }
 PHASE_GLYPHS = {
     "pretrial": "00",
     "intake": "01",
@@ -44,6 +55,24 @@ PHASE_GLYPHS = {
     "appeal": "08",
 }
 AUDIO_PATHS = {
     "score": "/gradio_api/file=assets/audio/courtroom.ogg",
     "judgement": "/gradio_api/file=assets/audio/Judgement.ogg",
@@ -102,9 +131,9 @@ body,
 .docket-book-controls {
   position: fixed;
   left: 50%;
-  top: clamp(172px, 21vh, 212px);
   z-index: 9999;
-  width: min(620px, calc(100vw - 160px));
   max-width: none;
   margin: 0;
   padding: 0;
@@ -202,21 +231,6 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   line-height: 1.25;
 }
-.trial-options {
-  max-width: 1120px;
-  margin: 0 auto 14px;
-  border: 1px solid rgba(255, 226, 154, .18);
-  border-radius: 6px;
-  background: rgba(18, 9, 5, .78);
-  color: #f5dfb5;
-}
-.trial-options label,
-.trial-options span,
-.trial-options .prose {
-  color: #f5dfb5 !important;
-}
 .court-episode-stage {
   --spot-x: 50%;
   --spot-y: 36%;
@@ -250,6 +264,70 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   z-index: 4;
 }
 .episode-room {
   position: absolute;
   inset: 0;
@@ -388,9 +466,9 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 .episode-book {
   position: absolute;
   left: 50%;
-  top: 12%;
-  z-index: 12;
-  width: min(760px, calc(100% - 32px));
   aspect-ratio: 3 / 2;
   transform: translateX(-50%) rotateX(0) rotateZ(-1deg);
   transform-origin: center bottom;
@@ -400,6 +478,10 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   transition: top .85s ease, width .85s ease, transform .85s ease, filter .85s ease, opacity .85s ease;
 }
 .book-art {
   position: absolute;
   inset: 0;
@@ -416,8 +498,8 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 }
 .episode-book.closed {
-  top: 36%;
-  width: min(245px, 30vw);
   transform: translateX(-50%) rotateX(56deg) rotateZ(1deg);
   opacity: .92;
   filter: drop-shadow(0 18px 18px rgba(0, 0, 0, .45));
@@ -438,35 +520,91 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 .book-open-content {
   position: absolute;
-  inset: 17% 10% 13%;
   z-index: 2;
   display: grid;
   grid-template-columns: 1fr 1fr;
-  gap: 72px;
-  padding: 0 28px;
   transition: opacity .35s ease;
 }
 .book-open-content h2 {
-  margin: 0 0 10px;
   color: #4c2a12;
-  font-size: 30px;
   letter-spacing: 0;
 }
 .book-open-content p,
 .book-entry {
   color: #3c2615;
-  font-size: 15px;
-  line-height: 1.34;
 }
 .book-entry {
-  margin: 11px 0;
   padding-left: 12px;
   border-left: 3px solid rgba(111, 61, 23, .36);
 }
 .judge-dais {
   position: absolute;
   left: 50%;
@@ -536,11 +674,11 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 }
 .jury-benches.left {
-  left: 4.5%;
 }
 .jury-benches.right {
-  right: 4.5%;
 }
 .jury-benches.left .jury-row {
@@ -594,7 +732,7 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 }
 .foreground-fence {
-  bottom: -1.5%;
   width: 47%;
 }
@@ -610,9 +748,9 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 .judge-table-foreground {
   left: 50%;
-  top: 35%;
   z-index: 1;
-  width: 46%;
   transform: translateX(-50%);
 }
@@ -650,7 +788,7 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 .puppet.judge {
   left: 50%;
-  top: 31%;
   --skin: #c38a55;
   --robe: #1b1b20;
   --accent: #79242a;
@@ -660,7 +798,8 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 .puppet.clerk {
   left: 43%;
-  top: 41%;
   --skin: #b77b52;
   --robe: #365548;
   --accent: #2f6f5e;
@@ -668,7 +807,7 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 .puppet.auric {
   left: 24%;
-  top: 62%;
   --skin: #c9975d;
   --robe: #5b2719;
   --accent: #a45c25;
@@ -676,28 +815,20 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
 .speaker-auric .puppet.auric {
   left: 43%;
-  top: 66%;
 }
 .puppet.sable {
   left: 75%;
-  top: 62%;
   --skin: #a86d4a;
   --robe: #1d3045;
   --accent: #254f7a;
 }
 .speaker-sable .puppet.sable {
-  left: 57%;
-  top: 66%;
-}
-.puppet.auditor {
-  left: 71%;
-  top: 55%;
-  --skin: #c6a65b;
-  --robe: #4b3d1b;
-  --accent: #8d6b1f;
 }
 .puppet-portrait {
@@ -713,10 +844,6 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   pointer-events: none;
 }
-.phase-evidence .puppet.auditor {
-  animation: evidence-focus 1.35s ease-in-out infinite;
-}
 .puppet::before {
   content: "";
   position: absolute;
@@ -749,6 +876,11 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
     linear-gradient(180deg, var(--accent), var(--robe) 52%, #130a07);
 }
 .puppet .mouth {
   position: absolute;
   left: 50%;
@@ -761,42 +893,169 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   border-radius: 0 0 18px 18px;
 }
 .puppet.active .mouth,
 .puppet.walking .mouth {
   animation: speak-mouth .5s ease-in-out infinite;
 }
-.speech-bubble {
   position: absolute;
   left: 50%;
-  bottom: calc(100% + 12px);
-  z-index: 18;
-  width: 260px;
-  max-width: min(320px, calc(100vw - 32px));
-  transform: translateX(-50%);
-  padding: 10px 12px;
-  border: 1px solid rgba(255, 226, 154, .48);
-  border-radius: 6px;
-  background: rgba(255, 244, 215, .94);
-  color: #2d1b0d;
-  box-shadow: 0 14px 30px rgba(0, 0, 0, .34);
   font-size: 12px;
-  font-weight: 700;
-  line-height: 1.3;
   pointer-events: none;
 }
-.speech-bubble::after {
   content: "";
   position: absolute;
-  left: 50%;
-  bottom: -8px;
-  width: 14px;
-  height: 14px;
   transform: translateX(-50%) rotate(45deg);
-  border-right: 1px solid rgba(255, 226, 154, .48);
-  border-bottom: 1px solid rgba(255, 226, 154, .48);
-  background: rgba(255, 244, 215, .94);
 }
 .tooltip {
@@ -931,11 +1190,6 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   animation: juror-react .82s ease-in-out infinite alternate;
 }
-.juror .speech-bubble {
-  bottom: calc(100% + 6px);
-  width: 230px;
-}
 .juror-face {
   position: absolute;
   left: 50%;
@@ -1195,14 +1449,43 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   100% { transform: rotate(-18deg) translateY(0); }
 }
 @media (max-width: 820px) {
   .docket-book-controls {
     position: fixed;
-    top: 262px;
     width: calc(100vw - 52px);
     transform: translateX(-50%) rotate(-1deg);
   }
   .court-episode-stage {
     height: 1280px;
     min-height: 1280px;
@@ -1225,21 +1508,64 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
     max-width: calc(100% - 32px);
   }
   .episode-book {
-    top: 220px;
     width: min(680px, calc(100% - 20px));
   }
   .episode-book.closed {
-    top: 430px;
-    width: 210px;
   }
   .book-open-content {
     grid-template-columns: 1fr;
     gap: 10px;
-    inset: 17% 12% 14%;
-    padding: 0 18px;
   }
   .book-open-content h2 {
@@ -1257,6 +1583,25 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
     margin: 5px 0;
   }
   .judge-dais {
     top: 390px;
     width: 280px;
@@ -1278,32 +1623,27 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   .puppet.auric {
     left: 20%;
-    top: 650px;
   }
   .puppet.sable {
     left: 80%;
-    top: 650px;
   }
   .speaker-auric .puppet.auric {
     left: 42%;
-    top: 730px;
   }
   .speaker-sable .puppet.sable {
-    left: 58%;
-    top: 730px;
   }
   .puppet.clerk {
     left: 35%;
-    top: 560px;
-  }
-  .puppet.auditor {
-    left: 78%;
-    top: 540px;
   }
   .witness-area {
@@ -1319,15 +1659,15 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   }
   .jury-benches.left {
-    left: 5%;
   }
   .jury-benches.right {
-    right: 5%;
   }
   .foreground-fence {
-    bottom: -2px;
     width: 64%;
   }
@@ -1340,8 +1680,8 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
   }
   .judge-table-foreground {
-    top: 405px;
-    width: 760px;
   }
   .evidence-props {
@@ -1530,12 +1870,28 @@ APP_HEAD = f"""
 """
 START_JS = """
-(case_label, search_query, hypothetical, speed, mind_layer) => {
   document.body.classList.add('trial-has-started');
   if (window.SovereignCourtAudio) {
     window.SovereignCourtAudio.begin();
   }
-  return [case_label, search_query, hypothetical, speed, mind_layer];
 }
 """
@@ -1553,24 +1909,18 @@ CHARACTERS = {
         "role": "Court clerk",
         "model": "AgentCPM-Explore",
     },
-    "Advocate Auric": {
         "class": "auric",
-        "name": "Advocate Auric",
         "role": "Claimant advocate",
         "model": "gpt-oss-20b",
     },
-    "Counsel Sable": {
         "class": "sable",
-        "name": "Counsel Sable",
         "role": "Respondent advocate",
         "model": "gpt-oss-20b",
     },
-    "Auditor Prism": {
-        "class": "auditor",
-        "name": "Auditor Prism",
-        "role": "Evidence auditor",
-        "model": "Nemotron-Orchestrator-8B",
-    },
     "Nemotron Jury": {
         "class": "jury",
         "name": "Nemotron Jury",
@@ -1597,13 +1947,37 @@ JUROR_IMAGES = {
     "Jensen Huang": "/gradio_api/file=assets/characters/jensen-huang.png",
 }
 PHASE_AGENTS = {
     "pretrial": ["Clerk Meridian"],
 }
 def _remote_events(request: TrialRequest) -> Iterable[TrialEvent] | None:
-    endpoint = os.getenv("MODAL_TRIAL_URL", "").strip()
     if not endpoint:
         return None
@@ -1617,13 +1991,13 @@ def _remote_events(request: TrialRequest) -> Iterable[TrialEvent] | None:
     return iterator()
-def get_events(request: TrialRequest) -> Iterable[TrialEvent]:
     remote = _remote_events(request)
     if remote is not None:
         yield from remote
         return
-    delay = {"swift": 1.4, "measured": 2.4, "ceremonial": 3.4}[request.speed]
-    yield from stream_trial(request, delay=delay)
 def _escape(value: str) -> str:
@@ -1663,6 +2037,26 @@ def _active_speaker_for(event: TrialEvent | None) -> str:
     return event.turns[0].agent
 def _speaker_class_for(speaker: str) -> str:
     if not speaker:
         return ""
@@ -1680,6 +2074,61 @@ def _latest_turn_text(event: TrialEvent | None, agent: str) -> str:
     return _short_text(turn.content, 210)
 def _thread_id(name: str) -> str:
     return "ai-thread-" + "".join(ch.lower() if ch.isalnum() else "-" for ch in name).strip("-")
@@ -1767,17 +2216,51 @@ def _thread_modal(name: str, role: str, model: str, turns: list[dict[str, str]])
     )
 def _puppet(agent: str, active_agents: set[str], phase: str, events: list[TrialEvent], latest: TrialEvent | None) -> str:
     meta = CHARACTERS[agent]
     active = " active" if agent in active_agents else ""
-    walking = " walking" if agent in {"Advocate Auric", "Counsel Sable"} and agent in active_agents else ""
-    small = " small" if agent in {"Clerk Meridian", "Auditor Prism"} else ""
     turns = _thread_for_character(events, agent)
-    bubble = ""
-    if agent in active_agents:
-        speech = _latest_turn_text(latest, agent)
-        if speech:
-            bubble = f"<span class='speech-bubble'>{_escape(speech)}</span>"
     portrait = ""
     if meta.get("image"):
         portrait = (
@@ -1788,7 +2271,6 @@ def _puppet(agent: str, active_agents: set[str], phase: str, events: list[TrialE
         f"<a class='puppet {meta['class']}{active}{walking}{small}' href='#{_escape(_thread_id(agent))}' aria-label='Open {_escape(agent)} model thread'>"
         f"{portrait}"
         "<span class='mouth'></span>"
-        f"{bubble}"
         f"{_tooltip(meta['name'], meta['role'], meta['model'], turns)}"
         "</a>"
     )
@@ -1799,36 +2281,103 @@ def _juror(name: str, active: bool, events: list[TrialEvent] | None = None, late
     image = JUROR_IMAGES.get(name, "")
     active_cls = " active" if active else ""
     turns = _thread_for_character(events or [], name)
-    bubble = ""
-    if active:
-        vote = next((vote for vote in (latest.votes if latest else []) if vote.juror == name), None)
-        speech = _latest_turn_text(latest, name)
-        if vote:
-            speech = f"{vote.vote.replace('_', ' ').title()}. {vote.reason}"
-        if speech:
-            bubble = f"<span class='speech-bubble'>{_escape(_short_text(speech, 190))}</span>"
     portrait = (
         f"<img class='juror-portrait' src='{_escape(image)}' alt='{_escape(name)} bust' "
         "onerror=\"this.style.display='none'\">"
         if image
         else ""
     )
     return (
         f"<a class='juror{active_cls}' href='#{_escape(_thread_id(name))}' style='--face: {face}' aria-label='Open {_escape(name)} model thread'>"
         f"{portrait}"
-        "<span class='juror-face'></span><span class='juror-body'></span>"
-        f"{bubble}"
         f"{_tooltip(name, 'HF-style juror', 'Nemotron panel', turns)}"
         "</a>"
     )
-def _book(open_book: bool) -> str:
     closed = "" if open_book else " closed"
     return (
-        f"<div class='episode-book{closed}'>"
         "<img class='book-art open-art' src='/gradio_api/file=assets/book/docket-book-open.png' alt='Open docket book'>"
         "<img class='book-art closed-art' src='/gradio_api/file=assets/book/docket-book-closed.png' alt='Closed docket book'>"
         "</div>"
     )
@@ -1871,6 +2420,36 @@ def _foreground_props() -> str:
     )
 def _courtroom_juror_names(votes: list) -> list[str]:
     names = list(JUROR_FACES)
     names.extend(vote.juror for vote in votes if vote.juror not in names)
@@ -1887,12 +2466,20 @@ def _latest_votes(events: list[TrialEvent]) -> list:
     return ordered
-def render_court(events: list[TrialEvent], started: bool = False) -> str:
     latest = events[-1] if events else None
     phase = latest.phase if latest else "pretrial"
     title, subtitle = _latest_packet_title(events)
-    active_agents = _active_agents_for(latest)
-    active_speaker = _active_speaker_for(latest)
     speaker_cls = _speaker_class_for(active_speaker)
     caption_phase, caption_title, caption_body = _caption(latest, phase)
     latest_votes = _latest_votes(events)
@@ -1901,7 +2488,7 @@ def render_court(events: list[TrialEvent], started: bool = False) -> str:
     book_open = not started and not events
     puppets = "".join(
         _puppet(agent, active_agents, phase, events, latest)
-        for agent in [JUDGE_NAME, "Clerk Meridian", "Advocate Auric", "Counsel Sable", "Auditor Prism"]
     )
     left_jurors = "".join(_juror(name, name == active_speaker, events, latest) for name in juror_names[:3])
     right_jurors = "".join(_juror(name, name == active_speaker, events, latest) for name in juror_names[3:6])
@@ -1915,6 +2502,7 @@ def render_court(events: list[TrialEvent], started: bool = False) -> str:
     )
     return (
         f"<section id='court-stage' class='court-episode-stage phase-{_escape(phase)}{_escape(speaker_cls)}{started_cls}' data-phase='{_escape(phase)}'>"
         "<div class='episode-room'></div>"
         "<div class='audio-deck' aria-hidden='true'>"
         + "".join(f"<audio preload='auto' src='{_escape(src)}'></audio>" for src in AUDIO_PATHS.values())
@@ -1926,7 +2514,7 @@ def render_court(events: list[TrialEvent], started: bool = False) -> str:
         f"<h1>{_escape(title)}</h1>"
         f"<p>{_escape(subtitle)}</p></div>"
         f"<div class='decree-ribbon'>Step {len(events) if events else 0}: {caption_title}<br>Hover characters for agent and model details</div>"
-        f"{_book(book_open)}"
         f"<div class='judge-dais'><div class='prop-label'>{_escape(JUDGE_NAME)}</div><div class='bench-front'></div><span class='gavel'></span></div>"
         "<div class='counsel-table left'><div class='prop-label'>Claimant Table</div></div>"
         "<div class='counsel-table right'><div class='prop-label'>Respondent Table</div></div>"
@@ -1939,6 +2527,8 @@ def render_court(events: list[TrialEvent], started: bool = False) -> str:
         f"{puppets}"
         f"{evidence_props}"
         f"{_foreground_props()}"
         "<div class='gallery-benches'><div></div><div></div><div></div><div></div><div></div><div></div></div>"
         "<div class='trial-caption'>"
         f"<div class='caption-phase'>Live Trial Feed / {_escape(caption_phase)}</div>"
@@ -2009,33 +2599,137 @@ def render_mind(events: list[TrialEvent], enabled: bool) -> str:
     return f"<pre class='mind-text'>{_escape(json.dumps(compact, indent=2))}</pre>"
-def run_ui(case_label: str, search_query: str, hypothetical: str, speed: str, mind_layer: bool):
     request = TrialRequest(
-        case_id=CASE_OPTIONS.get(case_label, "socrates"),
         search_query=search_query or "",
         hypothetical=hypothetical or "",
         speed=speed or "swift",
         mind_layer=bool(mind_layer),
     )
     events: list[TrialEvent] = []
     yield (
-        render_court(events, started=True),
         render_evidence(events),
         render_jurors(events),
         render_mind(events, mind_layer),
-        "The docket closes and the bailiff calls the room to order.",
     )
     try:
-        for event in get_events(request):
             events.append(event)
-            status = f"Step {len(events)}: {event.title}"
             yield (
                 render_court(events, started=True),
                 render_evidence(events),
                 render_jurors(events),
                 render_mind(events, mind_layer),
-                status,
             )
     except Exception as exc:
         yield (
             render_court(events, started=True),
@@ -2046,7 +2740,7 @@ def run_ui(case_label: str, search_query: str, hypothetical: str, speed: str, mi
         )
         return
     yield (
-        render_court(events, started=True),
         render_evidence(events),
         render_jurors(events),
         render_mind(events, mind_layer),
@@ -2067,13 +2761,12 @@ def build_app() -> gr.Blocks:
                 )
                 start = gr.Button("Begin Trial", variant="primary", scale=1)
             status = gr.Markdown("Ready.", elem_classes=["book-status"])
-        courtroom = gr.HTML(render_court([]), label="Live courtroom")
         search = gr.State("")
         speed = gr.State("swift")
         mind = gr.State(True)
-        with gr.Accordion("Advanced trial options", open=False, elem_classes=["trial-options"]):
-            with gr.Row():
-                hypo = gr.Textbox(label="Hypothetical sidebar", lines=1)
         with gr.Row(elem_classes=["drawer-shell"]):
             with gr.Column(scale=1):
                 with gr.Tab("Evidence Drawer"):
@@ -2081,9 +2774,14 @@ def build_app() -> gr.Blocks:
                 with gr.Tab("Juror Panel"):
                     jurors = gr.HTML(render_jurors([]))
                 mind_html = gr.HTML(render_mind([], True), visible=False)
         start.click(
             run_ui,
-            inputs=[case, search, hypo, speed, mind],
             outputs=[courtroom, evidence, jurors, mind_html, status],
             js=START_JS,
         )

 import json
 import os
+import queue
+import threading
+import time
 from collections.abc import Iterable
+from dataclasses import dataclass
 import gradio as gr
 import httpx
 from sovereign_bench.engine import JUDGE_NAME, JUROR_PERSONAS, stream_trial
+from sovereign_bench.cases import CASES, get_case
+from sovereign_bench.models import CasePacket, EvidenceItem, TrialEvent, TrialRequest
 def _load_env_file() -> None:
 CASE_OPTIONS = {
     "Trial of Socrates": "socrates",
+    "Greg Heffley vs Mom": "greg",
+    "Custom": "custom",
 }
+DEFAULT_MODAL_TRIAL_URL = "https://ali-j-iqbal24--trial-stream.modal.run"
+MIN_READ_SECONDS = 2.2
+WORDS_PER_SECOND = 3.2
+READ_BUFFER_SECONDS = 0.8
+MAX_READ_SECONDS = 8.5
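
The four constants above read like the parameters of a reading-pace delay. The following is a minimal sketch of how they could combine, assuming a word-count estimate clamped between the minimum and maximum; the function `read_seconds` is hypothetical, and the PR's actual pacing code is not shown in this part of the diff.

```python
# Values copied from the constants added in this PR.
MIN_READ_SECONDS = 2.2
WORDS_PER_SECOND = 3.2
READ_BUFFER_SECONDS = 0.8
MAX_READ_SECONDS = 8.5


def read_seconds(text: str) -> float:
    """Estimate how long to hold a caption on screen (assumed formula)."""
    words = len(text.split())
    estimate = words / WORDS_PER_SECOND + READ_BUFFER_SECONDS
    # Clamp so short lines still linger and long lines do not stall the trial.
    return max(MIN_READ_SECONDS, min(MAX_READ_SECONDS, estimate))


print(read_seconds("Order in the court."))
print(read_seconds("word " * 100))
```

Under this assumed formula, any caption of four words or fewer is held for the 2.2-second floor, and anything past roughly 24 words is held for the 8.5-second ceiling.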
 PHASE_GLYPHS = {
     "pretrial": "00",
     "intake": "01",
     "appeal": "08",
 }
+TRIAL_PROGRESS_STAGES = (
+    ("pretrial", "Pretrial"),
+    ("intake", "Intake"),
+    ("claims", "Claims"),
+    ("opening", "Opening"),
+    ("evidence", "Evidence"),
+    ("questions", "Questions"),
+    ("deliberation", "Deliberation"),
+    ("verdict", "Verdict"),
+)
+VERDICT_LABELS = {
+    "liable": "Guilty",
+    "not_liable": "Not Guilty",
+    "mixed": "Mixed",
+    "uncertain": "Uncertain",
+}
 AUDIO_PATHS = {
     "score": "/gradio_api/file=assets/audio/courtroom.ogg",
     "judgement": "/gradio_api/file=assets/audio/Judgement.ogg",
 .docket-book-controls {
   position: fixed;
   left: 50%;
+  top: 72px;
   z-index: 9999;
+  width: min(760px, calc(100vw - 160px));
   max-width: none;
   margin: 0;
   padding: 0;
   line-height: 1.25;
 }
 .court-episode-stage {
   --spot-x: 50%;
   --spot-y: 36%;
   z-index: 4;
 }
+.trial-progress {
+  position: fixed;
+  top: 0;
+  left: 0;
+  right: 0;
+  z-index: 70;
+  display: grid;
+  grid-template-columns: repeat(8, minmax(0, 1fr));
+  gap: 1px;
+  padding: 3px clamp(10px, 2vw, 24px) 4px;
+  border-bottom: 1px solid rgba(217, 176, 96, .2);
+  background: rgba(23, 13, 8, .58);
+  backdrop-filter: blur(8px);
+  box-shadow: 0 8px 18px rgba(8, 4, 2, .22);
+  pointer-events: none;
+}
+.trial-progress-segment {
+  position: relative;
+  min-width: 0;
+  padding-top: 5px;
+  overflow: hidden;
+  color: rgba(244, 213, 143, .38);
+  font: 800 10px/1 ui-monospace, SFMono-Regular, Consolas, monospace;
+  letter-spacing: .04em;
+  text-align: center;
+  text-transform: uppercase;
+  white-space: nowrap;
+}
+.trial-progress-segment::before {
+  content: "";
+  position: absolute;
+  left: 3px;
+  right: 3px;
+  top: 0;
+  height: 2px;
+  border-radius: 999px;
+  background: rgba(217, 176, 96, .18);
+}
+.trial-progress-segment.complete {
+  color: rgba(217, 176, 96, .68);
+}
+.trial-progress-segment.complete::before {
+  background: rgba(217, 176, 96, .48);
+}
+.trial-progress-segment.current {
+  color: #ffe6a6;
+  text-shadow: 0 0 10px rgba(255, 211, 116, .52);
+}
+.trial-progress-segment.current::before {
+  height: 3px;
+  background: #ffd675;
+  box-shadow: 0 0 12px rgba(255, 214, 117, .68);
+}
+.trial-progress-abbrev {
+  display: none;
+}
 .episode-room {
   position: absolute;
   inset: 0;
 .episode-book {
   position: absolute;
   left: 50%;
+  top: 122px;
+  z-index: 14;
+  width: min(980px, calc(100% - 32px));
   aspect-ratio: 3 / 2;
   transform: translateX(-50%) rotateX(0) rotateZ(-1deg);
   transform-origin: center bottom;
   transition: top .85s ease, width .85s ease, transform .85s ease, filter .85s ease, opacity .85s ease;
 }
+.episode-book.custom-book {
+  pointer-events: auto;
+}
 .book-art {
   position: absolute;
   inset: 0;
 }
 .episode-book.closed {
+  top: 50%;
+  width: min(163px, 20vw);
   transform: translateX(-50%) rotateX(56deg) rotateZ(1deg);
   opacity: .92;
   filter: drop-shadow(0 18px 18px rgba(0, 0, 0, .45));
 .book-open-content {
   position: absolute;
+  inset: 15% 9% 12%;
   z-index: 2;
   display: grid;
   grid-template-columns: 1fr 1fr;
+  gap: 82px;
+  padding: 0 20px;
   transition: opacity .35s ease;
 }
 .book-open-content h2 {
+  margin: 0 0 8px;
   color: #4c2a12;
+  font-size: 28px;
   letter-spacing: 0;
 }
 .book-open-content p,
 .book-entry {
   color: #3c2615;
+  font-size: 14px;
+  line-height: 1.28;
 }
 .book-entry {
+  margin: 8px 0;
   padding-left: 12px;
   border-left: 3px solid rgba(111, 61, 23, .36);
 }
+.book-context {
+  margin-top: 8px;
+}
+.book-case-title {
+  margin: 0 0 6px;
+  color: #4c2a12;
+  font-weight: 800;
+}
+.book-evidence-columns {
+  display: grid;
+  grid-template-columns: 1fr 1fr;
+  gap: 12px;
+}
+.book-evidence-column h3 {
+  margin: 0 0 6px;
+  color: #4c2a12;
+  font-size: 15px;
+  line-height: 1.12;
+}
+.book-evidence-list {
+  margin: 0;
+  padding: 0;
+  list-style: none;
+}
+.book-evidence-list li {
+  margin: 0 0 6px;
+  padding-left: 9px;
+  border-left: 2px solid rgba(111, 61, 23, .32);
+  color: #3c2615;
+  font-size: 12px;
+  line-height: 1.2;
+}
+.book-field {
+  width: 100%;
+  min-height: 42px;
+  resize: none;
+  border: 1px solid rgba(90, 50, 20, .34);
+  border-radius: 4px;
+  background: rgba(255, 247, 224, .7);
+  color: #2b1b10;
+  font: 12px/1.22 Georgia, "Times New Roman", serif;
+  box-shadow: inset 0 1px 2px rgba(59, 29, 10, .16);
+  pointer-events: auto;
+}
+.book-context-field {
+  min-height: 138px;
+  font-size: 13px;
+}
 .judge-dais {
   position: absolute;
   left: 50%;
 }
 .jury-benches.left {
+  left: 1%;
 }
 .jury-benches.right {
+  right: 1%;
 }
 .jury-benches.left .jury-row {
 }
 .foreground-fence {
+  bottom: -6.5%;
   width: 47%;
 }
 .judge-table-foreground {
   left: 50%;
+  top: 20%;
   z-index: 1;
+  width: 39.1%;
   transform: translateX(-50%);
 }
 .puppet.judge {
   left: 50%;
+  top: calc(40% + 156px);
   --skin: #c38a55;
   --robe: #1b1b20;
   --accent: #79242a;
 .puppet.clerk {
   left: 43%;
+  top: 66%;
+  z-index: 14;
   --skin: #b77b52;
   --robe: #365548;
   --accent: #2f6f5e;
 .puppet.auric {
   left: 24%;
+  top: 87%;
   --skin: #c9975d;
   --robe: #5b2719;
   --accent: #a45c25;
 .speaker-auric .puppet.auric {
   left: 43%;
+  top: 87%;
 }
 .puppet.sable {
   left: 75%;
+  top: 87%;
   --skin: #a86d4a;
   --robe: #1d3045;
   --accent: #254f7a;
 }
 .speaker-sable .puppet.sable {
+  left: 75%;
+  top: 87%;
 }
 .puppet-portrait {
   pointer-events: none;
 }
 .puppet::before {
   content: "";
   position: absolute;
     linear-gradient(180deg, var(--accent), var(--robe) 52%, #130a07);
 }
+.puppet.judge::before,
+.puppet.judge::after {
+  display: none;
+}
 .puppet .mouth {
   position: absolute;
   left: 50%;
   border-radius: 0 0 18px 18px;
 }
+.puppet.judge .mouth {
+  display: none;
+}
 .puppet.active .mouth,
 .puppet.walking .mouth {
   animation: speak-mouth .5s ease-in-out infinite;
 }
+.speech-bubble.active-dialogue {
   position: absolute;
   left: 50%;
+  top: 43%;
+  bottom: auto;
+  z-index: 30;
+  width: min(500px, calc(100vw - 44px));
+  max-height: 34vh;
+  overflow: visible;
+  transform: translate(-50%, -100%);
+  padding: 10px 13px 11px;
+  border: 2px solid #141413;
+  border-radius: 20px;
+  background: rgba(255, 253, 247, .97);
+  color: #141413 !important;
+  box-shadow: 0 12px 24px rgba(0, 0, 0, .32);
   font-size: 12px;
+  font-weight: 650;
+  line-height: 1.32;
   pointer-events: none;
 }
+.speech-bubble.active-dialogue,
+.speech-bubble.active-dialogue * {
+  color: #141413 !important;
+}
+.speech-bubble.active-dialogue::before,
+.speech-bubble.active-dialogue::after {
   content: "";
   position: absolute;
+  left: var(--bubble-tail-x, 50%);
+  display: block;
   transform: translateX(-50%) rotate(45deg);
+}
+.speech-bubble.active-dialogue::before {
+  bottom: -13px;
+  width: 22px;
+  height: 22px;
+  background: #141413;
+  border-radius: 0 0 5px 0;
+}
+.speech-bubble.active-dialogue::after {
+  bottom: -9px;
+  width: 16px;
+  height: 16px;
+  transform: translateX(-50%) rotate(45deg);
+  background: rgba(255, 253, 247, .97);
+  border-radius: 0 0 3px 0;
+}
+.speech-bubble.active-dialogue.pending {
+  opacity: .82;
+}
+.dialogue-meta {
+  display: flex;
+  align-items: baseline;
+  gap: 6px;
+  margin-bottom: 5px;
+  font: 800 9px/1.2 ui-monospace, SFMono-Regular, Consolas, monospace;
+  text-transform: uppercase;
+}
+.dialogue-meta strong {
+  font-size: 10px;
+}
+.dialogue-text {
+  max-height: calc(34vh - 42px);
+  overflow: auto;
+  white-space: pre-wrap;
+}
+.speech-bubble.active-dialogue.speaker-clerk { left: 43%; top: 62%; }
+.speech-bubble.active-dialogue.speaker-judge { left: 50%; top: 43%; }
+.speech-bubble.active-dialogue.speaker-auric { left: 43%; top: 78%; }
+.speech-bubble.active-dialogue.speaker-sable { left: 75%; top: 78%; }
+.speech-bubble.active-dialogue.juror-dialogue {
+  left: 50%;
+  top: 42%;
+  width: min(340px, calc(50vw - 24px));
+}
+.speech-bubble.active-dialogue.speaker-karl-marx,
+.speech-bubble.active-dialogue.speaker-john-stuart-mill,
+.speech-bubble.active-dialogue.speaker-confucius {
+  left: 1.5%;
+  transform: translateY(-100%);
+}
+.speech-bubble.active-dialogue.speaker-cleopatra-vii,
+.speech-bubble.active-dialogue.speaker-niccolo-machiavelli,
+.speech-bubble.active-dialogue.speaker-jensen-huang {
+  right: 1.5%;
+  left: auto;
+  transform: translateY(-100%);
+}
+.speech-bubble.active-dialogue.speaker-karl-marx,
+.speech-bubble.active-dialogue.speaker-cleopatra-vii {
+  --bubble-tail-x: 19%;
+}
+.speech-bubble.active-dialogue.speaker-john-stuart-mill,
+.speech-bubble.active-dialogue.speaker-niccolo-machiavelli {
+  --bubble-tail-x: 50%;
+}
+.speech-bubble.active-dialogue.speaker-confucius,
+.speech-bubble.active-dialogue.speaker-jensen-huang {
+  --bubble-tail-x: 81%;
+}
+.verdict-popup {
+  position: absolute;
+  left: 50%;
+  top: 54%;
+  z-index: 42;
+  width: min(460px, calc(100vw - 44px));
+  transform: translate(-50%, -50%);
+  padding: 18px 22px 20px;
+  border: 2px solid rgba(255, 235, 178, .94);
+  border-radius: 8px;
+  background: rgba(20, 12, 7, .95);
+  color: #fff4d6;
+  text-align: center;
+  box-shadow: 0 28px 58px rgba(0, 0, 0, .5);
+  animation: verdict-pop .34s ease-out both;
+}
+.verdict-popup-kicker {
+  display: block;
+  margin-bottom: 7px;
+  color: #d9b060;
+  font: 800 11px/1 ui-monospace, SFMono-Regular, Consolas, monospace;
+  letter-spacing: 0;
+  text-transform: uppercase;
+}
+.verdict-popup-finding {
+  display: block;
+  color: #fff8e6;
+  font: 900 clamp(28px, 5vw, 48px)/1.02 Georgia, serif;
+}
+.verdict-popup-decree {
+  margin: 10px auto 0;
+  max-width: 38ch;
+  color: rgba(255, 244, 214, .86);
+  font-size: 13px;
+  line-height: 1.35;
 }
 .tooltip {
   animation: juror-react .82s ease-in-out infinite alternate;
 }
 .juror-face {
   position: absolute;
   left: 50%;
   100% { transform: rotate(-18deg) translateY(0); }
 }
+@keyframes verdict-pop {
+  0% {
+    opacity: 0;
+    transform: translate(-50%, -46%) scale(.94);
+  }
+  100% {
+    opacity: 1;
+    transform: translate(-50%, -50%) scale(1);
+  }
+}
 @media (max-width: 820px) {
   .docket-book-controls {
     position: fixed;
+    top: 130px;
     width: calc(100vw - 52px);
     transform: translateX(-50%) rotate(-1deg);
   }
+  .trial-progress {
+    grid-template-columns: repeat(8, minmax(24px, 1fr));
+    padding: 2px 5px 3px;
+  }
+  .trial-progress-segment {
+    font-size: 9px;
+    letter-spacing: 0;
+  }
+  .trial-progress-label {
+    display: none;
+  }
+  .trial-progress-abbrev {
+    display: inline;
+  }
   .court-episode-stage {
     height: 1280px;
     min-height: 1280px;
     max-width: calc(100% - 32px);
   }
+  .speech-bubble.active-dialogue,
+  .speech-bubble.active-dialogue.speaker-clerk,
+  .speech-bubble.active-dialogue.speaker-judge,
+  .speech-bubble.active-dialogue.speaker-auric,
+  .speech-bubble.active-dialogue.speaker-sable,
+  .speech-bubble.active-dialogue.juror-dialogue {
+    left: 50%;
+    top: 218px;
+    width: calc(100% - 28px);
+    max-height: 260px;
+    transform: translateX(-50%);
+  }
+  .speech-bubble.active-dialogue::after {
+    display: none;
+  }
+  .speech-bubble.active-dialogue.juror-dialogue,
+  .speech-bubble.active-dialogue.speaker-karl-marx,
+  .speech-bubble.active-dialogue.speaker-john-stuart-mill,
+  .speech-bubble.active-dialogue.speaker-confucius,
+  .speech-bubble.active-dialogue.speaker-cleopatra-vii,
+  .speech-bubble.active-dialogue.speaker-niccolo-machiavelli,
+  .speech-bubble.active-dialogue.speaker-jensen-huang {
+    top: 500px;
+    width: min(320px, calc(100vw - 28px));
+    transform: translateY(-100%);
+  }
+  .speech-bubble.active-dialogue.speaker-karl-marx,
+  .speech-bubble.active-dialogue.speaker-john-stuart-mill,
+  .speech-bubble.active-dialogue.speaker-confucius {
+    left: 14px;
+    right: auto;
+  }
+  .speech-bubble.active-dialogue.speaker-cleopatra-vii,
+  .speech-bubble.active-dialogue.speaker-niccolo-machiavelli,
+  .speech-bubble.active-dialogue.speaker-jensen-huang {
+    right: 14px;
+    left: auto;
+  }
   .episode-book {
+    top: 218px;
     width: min(680px, calc(100% - 20px));
   }
   .episode-book.closed {
+    top: 640px;
+    width: 140px;
   }
   .book-open-content {
     grid-template-columns: 1fr;
     gap: 10px;
+    inset: 15% 11% 13%;
+    padding: 0 16px;
   }
   .book-open-content h2 {
     margin: 5px 0;
   }
+  .book-evidence-columns {
+    grid-template-columns: 1fr 1fr;
+    gap: 8px;
+  }
+  .book-evidence-list li {
+    font-size: 10px;
+    line-height: 1.12;
+  }
+  .book-field {
+    min-height: 34px;
+    font-size: 10px;
+  }
+  .book-context-field {
+    min-height: 84px;
+  }
   .judge-dais {
     top: 390px;
     width: 280px;
   .puppet.auric {
     left: 20%;
+    top: 970px;
   }
   .puppet.sable {
     left: 80%;
+    top: 970px;
   }
   .speaker-auric .puppet.auric {
     left: 42%;
+    top: 970px;
   }
   .speaker-sable .puppet.sable {
+    left: 80%;
+    top: 970px;
   }
   .puppet.clerk {
     left: 35%;
+    top: 880px;
   }
   .witness-area {
   }
   .jury-benches.left {
+    left: .5%;
   }
   .jury-benches.right {
+    right: .5%;
   }
   .foreground-fence {
+    bottom: -66px;
     width: 64%;
   }
   }
   .judge-table-foreground {
+    top: 213px;
+    width: 646px;
   }
   .evidence-props {
 """
 START_JS = """
+(case_label, search_query, hypothetical, custom_payload, speed, mind_layer) => {
+  const book = document.querySelector('.episode-book.custom-book');
+  const collect = (selector) => Array.from(document.querySelectorAll(selector)).map((node) => node.value || '');
+  const payload = book ? JSON.stringify({
+    context: document.querySelector('.book-context-field')?.value || '',
+    claimant_evidence: collect('.book-claimant-field'),
+    respondent_evidence: collect('.book-respondent-field')
+  }) : (custom_payload || '');
+  if (book) {
+    const data = JSON.parse(payload);
+    const hasContext = data.context.trim().length > 0;
+    const hasClaimant = data.claimant_evidence.some((value) => value.trim().length > 0);
+    const hasRespondent = data.respondent_evidence.some((value) => value.trim().length > 0);
+    if (!hasContext || !hasClaimant || !hasRespondent) {
+      return [case_label, search_query, hypothetical, payload, speed, mind_layer];
+    }
+  }
   document.body.classList.add('trial-has-started');
   if (window.SovereignCourtAudio) {
     window.SovereignCourtAudio.begin();
   }
+  return [case_label, search_query, hypothetical, payload, speed, mind_layer];
 }
 """
         "role": "Court clerk",
         "model": "AgentCPM-Explore",
     },
+    "Mike OSS": {
         "class": "auric",
+        "name": "Mike OSS",
         "role": "Claimant advocate",
         "model": "gpt-oss-20b",
     },
+    "Harvey Vector": {
         "class": "sable",
+        "name": "Harvey Vector",
         "role": "Respondent advocate",
         "model": "gpt-oss-20b",
     },
     "Nemotron Jury": {
         "class": "jury",
         "name": "Nemotron Jury",
     "Jensen Huang": "/gradio_api/file=assets/characters/jensen-huang.png",
 }
+TRIAL_TURN_ORDER = (
+    "Clerk Meridian",
+    JUDGE_NAME,
+    "Mike OSS",
+    "Harvey Vector",
+    JUDGE_NAME,
+    "Mike OSS",
+    "Harvey Vector",
+    "Nemotron Jury",
+    *JUROR_PERSONAS.keys(),
+    JUDGE_NAME,
+)
 PHASE_AGENTS = {
     "pretrial": ["Clerk Meridian"],
 }
+@dataclass(frozen=True)
+class SpeakerCue:
+    name: str
+    role: str
+    text: str
+    pending: bool = False
+_EVENT_STREAM_DONE = object()
 def _remote_events(request: TrialRequest) -> Iterable[TrialEvent] | None:
+    endpoint = os.getenv("MODAL_TRIAL_URL", DEFAULT_MODAL_TRIAL_URL).strip()
     if not endpoint:
         return None
     return iterator()
+def get_events(request: TrialRequest, delay: float | None = None) -> Iterable[TrialEvent]:
     remote = _remote_events(request)
     if remote is not None:
         yield from remote
         return
+    stream_delay = {"swift": 1.4, "measured": 2.4, "ceremonial": 3.4}[request.speed] if delay is None else delay
+    yield from stream_trial(request, delay=stream_delay)
 def _escape(value: str) -> str:
     return event.turns[0].agent
+def _role_for_speaker(name: str, event: TrialEvent | None = None) -> str:
+    if event is not None:
+        turn = next((turn for turn in event.turns if turn.agent == name), None)
+        if turn is not None:
+            return turn.role
+    if name in CHARACTERS:
+        return CHARACTERS[name]["role"]
+    if name in JUROR_FACES:
+        return "juror"
+    return "speaker"
+def _expected_next_speaker(events: list[TrialEvent]) -> SpeakerCue | None:
+    if len(events) >= len(TRIAL_TURN_ORDER):
+        return None
+    name = TRIAL_TURN_ORDER[len(events)]
+    role = _role_for_speaker(name)
+    return SpeakerCue(name=name, role=role, text=f"{name} is preparing a response.", pending=True)
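The pending-speaker lookup above indexes a fixed turn order by the number of events already received. A minimal sketch of that idea follows. The judge name and the juror order are placeholders (the real values come from `JUDGE_NAME` and `JUROR_PERSONAS` in `sovereign_bench.engine`, which this diff does not show), and `expected_next` is a hypothetical stand-in for `_expected_next_speaker`.

```python
from __future__ import annotations

JUDGE = "Judge"  # placeholder; the app uses JUDGE_NAME from sovereign_bench.engine
# Assumed juror order; the app unpacks JUROR_PERSONAS.keys() instead.
JURORS = (
    "Karl Marx",
    "John Stuart Mill",
    "Confucius",
    "Cleopatra VII",
    "Niccolo Machiavelli",
    "Jensen Huang",
)
TURN_ORDER = (
    "Clerk Meridian",
    JUDGE,
    "Mike OSS",
    "Harvey Vector",
    JUDGE,
    "Mike OSS",
    "Harvey Vector",
    "Nemotron Jury",
    *JURORS,
    JUDGE,
)


def expected_next(events_seen: int) -> str | None:
    """Return who should speak next, or None once the order is exhausted."""
    if events_seen >= len(TURN_ORDER):
        return None
    return TURN_ORDER[events_seen]
```

Because the index is simply the event count, the cue only matches the real speaker if the stream emits exactly one event per turn in this order. A stream that skips or repeats a turn would show a pending bubble for the wrong character until the next event arrives.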
 def _speaker_class_for(speaker: str) -> str:
     if not speaker:
         return ""
     return _short_text(turn.content, 210)
+def _active_speaker_cue(event: TrialEvent | None, pending_speaker: SpeakerCue | None = None) -> SpeakerCue | None:
+    if pending_speaker is not None:
+        return pending_speaker
+    if event is None or not event.turns:
+        return None
+    turn = event.turns[0]
+    text = turn.content.strip()
+    if not text:
+        return None
+    return SpeakerCue(name=turn.agent, role=turn.role, text=text)
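+def _reading_duration(text: str) -> float:
+    # Seconds to keep a turn on screen: words / WORDS_PER_SECOND plus a buffer, clamped to [MIN_READ_SECONDS, MAX_READ_SECONDS].
+    word_count = len(text.split())
+    return min(MAX_READ_SECONDS, max(MIN_READ_SECONDS, (word_count / WORDS_PER_SECOND) + READ_BUFFER_SECONDS))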
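To see where the clamp in `_reading_duration` takes effect, the sketch below repeats the formula with the constants from this diff and works out the two break points. The function name `reading_duration` is only for illustration.

```python
MIN_READ_SECONDS = 2.2
WORDS_PER_SECOND = 3.2
READ_BUFFER_SECONDS = 0.8
MAX_READ_SECONDS = 8.5


def reading_duration(text: str) -> float:
    """Estimated on-screen time for a line of dialogue, in seconds."""
    word_count = len(text.split())
    raw = (word_count / WORDS_PER_SECOND) + READ_BUFFER_SECONDS
    return min(MAX_READ_SECONDS, max(MIN_READ_SECONDS, raw))


# Floor applies while words / 3.2 + 0.8 <= 2.2, i.e. up to 4 words.
# Ceiling applies once words / 3.2 + 0.8 >= 8.5, i.e. from 25 words on.
short = reading_duration("Order in the court")   # 4 words, clamped up to the floor
medium = reading_duration(" ".join(["word"] * 16))  # 16 / 3.2 + 0.8, inside the range
long = reading_duration(" ".join(["word"] * 40))    # clamped down to the ceiling
```

With these constants, anything longer than roughly 25 words gets the same 8.5 seconds, so very long turns are shown for less time per word than short ones.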
+def _event_dialogue_text(event: TrialEvent) -> str:
+    if event.turns:
+        return event.turns[0].content
+    return event.body
+def _event_status(event: TrialEvent, step: int) -> str:
+    if event.turns:
+        return f"Step {step}: {event.turns[0].agent} - {event.title}"
+    return f"Step {step}: {event.title}"
+def _pending_status(cue: SpeakerCue | None) -> str:
+    if cue is None:
+        return "The court is preparing the next turn."
+    return f"{cue.name} is preparing their response."
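+def _start_event_producer(request: TrialRequest) -> queue.Queue[object]:
+    # Run the trial stream on a daemon thread; the queue carries events, then any exception, then the _EVENT_STREAM_DONE sentinel.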
+    events: queue.Queue[object] = queue.Queue()
+    def produce() -> None:
+        try:
+            try:
+                stream = get_events(request, delay=0.0)
+            except TypeError:
+                stream = get_events(request)
+            for event in stream:
+                events.put(event)
+        except Exception as exc:
+            events.put(exc)
+        finally:
+            events.put(_EVENT_STREAM_DONE)
+    threading.Thread(target=produce, name="trial-event-producer", daemon=True).start()
+    return events
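The producer above follows a common pattern: a background thread pushes items onto a queue, forwards any exception as a queue item, and always finishes with a sentinel. The sketch below reproduces that pattern with plain strings. The `drain` consumer is an assumption about how such a queue is typically read, since the consuming loop in `run_ui` is cut off in this view.

```python
import queue
import threading
from collections.abc import Callable, Iterable

_DONE = object()  # sentinel, playing the role of _EVENT_STREAM_DONE


def start_producer(make_stream: Callable[[], Iterable[str]]) -> "queue.Queue[object]":
    """Run make_stream() on a daemon thread and feed its items into a queue."""
    out: "queue.Queue[object]" = queue.Queue()

    def produce() -> None:
        try:
            for item in make_stream():
                out.put(item)
        except Exception as exc:
            out.put(exc)  # hand the failure to the consumer thread
        finally:
            out.put(_DONE)  # always signal the end, even after an error

    threading.Thread(target=produce, daemon=True).start()
    return out


def drain(items: "queue.Queue[object]") -> list[str]:
    """Hypothetical consumer: collect items until the sentinel, re-raising errors."""
    collected: list[str] = []
    while True:
        item = items.get()
        if item is _DONE:
            return collected
        if isinstance(item, Exception):
            raise item
        collected.append(str(item))
```

Decoupling production from display lets the UI hold each turn on screen for its reading time while the next turn is already being generated. Checking for the exception before the sentinel matters because the `finally` clause enqueues the sentinel after the error.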
 def _thread_id(name: str) -> str:
     return "ai-thread-" + "".join(ch.lower() if ch.isalnum() else "-" for ch in name).strip("-")
     )
+def _active_dialogue(cue: SpeakerCue | None) -> str:
+    if cue is None:
+        return ""
+    speaker_cls = _speaker_class_for(cue.name).strip()
+    classes = ["speech-bubble", "active-dialogue"]
+    if speaker_cls:
+        classes.append(speaker_cls)
+    if cue.name in JUROR_FACES:
+        classes.append("juror-dialogue")
+    if cue.pending:
+        classes.append("pending")
+    pending_attr = " data-pending='true'" if cue.pending else ""
+    return (
+        f"<div class='{' '.join(classes)}' data-speaker='{_escape(cue.name)}'{pending_attr}>"
+        "<div class='dialogue-meta'>"
+        f"<strong>{_escape(cue.name)}</strong>"
+        f"<span>{_escape(cue.role)}</span>"
+        "</div>"
+        f"<div class='dialogue-text'>{_escape(cue.text)}</div>"
+        "</div>"
+    )
+def _verdict_popup(events: list[TrialEvent], show: bool) -> str:
+    if not show:
+        return ""
+    verdict = next((event.verdict for event in reversed(events) if event.verdict is not None), None)
+    if verdict is None:
+        return ""
+    finding = VERDICT_LABELS.get(verdict.finding, verdict.finding.replace("_", " ").title())
+    return (
+        f"<div class='verdict-popup' role='alert' aria-live='assertive' data-finding='{_escape(verdict.finding)}'>"
+        "<span class='verdict-popup-kicker'>Verdict</span>"
+        f"<strong class='verdict-popup-finding'>Verdict: {_escape(finding)}</strong>"
+        f"<p class='verdict-popup-decree'>{_escape(verdict.decree)}</p>"
+        "</div>"
+    )
 def _puppet(agent: str, active_agents: set[str], phase: str, events: list[TrialEvent], latest: TrialEvent | None) -> str:
     meta = CHARACTERS[agent]
     active = " active" if agent in active_agents else ""
+    walking = " walking" if agent in {"Mike OSS", "Harvey Vector"} and agent in active_agents else ""
+    small = " small" if agent == "Clerk Meridian" else ""
     turns = _thread_for_character(events, agent)
     portrait = ""
     if meta.get("image"):
         portrait = (
         f"<a class='puppet {meta['class']}{active}{walking}{small}' href='#{_escape(_thread_id(agent))}' aria-label='Open {_escape(agent)} model thread'>"
         f"{portrait}"
         "<span class='mouth'></span>"
         f"{_tooltip(meta['name'], meta['role'], meta['model'], turns)}"
         "</a>"
     )
     image = JUROR_IMAGES.get(name, "")
     active_cls = " active" if active else ""
     turns = _thread_for_character(events or [], name)
     portrait = (
         f"<img class='juror-portrait' src='{_escape(image)}' alt='{_escape(name)} bust' "
         "onerror=\"this.style.display='none'\">"
         if image
         else ""
     )
+    fallback_art = "" if image else "<span class='juror-face'></span><span class='juror-body'></span>"
     return (
         f"<a class='juror{active_cls}' href='#{_escape(_thread_id(name))}' style='--face: {face}' aria-label='Open {_escape(name)} model thread'>"
         f"{portrait}"
+        f"{fallback_art}"
         f"{_tooltip(name, 'HF-style juror', 'Nemotron panel', turns)}"
         "</a>"
     )
+def _packet_for_label(case_label: str) -> CasePacket:
+    return get_case(CASE_OPTIONS.get(case_label, "socrates"))
+def _split_evidence(packet: CasePacket) -> tuple[list[EvidenceItem], list[EvidenceItem]]:
+    claimant = [item for item in packet.evidence if item.supports == "claimant"]
+    respondent = [item for item in packet.evidence if item.supports == "respondent"]
+    if len(claimant) < 3:
+        claimant.extend(item for item in packet.evidence if item.supports in {"mixed", "context"} and item not in claimant)
+    if len(respondent) < 3:
+        respondent.extend(item for item in packet.evidence if item.supports in {"mixed", "context"} and item not in respondent)
+    return claimant[:3], respondent[:3]
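`_split_evidence` fills each column with side-specific items first and tops it up with `mixed` or `context` items only when fewer than three are available. A small sketch with a stand-in `Item` dataclass (hypothetical; the app uses `EvidenceItem` from `sovereign_bench.models`) shows the resulting behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Stand-in for EvidenceItem with only the fields this sketch needs."""
    title: str
    supports: str


def split_evidence(items: list[Item], limit: int = 3) -> tuple[list[Item], list[Item]]:
    claimant = [item for item in items if item.supports == "claimant"]
    respondent = [item for item in items if item.supports == "respondent"]
    shared = [item for item in items if item.supports in {"mixed", "context"}]
    # Top up a short column with shared items, skipping any already present.
    if len(claimant) < limit:
        claimant.extend(item for item in shared if item not in claimant)
    if len(respondent) < limit:
        respondent.extend(item for item in shared if item not in respondent)
    return claimant[:limit], respondent[:limit]


packet = [
    Item("A", "claimant"),
    Item("B", "respondent"),
    Item("C", "mixed"),
    Item("D", "context"),
    Item("E", "claimant"),
]
left, right = split_evidence(packet)
```

Here the claimant column is A, E, C and the respondent column is B, C, D, so a shared item such as C can appear on both pages of the book.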
+def _book_evidence_column(title: str, items: list[EvidenceItem]) -> str:
+    entries = "".join(
+        "<li>"
+        f"<strong>{_escape(item.title)}</strong><br>"
+        f"{_escape(item.note)}"
+        "</li>"
+        for item in items
+    )
+    return (
+        "<section class='book-evidence-column'>"
+        f"<h3>{_escape(title)}</h3>"
+        f"<ul class='book-evidence-list'>{entries}</ul>"
+        "</section>"
+    )
+def _custom_evidence_fields(class_name: str, label: str) -> str:
+    fields = "".join(
+        f"<textarea class='book-field {class_name}' aria-label='{_escape(label)} {index}' "
+        f"placeholder='{_escape(label)} {index}'></textarea>"
+        for index in range(1, 4)
+    )
+    return f"<section class='book-evidence-column'><h3>{_escape(label)}</h3>{fields}</section>"
+def _book(open_book: bool, packet: CasePacket | None = None, custom_mode: bool = False) -> str:
     closed = "" if open_book else " closed"
+    custom_class = " custom-book" if custom_mode and open_book else ""
+    hidden_attr = "" if custom_mode and open_book else " aria-hidden='true'"
+    packet = packet or get_case("socrates")
+    if custom_mode and open_book:
+        left_page = (
+            "<section><h2>Trial details</h2>"
+            "<textarea class='book-field book-context-field' aria-label='Custom trial details' "
+            "placeholder='Write a short paragraph describing what happened and why the court is hearing it.'></textarea>"
+            "</section>"
+        )
+        right_page = (
+            "<section><h2>Evidence</h2><div class='book-evidence-columns'>"
+            f"{_custom_evidence_fields('book-claimant-field', 'Evidence for Claimant')}"
+            f"{_custom_evidence_fields('book-respondent-field', 'Evidence against Claimant')}"
+            "</div></section>"
+        )
+    else:
+        claimant_evidence, respondent_evidence = _split_evidence(packet)
+        left_page = (
+            "<section><h2>Trial details</h2>"
+            f"<p class='book-case-title'>{_escape(packet.title)}</p>"
+            f"<p class='book-context'>{_escape(packet.context or packet.setting)}</p>"
+            f"<div class='book-entry'><strong>{_escape(packet.claimant)}</strong><br>{_escape(packet.claimant_claim)}</div>"
+            f"<div class='book-entry'><strong>{_escape(packet.respondent)}</strong><br>{_escape(packet.respondent_claim)}</div>"
+            "</section>"
+        )
+        right_page = (
+            "<section><h2>Evidence</h2><div class='book-evidence-columns'>"
+            f"{_book_evidence_column(f'Evidence for {packet.claimant}', claimant_evidence)}"
+            f"{_book_evidence_column(f'Evidence for {packet.respondent}', respondent_evidence)}"
+            "</div></section>"
+        )
     return (
+        f"<div class='episode-book{closed}{custom_class}'>"
         "<img class='book-art open-art' src='/gradio_api/file=assets/book/docket-book-open.png' alt='Open docket book'>"
         "<img class='book-art closed-art' src='/gradio_api/file=assets/book/docket-book-closed.png' alt='Closed docket book'>"
+        f"<div class='book-open-content'{hidden_attr}>"
+        f"{left_page}"
+        f"{right_page}"
+        "</div>"
         "</div>"
     )
     )
+def _trial_progress(events: list[TrialEvent]) -> str:
+    latest = events[-1] if events else None
+    current_phase = latest.phase if latest else "pretrial"
+    stage_keys = [key for key, _label in TRIAL_PROGRESS_STAGES]
+    current_index = stage_keys.index(current_phase) if current_phase in stage_keys else None
+    segments = []
+    for index, (key, label) in enumerate(TRIAL_PROGRESS_STAGES):
+        classes = ["trial-progress-segment"]
+        attrs = [f"data-phase='{_escape(key)}'"]
+        if current_index is not None and index < current_index:
+            classes.append("complete")
+        if current_index == index:
+            classes.append("current")
+            attrs.append("aria-current='step'")
+            if key == "verdict":
+                classes.append("complete")
+        abbrev = label[:3]
+        segments.append(
+            f"<span class='{' '.join(classes)}' {' '.join(attrs)}>"
+            f"<span class='trial-progress-label'>{_escape(label)}</span>"
+            f"<span class='trial-progress-abbrev' aria-hidden='true'>{_escape(abbrev)}</span>"
+            "</span>"
+        )
+    return (
+        "<nav class='trial-progress' aria-label='Trial progress'>"
+        + "".join(segments)
+        + "</nav>"
+    )
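The class assignment in `_trial_progress` can be checked in isolation. The sketch below returns only the state classes per segment for a given phase, using the stage keys from `TRIAL_PROGRESS_STAGES`; `segment_classes` is a hypothetical helper, not part of the diff.

```python
from __future__ import annotations

STAGES = (
    "pretrial",
    "intake",
    "claims",
    "opening",
    "evidence",
    "questions",
    "deliberation",
    "verdict",
)


def segment_classes(current_phase: str) -> list[list[str]]:
    """State classes for each progress segment, in stage order."""
    current = STAGES.index(current_phase) if current_phase in STAGES else None
    result: list[list[str]] = []
    for index, key in enumerate(STAGES):
        classes: list[str] = []
        if current is not None and index < current:
            classes.append("complete")
        if current == index:
            classes.append("current")
            if key == "verdict":
                classes.append("complete")  # the final stage is both current and done
        result.append(classes)
    return result
```

One consequence worth noting: a phase that is not in the stage list, such as the `appeal` key that appears in `PHASE_GLYPHS`, leaves every segment unmarked, so the bar would look reset if such a phase were ever emitted after the verdict.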
 def _courtroom_juror_names(votes: list) -> list[str]:
     names = list(JUROR_FACES)
     names.extend(vote.juror for vote in votes if vote.juror not in names)
     return ordered
+def render_court(
+    events: list[TrialEvent],
+    started: bool = False,
+    pending_speaker: SpeakerCue | None = None,
+    show_verdict_popup: bool = False,
+    pretrial_case: CasePacket | None = None,
+    custom_mode: bool = False,
+) -> str:
     latest = events[-1] if events else None
     phase = latest.phase if latest else "pretrial"
     title, subtitle = _latest_packet_title(events)
+    active_cue = _active_speaker_cue(latest, pending_speaker)
+    active_speaker = active_cue.name if active_cue is not None else _active_speaker_for(latest)
+    active_agents = {active_speaker} if active_speaker else _active_agents_for(latest)
     speaker_cls = _speaker_class_for(active_speaker)
     caption_phase, caption_title, caption_body = _caption(latest, phase)
     latest_votes = _latest_votes(events)
     book_open = not started and not events
     puppets = "".join(
         _puppet(agent, active_agents, phase, events, latest)
+        for agent in [JUDGE_NAME, "Clerk Meridian", "Mike OSS", "Harvey Vector"]
     )
     left_jurors = "".join(_juror(name, name == active_speaker, events, latest) for name in juror_names[:3])
     right_jurors = "".join(_juror(name, name == active_speaker, events, latest) for name in juror_names[3:6])
     )
     return (
         f"<section id='court-stage' class='court-episode-stage phase-{_escape(phase)}{_escape(speaker_cls)}{started_cls}' data-phase='{_escape(phase)}'>"
+        f"{_trial_progress(events)}"
         "<div class='episode-room'></div>"
         "<div class='audio-deck' aria-hidden='true'>"
         + "".join(f"<audio preload='auto' src='{_escape(src)}'></audio>" for src in AUDIO_PATHS.values())
         f"<h1>{_escape(title)}</h1>"
         f"<p>{_escape(subtitle)}</p></div>"
         f"<div class='decree-ribbon'>Step {len(events) if events else 0}: {caption_title}<br>Hover characters for agent and model details</div>"
+        f"{_book(book_open, pretrial_case, custom_mode)}"
         f"<div class='judge-dais'><div class='prop-label'>{_escape(JUDGE_NAME)}</div><div class='bench-front'></div><span class='gavel'></span></div>"
         "<div class='counsel-table left'><div class='prop-label'>Claimant Table</div></div>"
         "<div class='counsel-table right'><div class='prop-label'>Respondent Table</div></div>"
         f"{puppets}"
         f"{evidence_props}"
         f"{_foreground_props()}"
+        f"{_active_dialogue(active_cue)}"
+        f"{_verdict_popup(events, show_verdict_popup)}"
         "<div class='gallery-benches'><div></div><div></div><div></div><div></div><div></div><div></div></div>"
         "<div class='trial-caption'>"
         f"<div class='caption-phase'>Live Trial Feed / {_escape(caption_phase)}</div>"
     return f"<pre class='mind-text'>{_escape(json.dumps(compact, indent=2))}</pre>"
+def _clean_custom_items(values: list[str]) -> list[str]:
+    # Collapse runs of whitespace, then drop entries that end up empty.
+    normalized = (" ".join(value.split()) for value in values)
+    return [item for item in normalized if item]
+def _custom_case_from_payload(payload: str) -> CasePacket:
+    try:
+        data = json.loads(payload or "{}")
+    except json.JSONDecodeError as exc:
+        raise ValueError("Custom case details could not be read from the docket book.") from exc
+    context = " ".join(str(data.get("context", "")).split())
+    claimant_items = _clean_custom_items([str(value) for value in data.get("claimant_evidence", [])])
+    respondent_items = _clean_custom_items([str(value) for value in data.get("respondent_evidence", [])])
+    if not context:
+        raise ValueError("Custom requires a trial details paragraph.")
+    if not claimant_items or not respondent_items:
+        raise ValueError("Custom requires at least one evidence item for each side.")
+    evidence = [
+        EvidenceItem(
+            id=f"CUS-F{index}",
+            title=f"Claimant Evidence {index}",
+            source="Custom docket entry",
+            excerpt=item,
+            supports="claimant",
+            reliability=0.65,
+            note=item,
+        )
+        for index, item in enumerate(claimant_items[:3], start=1)
+    ]
+    evidence.extend(
+        EvidenceItem(
+            id=f"CUS-A{index}",
+            title=f"Respondent Evidence {index}",
+            source="Custom docket entry",
+            excerpt=item,
+            supports="respondent",
+            reliability=0.65,
+            note=item,
+        )
+        for index, item in enumerate(respondent_items[:3], start=1)
+    )
+    return CasePacket(
+        id="custom",
+        title="Custom Trial",
+        subtitle="A custom docket assembled in the opening book.",
+        claimant="Claimant",
+        respondent="Respondent",
+        charge="Whether the custom record supports the claimant or the respondent.",
+        setting="A custom courtroom packet entered by the user.",
+        context=context,
+        claimant_claim="The claimant says the custom context and supporting evidence justify a favorable finding.",
+        respondent_claim="The respondent says the custom context is incomplete, overread, or answered by contrary evidence.",
+        source_note="Custom user-entered case packet from the docket book.",
+        evidence=evidence,
+    )
+def render_case_preview(case_label: str) -> str:
+    case_id = CASE_OPTIONS.get(case_label, "socrates")
+    return render_court(
+        [],
+        pretrial_case=get_case(case_id) if case_id != "custom" else None,
+        custom_mode=case_id == "custom",
+    )
+def run_ui(
+    case_label: str,
+    search_query: str,
+    hypothetical: str,
+    custom_payload: str,
+    speed: str,
+    mind_layer: bool,
+):
+    case_id = CASE_OPTIONS.get(case_label, "socrates")
+    try:
+        custom_case = _custom_case_from_payload(custom_payload) if case_id == "custom" else None
+    except ValueError as exc:
+        yield (
+            render_court([], pretrial_case=None, custom_mode=True),
+            render_evidence([]),
+            render_jurors([]),
+            render_mind([], mind_layer),
+            str(exc),
+        )
+        return
     request = TrialRequest(
+        case_id=case_id,
         search_query=search_query or "",
         hypothetical=hypothetical or "",
+        custom_case=custom_case,
         speed=speed or "swift",
         mind_layer=bool(mind_layer),
     )
     events: list[TrialEvent] = []
+    produced_events = _start_event_producer(request)
+    pending_speaker = _expected_next_speaker(events)
     yield (
+        render_court(events, started=True, pending_speaker=pending_speaker),
         render_evidence(events),
         render_jurors(events),
         render_mind(events, mind_layer),
+        _pending_status(pending_speaker),
     )
     try:
+        while True:
+            item = produced_events.get()
+            if item is _EVENT_STREAM_DONE:
+                break
+            if isinstance(item, Exception):
+                raise item
+            event = item
             events.append(event)
             yield (
                 render_court(events, started=True),
                 render_evidence(events),
                 render_jurors(events),
                 render_mind(events, mind_layer),
+                _event_status(event, len(events)),
             )
+            duration = _reading_duration(_event_dialogue_text(event))
+            if duration > 0:
+                time.sleep(duration)
+            pending_speaker = _expected_next_speaker(events)
+            if pending_speaker is not None and produced_events.empty():
+                yield (
+                    render_court(events, started=True, pending_speaker=pending_speaker),
+                    render_evidence(events),
+                    render_jurors(events),
+                    render_mind(events, mind_layer),
+                    _pending_status(pending_speaker),
+                )
     except Exception as exc:
         yield (
             render_court(events, started=True),
         )
         return
     yield (
+        render_court(events, started=True, show_verdict_popup=True),
         render_evidence(events),
         render_jurors(events),
         render_mind(events, mind_layer),
                 )
                 start = gr.Button("Begin Trial", variant="primary", scale=1)
             status = gr.Markdown("Ready.", elem_classes=["book-status"])
+        courtroom = gr.HTML(render_case_preview("Trial of Socrates"), label="Live courtroom")
         search = gr.State("")
+        hypo = gr.State("")
+        custom_payload = gr.State("")
         speed = gr.State("swift")
         mind = gr.State(True)
         with gr.Row(elem_classes=["drawer-shell"]):
             with gr.Column(scale=1):
                 with gr.Tab("Evidence Drawer"):
                 with gr.Tab("Juror Panel"):
                     jurors = gr.HTML(render_jurors([]))
                 mind_html = gr.HTML(render_mind([], True), visible=False)
+        case.change(
+            render_case_preview,
+            inputs=[case],
+            outputs=[courtroom],
+        )
         start.click(
             run_ui,
+            inputs=[case, search, hypo, custom_payload, speed, mind],
             outputs=[courtroom, evidence, jurors, mind_html, status],
             js=START_JS,
         )
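
Reviewer note on the `run_ui` loop above: it reads from `produced_events`, stops at `_EVENT_STREAM_DONE`, and re-raises any `Exception` it finds in the queue. The definitions of `_start_event_producer` and `_EVENT_STREAM_DONE` are not visible in this part of the diff, so the following is only a minimal sketch of one way such a producer could work, assuming a background thread and a `queue.Queue`. The names `start_event_producer`, `make_events`, and `drain` are hypothetical.

```python
import queue
import threading
from collections.abc import Callable, Iterable, Iterator

# Hypothetical stand-in for the sentinel that run_ui compares against.
_EVENT_STREAM_DONE = object()


def start_event_producer(make_events: Callable[[], Iterable[object]]) -> queue.Queue:
    """Run make_events() on a daemon thread and push each item onto a queue."""
    out: queue.Queue = queue.Queue()

    def worker() -> None:
        try:
            for event in make_events():
                out.put(event)
        except Exception as exc:
            # Forward the failure so the consumer can re-raise it.
            out.put(exc)
        finally:
            # Always signal completion, even after a failure.
            out.put(_EVENT_STREAM_DONE)

    threading.Thread(target=worker, daemon=True).start()
    return out


def drain(produced: queue.Queue) -> Iterator[object]:
    """Consume the queue with the same checks the loop in run_ui uses."""
    while True:
        item = produced.get()
        if item is _EVENT_STREAM_DONE:
            break
        if isinstance(item, Exception):
            raise item
        yield item
```

Under this assumption the UI generator can sleep for the reading duration of one event while the next model call is already running, which would explain the `produced_events.empty()` check before the pending-speaker frame is shown.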

modal_app.py CHANGED

@@ -3,7 +3,7 @@ import time
 import modal
-from sovereign_bench.engine import stream_trial_jsonl
 from sovereign_bench.llm import (
     ModelCall,
     ModelResult,
@@ -12,10 +12,12 @@ from sovereign_bench.llm import (
 )
 from sovereign_bench.models import TrialRequest
-app = modal.App("sovereign-bench")
 GPU_NAME = "H100"
 GPU_TIMEOUT_SECONDS = 20 * 60
 HF_CACHE_DIR = "/root/.cache/huggingface"
 image = (
     modal.Image.debian_slim(python_version="3.12")
@@ -89,7 +91,8 @@ class VllmModel:
                 "role": "user",
                 "content": (
                     "Your previous response did not include visible courtroom dialogue. "
-                    "Return only the final spoken dialogue now. Do not include <think>, analysis, reasoning, markdown, or notes. /no_think"
                 ),
             }
         ]
@@ -115,6 +118,10 @@ class VllmModel:
             "latency_ms": int((time.perf_counter() - started) * 1000),
         }
 def modal_gpu_enabled() -> bool:
     return os.getenv("SOVEREIGN_DISABLE_MODAL_GPU", "").lower() not in {"1", "true", "yes"}
@@ -127,6 +134,9 @@ def modal_gpu_runner(**kwargs) -> ModelResult:
         case_summary=kwargs["case_summary"],
         task=kwargs["task"],
         evidence_summary=kwargs["evidence_summary"],
     )
     requested_model = kwargs["model"]
     prompt_hash = messages_hash(messages)
@@ -191,3 +201,12 @@ def trial_stream(payload: dict):
 @app.local_entrypoint()
 def main():
     print(check_huggingface_connection.remote())

 import modal
+from sovereign_bench.engine import MODEL_BUDGET, stream_trial_jsonl
 from sovereign_bench.llm import (
     ModelCall,
     ModelResult,
 )
 from sovereign_bench.models import TrialRequest
+MODAL_APP_NAME = "sovereign-bench"
+app = modal.App(MODAL_APP_NAME)
 GPU_NAME = "H100"
 GPU_TIMEOUT_SECONDS = 20 * 60
 HF_CACHE_DIR = "/root/.cache/huggingface"
+USED_MODEL_IDS = tuple(dict.fromkeys(model for _, model, _ in MODEL_BUDGET))
 image = (
     modal.Image.debian_slim(python_version="3.12")
                 "role": "user",
                 "content": (
                     "Your previous response did not include visible courtroom dialogue. "
+                    "Return only the final answer now. Do not mention prompts, tasks, requirements, or that you are following instructions. "
+                    "Do not include <think>, analysis, reasoning, markdown, narration, or notes. /no_think"
                 ),
             }
         ]
             "latency_ms": int((time.perf_counter() - started) * 1000),
         }
+    @modal.method()
+    def warm(self) -> dict:
+        return {"model": self.model_id, "status": "warm"}
 def modal_gpu_enabled() -> bool:
     return os.getenv("SOVEREIGN_DISABLE_MODAL_GPU", "").lower() not in {"1", "true", "yes"}
         case_summary=kwargs["case_summary"],
         task=kwargs["task"],
         evidence_summary=kwargs["evidence_summary"],
+        trial_history=kwargs.get("trial_history", ""),
+        persona=kwargs.get("persona", ""),
+        objective=kwargs.get("objective", ""),
     )
     requested_model = kwargs["model"]
     prompt_hash = messages_hash(messages)
 @app.local_entrypoint()
 def main():
     print(check_huggingface_connection.remote())
+@app.local_entrypoint()
+def warm_models():
+    deployed_model = modal.Cls.from_name(MODAL_APP_NAME, "VllmModel")
+    for model_id in USED_MODEL_IDS:
+        model = deployed_model(model_id=model_id)
+        model.update_autoscaler(min_containers=1)
+        print(model.warm.remote())
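
Reviewer note on `USED_MODEL_IDS`: `warm_models` loops over it, so each distinct model should appear once. The new contents of `MODEL_BUDGET` are not shown in this diff. The sketch below uses made-up rows (an assumption) to illustrate how `dict.fromkeys` removes repeated model ids while keeping first-seen order. `unique_model_ids` is a hypothetical helper name.

```python
# Hypothetical budget rows: (role label, model id, parameters in billions).
# These values are illustrative and are not taken from the repository.
MODEL_BUDGET = [
    ("Judge", "example/model-a", 20.0),
    ("Claimant advocate", "example/model-a", 0.0),
    ("Clerk", "example/model-b", 4.0),
]


def unique_model_ids(budget):
    """Return each model id once, in the order it first appears."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return tuple(dict.fromkeys(model for _, model, _ in budget))


USED_MODEL_IDS = unique_model_ids(MODEL_BUDGET)
```

A plain `set` would also deduplicate, but it would not keep a stable order.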

sovereign_bench/cases.py CHANGED

@@ -11,6 +11,11 @@ SOCRATES = CasePacket(
     respondent="Socrates",
     charge="Corrupting the youth and refusing the sanctioned gods of the city.",
     setting="Athens, 399 BCE, reassembled inside a pocket tribunal.",
     claimant_claim=(
         "The city argues that Socrates trained young citizens to mock public authority "
         "and placed private daimonion guidance above civic religion."
@@ -25,19 +30,7 @@ SOCRATES = CasePacket(
     ),
     evidence=[
         EvidenceItem(
-            id="SOC-E1",
-            title="The Oracle Burden",
-            source="Plato, Apology tradition",
-            excerpt=(
-                "Socrates describes testing reputedly wise citizens after a Delphic oracle "
-                "report, creating public embarrassment but framing the act as duty."
-            ),
-            supports="mixed",
-            reliability=0.78,
-            note="Shows both civic irritation and a claimed religious motivation.",
-        ),
-        EvidenceItem(
-            id="SOC-E2",
             title="Youthful Imitators",
             source="Plato, Apology tradition",
             excerpt=(
@@ -49,7 +42,31 @@ SOCRATES = CasePacket(
             note="Supports social effect, but does not prove intentional corruption.",
         ),
         EvidenceItem(
-            id="SOC-E3",
             title="No Fee, No School",
             source="Ancient defense tradition",
             excerpt=(
@@ -61,16 +78,127 @@ SOCRATES = CasePacket(
             note="Weakens the claim that he operated a formal corrupting academy.",
         ),
         EvidenceItem(
-            id="SOC-E4",
-            title="The Daimonion",
-            source="Ancient biographical tradition",
             excerpt=(
-                "Socrates reports a private divine sign that restrains him from certain actions, "
-                "which the court may read as piety or heterodoxy."
             ),
-            supports="mixed",
-            reliability=0.64,
-            note="Central ambiguity: private religious experience versus civic irreverence.",
         ),
     ],
 )
@@ -84,6 +212,11 @@ BARNABY = CasePacket(
     respondent="Barnaby Buttons",
     charge="Theft of the final mooncake and alteration of the communal snack ledger.",
     setting="A fluorescent office kitchen at 4:47 p.m., under the humming republic of the fridge.",
     claimant_claim=(
         "Barnaby removed the final mooncake, changed the snack ledger from '1 mooncake' "
         "to '0 mooncakes', and left the team dessertless."
@@ -92,7 +225,7 @@ BARNABY = CasePacket(
         "Barnaby says the mooncake was already abandoned, the ledger pen skipped naturally, "
         "and the crumbs came from an unrelated biscuit."
     ),
-    source_note="Cached original whimsical packet made for reliable hackathon demos.",
     evidence=[
         EvidenceItem(
             id="BTN-E1",
@@ -134,7 +267,7 @@ BARNABY = CasePacket(
 )
-CASES = {case.id: case for case in (SOCRATES, BARNABY)}
 def get_case(case_id: str) -> CasePacket:

     respondent="Socrates",
     charge="Corrupting the youth and refusing the sanctioned gods of the city.",
     setting="Athens, 399 BCE, reassembled inside a pocket tribunal.",
+    context=(
+        "Athens has brought Socrates back before a civic court after years of public questioning, "
+        "youthful imitators, and anxiety about private religious claims. The city says his method "
+        "weakened civic order; Socrates says he served the public by exposing false wisdom."
+    ),
     claimant_claim=(
         "The city argues that Socrates trained young citizens to mock public authority "
         "and placed private daimonion guidance above civic religion."
     ),
     evidence=[
         EvidenceItem(
+            id="SOC-F1",
             title="Youthful Imitators",
             source="Plato, Apology tradition",
             excerpt=(
             note="Supports social effect, but does not prove intentional corruption.",
         ),
         EvidenceItem(
+            id="SOC-F2",
+            title="Public Embarrassment",
+            source="Ancient defense tradition",
+            excerpt=(
+                "Socrates describes testing reputedly wise citizens in public after hearing the "
+                "Delphic oracle report."
+            ),
+            supports="claimant",
+            reliability=0.74,
+            note="Shows a repeated practice that made civic leaders look foolish.",
+        ),
+        EvidenceItem(
+            id="SOC-F3",
+            title="The Daimonion Suspicion",
+            source="Ancient biographical tradition",
+            excerpt=(
+                "Socrates reports a private divine sign that restrains him from certain actions, "
+                "which civic accusers read as religious irregularity."
+            ),
+            supports="claimant",
+            reliability=0.64,
+            note="Supports the impiety theory if private revelation is treated as civic defiance.",
+        ),
+        EvidenceItem(
+            id="SOC-A1",
             title="No Fee, No School",
             source="Ancient defense tradition",
             excerpt=(
             note="Weakens the claim that he operated a formal corrupting academy.",
         ),
         EvidenceItem(
+            id="SOC-A2",
+            title="Oracle as Duty",
+            source="Plato, Apology tradition",
             excerpt=(
+                "Socrates frames his questioning as obedience to a divine puzzle rather than "
+                "contempt for religion."
             ),
+            supports="respondent",
+            reliability=0.78,
+            note="Turns the impiety charge into a competing account of piety.",
+        ),
+        EvidenceItem(
+            id="SOC-A3",
+            title="Cross-Examination as Service",
+            source="Defense summary",
+            excerpt=(
+                "The defense treats uncomfortable questioning as civic improvement, not sabotage "
+                "or intentional corruption."
+            ),
+            supports="respondent",
+            reliability=0.7,
+            note="Gives the jury a public-interest reason to tolerate Socrates.",
+        ),
+    ],
+)
+GREG = CasePacket(
+    id="greg",
+    title="Greg Heffley v. Mom",
+    subtitle="A family-court argument over a diary, embarrassment, and parental good intentions.",
+    claimant="Greg Heffley",
+    respondent="Susan Heffley",
+    charge="Whether Mom wrongfully saddled Greg with an embarrassing diary instead of a normal journal.",
+    setting="The Heffley house on the eve of another middle-school year.",
+    context=(
+        "Greg receives a book from his mom meant to help him record his thoughts, but he objects "
+        "that the word diary makes him look childish and vulnerable at school. Mom treats the book "
+        "as a harmless tool for reflection; Greg treats it as social evidence waiting to be used "
+        "against him."
+    ),
+    claimant_claim=(
+        "Greg argues that Mom ignored the obvious social risk of handing a middle-school boy a diary "
+        "and failed to respect how easily classmates can turn an object into humiliation."
+    ),
+    respondent_claim=(
+        "Mom answers that the writing book is a constructive outlet, that Greg can choose how to use it, "
+        "and that parental encouragement is not social sabotage."
+    ),
+    source_note=(
+        "Cached demo packet using paraphrased context from the Diary of a Wimpy Kid setup. "
+        "No book text is quoted."
+    ),
+    evidence=[
+        EvidenceItem(
+            id="GRG-F1",
+            title="The Label Problem",
+            source="Greg's objection",
+            excerpt=(
+                "Greg objects that diary is the wrong label for a middle-school boy and could be "
+                "used to mock him."
+            ),
+            supports="claimant",
+            reliability=0.74,
+            note="Shows a foreseeable embarrassment risk from Greg's perspective.",
+        ),
+        EvidenceItem(
+            id="GRG-F2",
+            title="Middle-School Audience",
+            source="School context",
+            excerpt=(
+                "Greg's social world rewards status and punishes anything classmates can frame "
+                "as childish."
+            ),
+            supports="claimant",
+            reliability=0.7,
+            note="Makes the harm plausible even before anyone finds the book.",
+        ),
+        EvidenceItem(
+            id="GRG-F3",
+            title="Ignored Preference",
+            source="Family exchange summary",
+            excerpt=(
+                "Greg wanted distance from the diary framing, but Mom treated the gift as settled."
+            ),
+            supports="claimant",
+            reliability=0.66,
+            note="Supports Greg's autonomy argument, though parents often choose school supplies.",
+        ),
+        EvidenceItem(
+            id="GRG-A1",
+            title="Private Writing Tool",
+            source="Mom's purpose",
+            excerpt=(
+                "Mom intended the book as a private place for Greg to record his thoughts and school year."
+            ),
+            supports="respondent",
+            reliability=0.78,
+            note="Shows a constructive parental purpose rather than intent to embarrass.",
+        ),
+        EvidenceItem(
+            id="GRG-A2",
+            title="Greg Controls Disclosure",
+            source="Household facts",
+            excerpt=(
+                "The book is not inherently public; Greg can keep it private and decide what to write."
+            ),
+            supports="respondent",
+            reliability=0.68,
+            note="Weakens the claim that the gift itself creates inevitable harm.",
+        ),
+        EvidenceItem(
+            id="GRG-A3",
+            title="Reflection Has Value",
+            source="Parenting rationale",
+            excerpt=(
+                "A journal can help a student process school, family, and growing-up pressures."
+            ),
+            supports="respondent",
+            reliability=0.71,
+            note="Gives Mom a reasonable-benefit argument even if the branding is awkward.",
         ),
     ],
 )
     respondent="Barnaby Buttons",
     charge="Theft of the final mooncake and alteration of the communal snack ledger.",
     setting="A fluorescent office kitchen at 4:47 p.m., under the humming republic of the fridge.",
+    context=(
+        "An office breakroom has lost its final mooncake after a suspicious ledger update and "
+        "a trail of crumbs. The commonwealth blames Barnaby Buttons; Barnaby says the evidence "
+        "is ordinary office mess and coincidence."
+    ),
     claimant_claim=(
         "Barnaby removed the final mooncake, changed the snack ledger from '1 mooncake' "
         "to '0 mooncakes', and left the team dessertless."
         "Barnaby says the mooncake was already abandoned, the ledger pen skipped naturally, "
         "and the crumbs came from an unrelated biscuit."
     ),
+    source_note="Cached original whimsical packet kept for compatibility with older tests.",
     evidence=[
         EvidenceItem(
             id="BTN-E1",
 )
+CASES = {case.id: case for case in (SOCRATES, GREG, BARNABY)}
 def get_case(case_id: str) -> CasePacket:
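
Reviewer note on the renamed evidence IDs: the new IDs visible in this diff (`SOC-F1`, `SOC-A1`, `GRG-F1`, `GRG-A1`, `CUS-F…`, `CUS-A…`) appear to pair an `F` prefix with `supports="claimant"` and an `A` prefix with `supports="respondent"`. That convention is inferred from the diff, not stated by the author. A minimal sketch of a consistency check, using a hypothetical `Evidence` stand-in for `EvidenceItem` and made-up items:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical stand-in for EvidenceItem with only the fields needed here."""
    id: str
    supports: str


# Assumed convention, inferred from the IDs in the diff.
SIDE_BY_LETTER = {"F": "claimant", "A": "respondent"}


def mismatched_ids(items):
    """Return IDs whose side letter disagrees with the `supports` field."""
    bad = []
    for item in items:
        # "DEMO-F1" -> "F1" -> "F"; IDs with other letters are skipped.
        letter = item.id.split("-", 1)[-1][:1]
        expected = SIDE_BY_LETTER.get(letter)
        if expected is not None and expected != item.supports:
            bad.append(item.id)
    return bad
```

IDs that use another letter, such as the unchanged `BTN-E1`, are ignored by this check.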

sovereign_bench/engine.py CHANGED

@@ -9,7 +9,7 @@ from collections.abc import Callable, Iterable
 from pydantic import ValidationError
 from .cases import get_case
-from .llm import ModelCall, ModelResult, call_small_model
 from .models import AgentTurn, CasePacket, JurorVote, TrialEvent, TrialRequest, Verdict
 from .retrieval import build_live_case
@@ -20,11 +20,11 @@ OPENAI_PROVIDER = "auto"
 OPENBMB_PROVIDER = "featherless-ai"
 NEMOTRON_PROVIDER = "featherless-ai"
-MODEL_BUDGET = [
-    ("Presiding Advocate", GPT_OSS_MODEL, 20.0),
-    ("Clerk of Style", OPENBMB_MODEL, 4.0),
-    ("Juror/Auditor Ring", NEMOTRON_MODEL, 8.0),
-]
 TOTAL_PARAMS_B = sum(item[2] for item in MODEL_BUDGET)
 JUDGE_NAME = "Marcus Aurelius"
@@ -59,12 +59,14 @@ def _turn(agent: str, role: str, result: ModelResult, model: str, confidence: fl
     )
-def _case_summary(packet: CasePacket) -> str:
-    return (
-        f"{packet.title}. Charge: {packet.charge}\n"
-        f"Claimant: {packet.claimant_claim}\n"
-        f"Respondent: {packet.respondent_claim}"
-    )
 def _evidence_summary(packet: CasePacket) -> str:
@@ -78,8 +80,12 @@ def _call_trace(calls: list[ModelCall]) -> list[dict]:
     return [call.__dict__ for call in calls]
-def resolve_case(request: TrialRequest) -> tuple[CasePacket, dict]:
-    if request.case_id == "live":
         packet = build_live_case(request.search_query, request.hypothetical)
         if packet:
             return packet, {"mode": "live"}
@@ -99,12 +105,16 @@ def _required_role(model_runner: ModelRunner | None, model_calls: list[ModelCall
     except Exception as exc:
         raise RequiredModelError(f"{kwargs.get('agent', 'Model')} unavailable: {exc}") from exc
     model_calls.append(result.call)
-    if not result.call.ok:
-        error = result.call.error or "model call did not complete"
-        raise RequiredModelError(f"{kwargs.get('agent', 'Model')} unavailable: {error}")
-    if not result.text.strip():
-        raise RequiredModelError(f"{kwargs.get('agent', 'Model')} returned an empty response.")
-    return result
 def _trace(packet: CasePacket, source_trace: dict, model_calls: list[ModelCall]) -> dict:
@@ -119,7 +129,7 @@ def _trace(packet: CasePacket, source_trace: dict, model_calls: list[ModelCall])
     }
-def _emit(
     packet: CasePacket,
     source_trace: dict,
     model_calls: list[ModelCall],
@@ -129,10 +139,47 @@ def _emit(
     event.trace = _trace(packet, source_trace, model_calls)
     if delay > 0:
         time.sleep(delay)
-    return event
-def _extract_json(text: str) -> object:
     stripped = text.strip()
     if stripped.startswith("```"):
         stripped = re.sub(r"^```(?:json)?\s*", "", stripped, flags=re.I)
@@ -146,41 +193,37 @@ def _extract_json(text: str) -> object:
         return json.loads(match.group(1))
-def _parse_jury_votes(result: ModelResult, packet: CasePacket) -> list[JurorVote]:
-    try:
-        data = _extract_json(result.text)
-    except json.JSONDecodeError as exc:
-        raise RequiredModelError(f"Nemotron Jury returned invalid JSON: {exc.msg}") from exc
-    raw_votes = data.get("votes") if isinstance(data, dict) else data
-    if not isinstance(raw_votes, list):
-        raise RequiredModelError("Nemotron Jury output must contain a votes list.")
-    if len(raw_votes) != len(JUROR_NAMES):
-        raise RequiredModelError("Nemotron Jury must return exactly six juror votes.")
-    known_evidence = {item.id for item in packet.evidence}
-    votes: list[JurorVote] = []
-    try:
-        for item in raw_votes:
-            vote = JurorVote.model_validate(item)
-            votes.append(vote)
-    except ValidationError as exc:
-        raise RequiredModelError(f"Nemotron Jury vote schema is invalid: {exc.errors()[0]['msg']}") from exc
-    if [vote.juror for vote in votes] != JUROR_NAMES:
-        raise RequiredModelError("Nemotron Jury must return votes in the fixed juror order.")
-    for vote in votes:
-        expected_persona = JUROR_PERSONAS[vote.juror]
-        if vote.persona.strip().lower() != expected_persona:
-            raise RequiredModelError(f"{vote.juror} persona must be '{expected_persona}'.")
-        if not vote.reason.strip():
-            raise RequiredModelError(f"{vote.juror} must include a rationale.")
-        if not vote.evidence_ids or any(evidence_id not in known_evidence for evidence_id in vote.evidence_ids):
-            raise RequiredModelError(f"{vote.juror} must cite known evidence IDs.")
-    return votes
-def _majority_finding(votes: list[JurorVote]) -> str:
     counts = Counter(vote.vote for vote in votes)
     top = counts.most_common()
     if not top:
@@ -227,15 +270,12 @@ def _verdict_from_votes(votes: list[JurorVote]) -> Verdict:
     )
-def _jury_task() -> str:
-    personas = "\n".join(f"- {name}: {persona}" for name, persona in JUROR_PERSONAS.items())
     return (
-        "Return JSON only with a top-level 'votes' array. Create exactly one vote for each juror, in this order: "
-        f"{', '.join(JUROR_NAMES)}. Valid vote values are liable, not_liable, uncertain. Each item must contain "
-        "juror, persona, vote, reason, and evidence_ids. The persona value must exactly match the profile below. "
-        "Each reason should be one concise sentence and each evidence_ids list must cite evidence IDs from the record. "
-        "Vote through the named public-history worldview, not a generic juror role.\n"
-        f"{personas}"
     )
@@ -249,10 +289,11 @@ def stream_trial(
     model_runner: ModelRunner | None = None,
 ) -> Iterable[TrialEvent]:
     packet, source_trace = resolve_case(request)
-    case_summary = _case_summary(packet)
-    evidence_summary = _evidence_summary(packet)
-    model_calls: list[ModelCall] = []
-    hypo = request.hypothetical.strip()
     hypo_line = f"\n\nUser hypothetical admitted as a blue-ribbon sidebar: {hypo}" if hypo else ""
     clerk = _required_role(
@@ -263,14 +304,15 @@ def stream_trial(
         model=OPENBMB_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
-        task="Announce the case by name, identify the parties, and read the charge.",
         provider=OPENBMB_PROVIDER,
         max_tokens=110,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="intake",
             title="The Court Convenes",
@@ -289,17 +331,21 @@ def stream_trial(
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
         task=(
             f"As {JUDGE_NAME}, a Stoic courtroom judge guided by {JUDGE_PERSONA}, explain the proceeding "
-            "and the burden of proof in one or two disciplined sentences."
         ),
         provider=OPENAI_PROVIDER,
         max_tokens=110,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="intake",
             title="The Burden Is Set",
@@ -313,24 +359,27 @@ def stream_trial(
     claimant_opening = _required_role(
         model_runner,
         model_calls,
-        agent="Advocate Auric",
         role="claimant advocate",
         model=GPT_OSS_MODEL,
-        case_summary=case_summary,
-        evidence_summary=evidence_summary,
-        task="Make the claimant's opening statement alone. Cite the strongest claimant-side exhibit.",
         provider=OPENAI_PROVIDER,
         max_tokens=130,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="claims",
             title="Claimant Opening",
             body=packet.claimant_claim,
-            turns=[_turn("Advocate Auric", "claimant advocate", claimant_opening, GPT_OSS_MODEL, 0.88)],
             evidence=packet.evidence,
         ),
         delay,
@@ -339,53 +388,45 @@ def stream_trial(
     respondent_opening = _required_role(
         model_runner,
         model_calls,
-        agent="Counsel Sable",
         role="respondent advocate",
         model=GPT_OSS_MODEL,
-        case_summary=case_summary,
-        evidence_summary=evidence_summary,
-        task="Make the respondent's opening statement alone. Emphasize uncertainty and cite a helpful exhibit.",
         provider=OPENAI_PROVIDER,
         max_tokens=130,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="opening",
             title="Respondent Opening",
             body=packet.respondent_claim,
-            turns=[_turn("Counsel Sable", "respondent advocate", respondent_opening, GPT_OSS_MODEL, 0.88)],
             evidence=packet.evidence,
         ),
         delay,
     )
-    auditor = _required_role(
-        model_runner,
-        model_calls,
-        agent="Auditor Prism",
-        role="evidence auditor",
-        model=NEMOTRON_MODEL,
-        case_summary=case_summary,
-        evidence_summary=evidence_summary,
-        task="Present the evidence record. Identify the strongest exhibit and the weakest inference.",
-        provider=NEMOTRON_PROVIDER,
-        max_tokens=150,
-    )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
-        TrialEvent(
-            phase="evidence",
-            title="The Record Is Audited",
-            body="\n".join(f"{item.id}: {item.title} | reliability {item.reliability:.2f} | {item.note}" for item in packet.evidence),
-            turns=[_turn("Auditor Prism", "evidence auditor", auditor, NEMOTRON_MODEL, 0.86)],
-            evidence=packet.evidence,
-        ),
-        delay,
     )
     judge_question = _required_role(
@@ -396,17 +437,21 @@ def stream_trial(
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
         task=(
             f"As {JUDGE_NAME}, ask one sharp hinge question that would change the outcome if answered. "
-            "Use Stoic restraint and public reason."
         ),
         provider=OPENAI_PROVIDER,
         max_tokens=100,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="questions",
             title="The Hinge Question",
@@ -420,24 +465,27 @@ def stream_trial(
     claimant_answer = _required_role(
         model_runner,
         model_calls,
-        agent="Advocate Auric",
         role="claimant advocate",
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
-        task=f"Answer {JUDGE_NAME}'s hinge question for the claimant: {judge_question.text}",
         provider=OPENAI_PROVIDER,
         max_tokens=130,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="questions",
             title="Claimant Answers the Bench",
             body="The claimant answers the hinge question.",
-            turns=[_turn("Advocate Auric", "claimant advocate", claimant_answer, GPT_OSS_MODEL, 0.88)],
             evidence=packet.evidence,
         ),
         delay,
@@ -446,24 +494,27 @@ def stream_trial(
     respondent_answer = _required_role(
         model_runner,
         model_calls,
-        agent="Counsel Sable",
         role="respondent advocate",
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
-        task=f"Answer {JUDGE_NAME}'s hinge question for the respondent: {judge_question.text}",
         provider=OPENAI_PROVIDER,
         max_tokens=130,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="questions",
             title="Respondent Answers the Bench",
             body="The respondent answers the hinge question.",
-            turns=[_turn("Counsel Sable", "respondent advocate", respondent_answer, GPT_OSS_MODEL, 0.88)],
             evidence=packet.evidence,
         ),
         delay,
@@ -474,17 +525,20 @@ def stream_trial(
         model_calls,
         agent="Nemotron Jury",
         role="juror panel",
-        model=NEMOTRON_MODEL,
-        case_summary=case_summary,
-        evidence_summary=evidence_summary,
-        task="Announce that the six named jurors retire to vote. Do not reveal the votes yet.",
         provider=NEMOTRON_PROVIDER,
         max_tokens=100,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="deliberation",
             title="The Jury Retires",
@@ -495,29 +549,35 @@ def stream_trial(
         delay,
     )
-    jury_votes_result = _required_role(
-        model_runner,
-        model_calls,
-        agent="Nemotron Jury",
-        role="juror vote generator",
-        model=NEMOTRON_MODEL,
-        case_summary=case_summary,
-        evidence_summary=evidence_summary,
-        task=_jury_task(),
-        provider=NEMOTRON_PROVIDER,
-        max_tokens=650,
-    )
-    votes = _parse_jury_votes(jury_votes_result, packet)
-    for vote in votes:
-        juror_result = ModelResult(
-            text=f"{vote.vote.replace('_', ' ').title()}. {vote.reason}",
-            call=jury_votes_result.call,
-            input_text=jury_votes_result.input_text,
-        )
-        yield _emit(
-            packet,
-            source_trace,
-            model_calls,
             TrialEvent(
                 phase="deliberation",
                 title=f"Juror {vote.juror} Votes",
@@ -538,18 +598,22 @@ def stream_trial(
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
         task=(
             f"As {JUDGE_NAME}, announce the final legal finding after the jury vote with Stoic restraint. "
             f"Finding: {verdict.finding}. "
-            f"Jury rationale: {verdict.rationale} Remedy: {verdict.remedy}. Include uncertainty without disclaiming the role."
         ),
         provider=OPENAI_PROVIDER,
         max_tokens=160,
     )
-    yield _emit(
-        packet,
-        source_trace,
-        model_calls,
         TrialEvent(
             phase="verdict",
             title="The Court Announces Judgment",

 from pydantic import ValidationError
 from .cases import get_case
+from .llm import ModelCall, ModelCallError, ModelResult, call_small_model, clean_model_text
 from .models import AgentTurn, CasePacket, JurorVote, TrialEvent, TrialRequest, Verdict
 from .retrieval import build_live_case
 OPENBMB_PROVIDER = "featherless-ai"
 NEMOTRON_PROVIDER = "featherless-ai"
+MODEL_BUDGET = [
+    ("Presiding Advocate", GPT_OSS_MODEL, 20.0),
+    ("Clerk of Style", OPENBMB_MODEL, 4.0),
+    ("Jury Ring", NEMOTRON_MODEL, 8.0),
+]
 TOTAL_PARAMS_B = sum(item[2] for item in MODEL_BUDGET)
 JUDGE_NAME = "Marcus Aurelius"
     )
+def _case_summary(packet: CasePacket) -> str:
+    context = packet.context or packet.setting
+    return (
+        f"{packet.title}. Charge: {packet.charge}\n"
+        f"Context: {context}\n"
+        f"Claimant: {packet.claimant_claim}\n"
+        f"Respondent: {packet.respondent_claim}"
+    )
 def _evidence_summary(packet: CasePacket) -> str:
     return [call.__dict__ for call in calls]
+def resolve_case(request: TrialRequest) -> tuple[CasePacket, dict]:
+    if request.case_id == "custom":
+        if request.custom_case is None:
+            raise RuntimeError("Custom case requires trial details and evidence before the court can begin.")
+        return request.custom_case, {"mode": "custom"}
+    if request.case_id == "live":
         packet = build_live_case(request.search_query, request.hypothetical)
         if packet:
             return packet, {"mode": "live"}
     except Exception as exc:
         raise RequiredModelError(f"{kwargs.get('agent', 'Model')} unavailable: {exc}") from exc
     model_calls.append(result.call)
+    if not result.call.ok:
+        error = result.call.error or "model call did not complete"
+        raise RequiredModelError(f"{kwargs.get('agent', 'Model')} unavailable: {error}")
+    try:
+        result.text = clean_model_text(result.text)
+    except ModelCallError as exc:
+        raise RequiredModelError(f"{kwargs.get('agent', 'Model')} returned non-dialogue output: {exc}") from exc
+    if not result.text.strip():
+        raise RequiredModelError(f"{kwargs.get('agent', 'Model')} returned an empty response.")
+    return result
 def _trace(packet: CasePacket, source_trace: dict, model_calls: list[ModelCall]) -> dict:
     }
+def _emit(
     packet: CasePacket,
     source_trace: dict,
     model_calls: list[ModelCall],
     event.trace = _trace(packet, source_trace, model_calls)
     if delay > 0:
         time.sleep(delay)
+    return event
+def _record_and_emit(
+    events: list[TrialEvent],
+    packet: CasePacket,
+    source_trace: dict,
+    model_calls: list[ModelCall],
+    event: TrialEvent,
+    delay: float,
+) -> TrialEvent:
+    emitted = _emit(packet, source_trace, model_calls, event, delay)
+    events.append(emitted)
+    return emitted
+def _compact(value: str, limit: int = 420) -> str:
+    text = " ".join(value.split())
+    return text if len(text) <= limit else text[: limit - 3].rstrip() + "..."
+def _trial_history(events: list[TrialEvent]) -> str:
+    if not events:
+        return "No trial statements have been made yet."
+    lines = []
+    for index, event in enumerate(events, start=1):
+        if event.turns:
+            turn = event.turns[0]
+            lines.append(
+                f"{index}. {event.phase} / {event.title} - {turn.agent} ({turn.role}): {_compact(turn.content)}"
+            )
+        elif event.body:
+            lines.append(f"{index}. {event.phase} / {event.title}: {_compact(event.body)}")
+        for vote in event.votes:
+            lines.append(
+                f"   Vote - {vote.juror}: {vote.vote}; reason: {_compact(vote.reason, 220)}; evidence: {', '.join(vote.evidence_ids)}"
+            )
+    return "\n".join(lines)
+def _extract_json(text: str) -> object:
     stripped = text.strip()
     if stripped.startswith("```"):
         stripped = re.sub(r"^```(?:json)?\s*", "", stripped, flags=re.I)
         return json.loads(match.group(1))
+def _parse_juror_vote(result: ModelResult, packet: CasePacket, juror: str) -> JurorVote:
+    try:
+        data = _extract_json(result.text)
+    except json.JSONDecodeError as exc:
+        raise RequiredModelError(f"{juror} returned invalid JSON: {exc.msg}") from exc
+    if isinstance(data, dict) and isinstance(data.get("votes"), list):
+        if len(data["votes"]) != 1:
+            raise RequiredModelError(f"{juror} must return exactly one vote.")
+        data = data["votes"][0]
+    if not isinstance(data, dict):
+        raise RequiredModelError(f"{juror} vote output must be a JSON object.")
+    try:
+        vote = JurorVote.model_validate(data)
+    except ValidationError as exc:
+        raise RequiredModelError(f"{juror} vote schema is invalid: {exc.errors()[0]['msg']}") from exc
+    known_evidence = {item.id for item in packet.evidence}
+    expected_persona = JUROR_PERSONAS[juror]
+    if vote.juror != juror:
+        raise RequiredModelError(f"{juror} vote must use juror '{juror}'.")
+    if vote.persona.strip().lower() != expected_persona:
+        raise RequiredModelError(f"{juror} persona must be '{expected_persona}'.")
+    if not vote.reason.strip():
+        raise RequiredModelError(f"{juror} must include a rationale.")
+    if not vote.evidence_ids or any(evidence_id not in known_evidence for evidence_id in vote.evidence_ids):
+        raise RequiredModelError(f"{juror} must cite known evidence IDs.")
+    return vote
+def _majority_finding(votes: list[JurorVote]) -> str:
     counts = Counter(vote.vote for vote in votes)
     top = counts.most_common()
     if not top:
     )
+def _juror_task(juror: str, persona: str) -> str:
     return (
+        f"After watching the trial, vote as {juror}. Your worldview is: {persona}. "
+        "Return exactly one JSON object with keys juror, persona, vote, reason, and evidence_ids. "
+        "Valid vote values are liable, not_liable, uncertain. The persona value must exactly match your worldview. "
+        "The reason must be one concise sentence grounded in your beliefs and the record. Cite evidence IDs from the record."
     )
     model_runner: ModelRunner | None = None,
 ) -> Iterable[TrialEvent]:
     packet, source_trace = resolve_case(request)
+    case_summary = _case_summary(packet)
+    evidence_summary = _evidence_summary(packet)
+    model_calls: list[ModelCall] = []
+    events: list[TrialEvent] = []
+    hypo = request.hypothetical.strip()
     hypo_line = f"\n\nUser hypothetical admitted as a blue-ribbon sidebar: {hypo}" if hypo else ""
     clerk = _required_role(
         model=OPENBMB_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
+        task="Begin with 'I call'. Announce the case by name, identify the parties, and read the charge.",
         provider=OPENBMB_PROVIDER,
         max_tokens=110,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="intake",
             title="The Court Convenes",
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
+        trial_history=_trial_history(events),
+        persona=JUDGE_PERSONA,
+        objective="Set a fair standard for hearing both sides.",
         task=(
             f"As {JUDGE_NAME}, a Stoic courtroom judge guided by {JUDGE_PERSONA}, explain the proceeding "
+            "and the burden of proof in one or two disciplined sentences using I or we."
         ),
         provider=OPENAI_PROVIDER,
         max_tokens=110,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="intake",
             title="The Burden Is Set",
     claimant_opening = _required_role(
         model_runner,
         model_calls,
+        agent="Mike OSS",
         role="claimant advocate",
         model=GPT_OSS_MODEL,
+        case_summary=case_summary,
+        evidence_summary=evidence_summary,
+        trial_history=_trial_history(events),
+        objective="Win the case for the claimant using the strongest fair reading of the record.",
+        task="Make the claimant's opening statement alone, speaking as I for the claimant. Cite the strongest claimant-side exhibit.",
         provider=OPENAI_PROVIDER,
         max_tokens=130,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="claims",
             title="Claimant Opening",
             body=packet.claimant_claim,
+            turns=[_turn("Mike OSS", "claimant advocate", claimant_opening, GPT_OSS_MODEL, 0.88)],
             evidence=packet.evidence,
         ),
         delay,
     respondent_opening = _required_role(
         model_runner,
         model_calls,
+        agent="Harvey Vector",
         role="respondent advocate",
         model=GPT_OSS_MODEL,
+        case_summary=case_summary,
+        evidence_summary=evidence_summary,
+        trial_history=_trial_history(events),
+        objective="Win the case for the respondent using doubt, context, and the strongest fair reading of the record.",
+        task="Make the respondent's opening statement alone, speaking as I for the respondent. Emphasize uncertainty and cite a helpful exhibit.",
         provider=OPENAI_PROVIDER,
         max_tokens=130,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="opening",
             title="Respondent Opening",
             body=packet.respondent_claim,
+            turns=[_turn("Harvey Vector", "respondent advocate", respondent_opening, GPT_OSS_MODEL, 0.88)],
             evidence=packet.evidence,
         ),
         delay,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
+        TrialEvent(
+            phase="evidence",
+            title="The Evidence Record",
+            body="\n".join(f"{item.id}: {item.title} | reliability {item.reliability:.2f} | {item.note}" for item in packet.evidence),
+            turns=[],
+            evidence=packet.evidence,
+        ),
+        delay,
     )
     judge_question = _required_role(
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
+        trial_history=_trial_history(events),
+        persona=JUDGE_PERSONA,
+        objective="Ask the question most likely to reveal which side has met its burden.",
         task=(
             f"As {JUDGE_NAME}, ask one sharp hinge question that would change the outcome if answered. "
+            "Use Stoic restraint and public reason, speaking from the bench as I or we."
         ),
         provider=OPENAI_PROVIDER,
         max_tokens=100,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="questions",
             title="The Hinge Question",
     claimant_answer = _required_role(
         model_runner,
         model_calls,
+        agent="Mike OSS",
         role="claimant advocate",
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
+        trial_history=_trial_history(events),
+        objective="Answer the judge in the way most favorable to the claimant.",
+        task=f"Answer {JUDGE_NAME}'s hinge question as I for the claimant: {judge_question.text}",
         provider=OPENAI_PROVIDER,
         max_tokens=130,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="questions",
             title="Claimant Answers the Bench",
             body="The claimant answers the hinge question.",
+            turns=[_turn("Mike OSS", "claimant advocate", claimant_answer, GPT_OSS_MODEL, 0.88)],
             evidence=packet.evidence,
         ),
         delay,
     respondent_answer = _required_role(
         model_runner,
         model_calls,
+        agent="Harvey Vector",
         role="respondent advocate",
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
+        trial_history=_trial_history(events),
+        objective="Answer the judge in the way most favorable to the respondent.",
+        task=f"Answer {JUDGE_NAME}'s hinge question as I for the respondent: {judge_question.text}",
         provider=OPENAI_PROVIDER,
         max_tokens=130,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="questions",
             title="Respondent Answers the Bench",
             body="The respondent answers the hinge question.",
+            turns=[_turn("Harvey Vector", "respondent advocate", respondent_answer, GPT_OSS_MODEL, 0.88)],
             evidence=packet.evidence,
         ),
         delay,
         model_calls,
         agent="Nemotron Jury",
         role="juror panel",
+        model=NEMOTRON_MODEL,
+        case_summary=case_summary,
+        evidence_summary=evidence_summary,
+        trial_history=_trial_history(events),
+        objective="Move the court from arguments into individual jury votes.",
+        task="Announce as we, the six named jurors, that we retire to vote. Do not reveal the votes yet.",
         provider=NEMOTRON_PROVIDER,
         max_tokens=100,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="deliberation",
             title="The Jury Retires",
         delay,
     )
+    votes: list[JurorVote] = []
+    for juror, persona in JUROR_PERSONAS.items():
+        juror_vote_result = _required_role(
+            model_runner,
+            model_calls,
+            agent=juror,
+            role="juror",
+            model=NEMOTRON_MODEL,
+            case_summary=case_summary,
+            evidence_summary=evidence_summary,
+            trial_history=_trial_history(events),
+            persona=persona,
+            objective="Reach the verdict this historical worldview would consider right after watching the trial.",
+            task=_juror_task(juror, persona),
+            provider=NEMOTRON_PROVIDER,
+            max_tokens=220,
+        )
+        vote = _parse_juror_vote(juror_vote_result, packet, juror)
+        votes.append(vote)
+        juror_result = ModelResult(
+            text=f"I vote {vote.vote.replace('_', ' ').title()}. {vote.reason}",
+            call=juror_vote_result.call,
+            input_text=juror_vote_result.input_text,
+        )
+        yield _record_and_emit(
+            events,
+            packet,
+            source_trace,
+            model_calls,
             TrialEvent(
                 phase="deliberation",
                 title=f"Juror {vote.juror} Votes",
         model=GPT_OSS_MODEL,
         case_summary=case_summary,
         evidence_summary=evidence_summary,
+        trial_history=_trial_history(events),
+        persona=JUDGE_PERSONA,
+        objective="Announce the jury result fairly, summarize both sides, and do not override the jury.",
         task=(
             f"As {JUDGE_NAME}, announce the final legal finding after the jury vote with Stoic restraint. "
             f"Finding: {verdict.finding}. "
+            f"Jury rationale: {verdict.rationale} Remedy: {verdict.remedy}. Speak as I from the bench and include uncertainty without disclaiming the role."
         ),
         provider=OPENAI_PROVIDER,
         max_tokens=160,
     )
+    yield _record_and_emit(
+        events,
+        packet,
+        source_trace,
+        model_calls,
         TrialEvent(
             phase="verdict",
             title="The Court Announces Judgment",
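
Reviewer sketch for the `engine.py` changes above: every `_required_role` call now receives `trial_history=_trial_history(events)`, and `_record_and_emit` appends each emitted event to `events`, so later speakers see a compacted transcript of earlier ones. The snippet below is a minimal, self-contained illustration of that accumulation. `Turn` and `Event` are hypothetical dataclass stand-ins for the project's `AgentTurn` and `TrialEvent`, and the vote lines are omitted, so treat it as an approximation rather than the PR's code.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:  # hypothetical stand-in for AgentTurn
    agent: str
    role: str
    content: str


@dataclass
class Event:  # hypothetical stand-in for TrialEvent
    phase: str
    title: str
    body: str = ""
    turns: list = field(default_factory=list)


def compact(value: str, limit: int = 420) -> str:
    # Collapse all whitespace runs, then truncate to `limit` characters with an ellipsis.
    text = " ".join(value.split())
    return text if len(text) <= limit else text[: limit - 3].rstrip() + "..."


def trial_history(events: list) -> str:
    # One numbered line per event: first turn if present, otherwise the event body.
    if not events:
        return "No trial statements have been made yet."
    lines = []
    for index, event in enumerate(events, start=1):
        if event.turns:
            turn = event.turns[0]
            lines.append(
                f"{index}. {event.phase} / {event.title} - "
                f"{turn.agent} ({turn.role}): {compact(turn.content)}"
            )
        elif event.body:
            lines.append(f"{index}. {event.phase} / {event.title}: {compact(event.body)}")
    return "\n".join(lines)


events = []
history_before = trial_history(events)  # what the first speaker would see
events.append(
    Event("intake", "The Court Convenes", turns=[Turn("Clerk Meridian", "clerk", "I call   the case.")])
)
history_after = trial_history(events)  # what the next speaker would see
```

Because the history is rebuilt from `events` before each call, the prompt grows with every phase; the 420-character cap per entry is what keeps that growth bounded.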

sovereign_bench/llm.py CHANGED

@@ -69,6 +69,21 @@ def _response_text(response: object) -> str:
     return ""
 def clean_model_text(text: str) -> str:
     cleaned = re.sub(r"(?is)<think>.*?</think>", "", text).strip()
     if re.search(r"(?i)<think>", cleaned):
@@ -76,6 +91,26 @@ def clean_model_text(text: str) -> str:
     cleaned = re.sub(r"(?is)<analysis>.*?</analysis>", "", cleaned).strip()
     cleaned = re.sub(r"(?is)<reasoning>.*?</reasoning>", "", cleaned).strip()
     cleaned = cleaned.replace("</think>", "").strip()
     if not cleaned:
         raise ModelCallError("model returned no visible output")
     return cleaned
@@ -108,7 +143,9 @@ def call_hf_chat_model(
                 "role": "user",
                 "content": (
                     "Your previous response did not include visible courtroom dialogue. "
-                    "Return only the final spoken dialogue now. Do not include <think>, analysis, reasoning, markdown, or notes. /no_think"
                 ),
             }
         ]
@@ -166,6 +203,9 @@ def call_small_model(
     case_summary: str,
     task: str,
     evidence_summary: str,
     provider: str = "auto",
     max_tokens: int = 120,
 ) -> ModelResult:
@@ -175,6 +215,9 @@ def call_small_model(
         case_summary=case_summary,
         task=task,
         evidence_summary=evidence_summary,
     )
     result = call_hf_chat_model(
         model=model,
@@ -193,17 +236,61 @@ def build_role_messages(
     case_summary: str,
     task: str,
     evidence_summary: str,
 ) -> list[dict[str, str]]:
     system = (
         "You are one AI character in Sovereign Bench, a miniature virtual courtroom. "
-        "Write concise courtroom dialogue only. Cite evidence IDs when relevant. "
         "Do not claim certainty beyond the record. Do not add markdown. "
-        "Return final spoken dialogue only; never reveal hidden reasoning, analysis, or <think> text. "
         "Do not use thinking mode."
     )
     user = (
         f"Agent: {agent}\nRole: {role}\nCase:\n{case_summary}\n\n"
-        f"Evidence:\n{evidence_summary}\n\nTask: {task}\n"
-        "Answer in 1-3 sentences, theatrical but clear.\n/no_think"
     )
     return [{"role": "system", "content": system}, {"role": "user", "content": user}]

     return ""
+INSTRUCTION_ECHO_RE = re.compile(
+    r"(?is)\b("
+    r"as requested|"
+    r"first[- ]person|"
+    r"pronoun|"
+    r"1\s*-\s*3 sentences|"
+    r"theatrical but clear|"
+    r"i will speak as|"
+    r"i will now (?:announce|answer|respond|deliver|speak)|"
+    r"as the assigned agent|"
+    r"the task"
+    r")\b"
+)
 def clean_model_text(text: str) -> str:
     cleaned = re.sub(r"(?is)<think>.*?</think>", "", text).strip()
     if re.search(r"(?i)<think>", cleaned):
     cleaned = re.sub(r"(?is)<analysis>.*?</analysis>", "", cleaned).strip()
     cleaned = re.sub(r"(?is)<reasoning>.*?</reasoning>", "", cleaned).strip()
     cleaned = cleaned.replace("</think>", "").strip()
+    channel_match = re.search(r"(?ims)^\s*(?:final|assistant_final)\s*:?\s*(.+)\Z", cleaned)
+    if channel_match:
+        cleaned = channel_match.group(1).strip()
+    else:
+        final_after_analysis = re.search(
+            r"(?ims)^\s*(?:analysis|reasoning|assistant_analysis)\s*:?.*?^\s*(?:final|assistant_final)\s*:?\s*(.+)\Z",
+            cleaned,
+        )
+        if final_after_analysis:
+            cleaned = final_after_analysis.group(1).strip()
+        elif re.search(r"(?im)^\s*(?:analysis|reasoning|assistant_analysis)\s*:?", cleaned):
+            raise ModelCallError("model returned hidden analysis instead of courtroom dialogue")
+    if re.search(r"(?i)\b(?:analysis|reasoning)\s*:", cleaned[:80]):
+        raise ModelCallError("model returned hidden analysis instead of courtroom dialogue")
+    if INSTRUCTION_ECHO_RE.search(cleaned[:420]):
+        pieces = [piece.strip() for piece in re.split(r"\n\s*\n", cleaned) if piece.strip()]
+        dialogue_pieces = [piece for piece in pieces if not INSTRUCTION_ECHO_RE.search(piece)]
+        if not dialogue_pieces:
+            raise ModelCallError("model echoed instructions instead of courtroom dialogue")
+        cleaned = "\n\n".join(dialogue_pieces).strip()
     if not cleaned:
         raise ModelCallError("model returned no visible output")
     return cleaned
                 "role": "user",
                 "content": (
                     "Your previous response did not include visible courtroom dialogue. "
+                    "Return only the final first-person spoken dialogue now, as the assigned agent. "
+                    "Do not mention prompts, tasks, requirements, pronouns, sentence counts, or that you are following instructions. "
+                    "Do not include <think>, analysis, reasoning, markdown, narration, or notes. /no_think"
                 ),
             }
         ]
     case_summary: str,
     task: str,
     evidence_summary: str,
+    trial_history: str = "",
+    persona: str = "",
+    objective: str = "",
     provider: str = "auto",
     max_tokens: int = 120,
 ) -> ModelResult:
         case_summary=case_summary,
         task=task,
         evidence_summary=evidence_summary,
+        trial_history=trial_history,
+        persona=persona,
+        objective=objective,
     )
     result = call_hf_chat_model(
         model=model,
     case_summary: str,
     task: str,
     evidence_summary: str,
+    trial_history: str = "",
+    persona: str = "",
+    objective: str = "",
 ) -> list[dict[str, str]]:
+    vote_role = role == "juror"
+    dialogue_role = not vote_role
     system = (
         "You are one AI character in Sovereign Bench, a miniature virtual courtroom. "
+        "Stay fully in character as the assigned Agent and Role. "
+        "Use the case facts and evidence provided below; cite evidence IDs when relevant. "
         "Do not claim certainty beyond the record. Do not add markdown. "
+        "Never reveal hidden reasoning, analysis, or <think> text. "
         "Do not use thinking mode."
     )
+    if role in {"claimant advocate", "respondent advocate"}:
+        system += (
+            " You are a lawyer trying to win for your side. Use the evidence, the other side's claims, "
+            "and the trial record to make the strongest fair argument available."
+        )
+    elif role in {"judge", "verdict writer"}:
+        system += (
+            " You are a fair judge. Consider both sides, the evidence, and the trial record. "
+            "At verdict, announce and contextualize the jury result rather than replacing it with your own preferred outcome."
+        )
+    elif role == "juror":
+        system += (
+            " You are an individual juror. Decide through your named worldview and the trial transcript, "
+            "not a generic juror role. Output only valid JSON for your vote."
+        )
+    elif role == "juror panel":
+        system += " You speak for the jury panel procedurally; do not reveal votes before deliberation."
+    elif role == "clerk":
+        system += " You are a procedural courtroom role; present the record clearly without deciding the verdict."
+    if dialogue_role:
+        system += (
+            " Output only the words this character says aloud in court. "
+            "Use I, me, my, we, or our naturally when the role calls for it. "
+            "Do not narrate about yourself in the third person. Do not summarize what the agent would say."
+        )
+        answer_instruction = (
+            f"Speak as {agent}. Give only the in-scene court line, 1-3 concise sentences."
+        )
+    else:
+        answer_instruction = (
+            "Return only the requested JSON object. "
+            "Do not add dialogue, markdown, or commentary."
+        )
+    persona_block = f"\nPersona / worldview:\n{persona}\n" if persona else ""
+    objective_block = f"\nObjective:\n{objective}\n" if objective else ""
+    history_block = f"\nTrial history so far:\n{trial_history}\n" if trial_history else ""
     user = (
         f"Agent: {agent}\nRole: {role}\nCase:\n{case_summary}\n\n"
+        f"Evidence:\n{evidence_summary}\n"
+        f"{persona_block}{objective_block}{history_block}\nTask: {task}\n"
+        f"{answer_instruction}\n/no_think"
     )
     return [{"role": "system", "content": system}, {"role": "user", "content": user}]
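
Reviewer sketch for the `clean_model_text` changes in this file: the new branches try to keep only the spoken line when a model leaks an `analysis:` / `final:` preamble, and fail loudly when only analysis is present. The reduced version below reuses two of the regular expressions from the diff but leaves out the leading `final:` shortcut and the instruction-echo filter. `strip_hidden_reasoning` is a hypothetical name and `ModelCallError` is redefined locally as a stand-in, so this is an approximation of the behavior, not the PR's function.

```python
import re


class ModelCallError(RuntimeError):
    """Local stand-in for the project's ModelCallError."""


def strip_hidden_reasoning(text: str) -> str:
    # Drop complete <think>...</think> blocks first.
    cleaned = re.sub(r"(?is)<think>.*?</think>", "", text).strip()
    # If an analysis preamble is followed by a final channel, keep only the final part.
    final_after_analysis = re.search(
        r"(?ims)^\s*(?:analysis|reasoning|assistant_analysis)\s*:?.*?"
        r"^\s*(?:final|assistant_final)\s*:?\s*(.+)\Z",
        cleaned,
    )
    if final_after_analysis:
        cleaned = final_after_analysis.group(1).strip()
    elif re.search(r"(?im)^\s*(?:analysis|reasoning|assistant_analysis)\s*:?", cleaned):
        # Analysis with no final channel: refuse rather than show reasoning as dialogue.
        raise ModelCallError("model returned hidden analysis instead of courtroom dialogue")
    if not cleaned:
        raise ModelCallError("model returned no visible output")
    return cleaned


spoken = strip_hidden_reasoning("<think>weigh exhibits</think>I call the case.")
from_channels = strip_hidden_reasoning(
    "analysis: the record is thin\nfinal: I find the claim unproven."
)
```

One thing worth checking in review: the colon after each channel keyword is optional in these patterns, so a legitimate line of dialogue that begins with a word such as "Reasoning" would also be treated as a leaked channel.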

sovereign_bench/models.py CHANGED

@@ -35,6 +35,7 @@ class CasePacket(BaseModel):
     respondent: str
     charge: str
     setting: str
     claimant_claim: str
     respondent_claim: str
     source_note: str
@@ -45,6 +46,7 @@ class TrialRequest(BaseModel):
     case_id: str = "socrates"
     search_query: str = ""
     hypothetical: str = ""
     speed: Literal["swift", "measured", "ceremonial"] = "swift"
     mind_layer: bool = True

     respondent: str
     charge: str
     setting: str
+    context: str = ""
     claimant_claim: str
     respondent_claim: str
     source_note: str
     case_id: str = "socrates"
     search_query: str = ""
     hypothetical: str = ""
+    custom_case: CasePacket | None = None
     speed: Literal["swift", "measured", "ceremonial"] = "swift"
     mind_layer: bool = True
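
Reviewer sketch for the two new fields: `context` defaults to an empty string so that `_case_summary` in `engine.py` can fall back to `setting`, and `custom_case` is optional so that `resolve_case` can reject a `"custom"` request that arrives without a packet. The snippet below illustrates both behaviors with plain dataclasses. `Packet`, `Request`, `summary_context`, and `require_custom_case` are hypothetical stand-ins for the Pydantic models and engine helpers, chosen so the example runs on the standard library alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:  # hypothetical stand-in for CasePacket
    title: str
    setting: str
    context: str = ""  # new optional field; empty means "use setting"


@dataclass
class Request:  # hypothetical stand-in for TrialRequest
    case_id: str = "socrates"
    custom_case: Optional[Packet] = None  # new optional field


def summary_context(packet: Packet) -> str:
    # Mirrors `packet.context or packet.setting` from _case_summary.
    return packet.context or packet.setting


def require_custom_case(request: Request) -> Packet:
    # Mirrors the guard in resolve_case for case_id == "custom".
    if request.custom_case is None:
        raise RuntimeError(
            "Custom case requires trial details and evidence before the court can begin."
        )
    return request.custom_case


legacy = Packet(title="Demo", setting="An Athenian court")
enriched = Packet(title="Demo", setting="An Athenian court", context="Background from the source text")
```

Keeping both fields optional means cached packets and existing requests created before this change still validate without modification.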

tests/test_cases.py CHANGED

@@ -2,7 +2,15 @@ from sovereign_bench.cases import CASES
 def test_cached_cases_have_evidence():
-    assert {"socrates", "barnaby"} <= set(CASES)
     for case in CASES.values():
         assert len(case.evidence) >= 4
         assert all(item.id and item.excerpt for item in case.evidence)

 def test_cached_cases_have_evidence():
+    assert {"socrates", "greg", "barnaby"} <= set(CASES)
     for case in CASES.values():
         assert len(case.evidence) >= 4
         assert all(item.id and item.excerpt for item in case.evidence)
+def test_demo_cases_have_book_context_and_three_items_per_side():
+    for case_id in ["socrates", "greg"]:
+        case = CASES[case_id]
+        assert case.context
+        assert len([item for item in case.evidence if item.supports == "claimant"]) >= 3
+        assert len([item for item in case.evidence if item.supports == "respondent"]) >= 3

tests/test_engine.py CHANGED

@@ -3,39 +3,36 @@ import re
 import pytest
-from sovereign_bench.engine import JUDGE_NAME, JUROR_PERSONAS, RequiredModelError, run_trial
-from sovereign_bench.llm import ModelCall, ModelResult
-from sovereign_bench.models import TrialRequest
-def _jury_json(evidence_summary: str, vote: str = "liable") -> str:
-    evidence_ids = re.findall(r"^([A-Z]+-E\d+):", evidence_summary, flags=re.M)
-    evidence_ids = (evidence_ids or ["SOC-E1"]) * 6
     return json.dumps(
         {
-            "votes": [
-                {
-                    "juror": name,
-                    "persona": persona,
-                    "vote": vote if idx < 4 else "not_liable",
-                    "reason": f"{name} applies a {persona} lens to exhibit {evidence_ids[idx]}.",
-                    "evidence_ids": [evidence_ids[idx]],
-                }
-                for idx, (name, persona) in enumerate(JUROR_PERSONAS.items())
-            ]
         }
     )
 def fake_model_runner(**kwargs):
     text = (
-        _jury_json(kwargs["evidence_summary"])
-        if kwargs["role"] == "juror vote generator"
         else f"{kwargs['agent']} responds to: {kwargs['task']}"
     )
     prompt = (
         f"SYSTEM:\nFake live model for tests.\n\nUSER:\n"
-        f"Agent: {kwargs['agent']}\nRole: {kwargs['role']}\nTask: {kwargs['task']}\n\nASSISTANT:\n"
     )
     return ModelResult(
         text=text,
@@ -54,12 +51,11 @@ def test_cached_cases_emit_sequential_speaker_order():
     expected_speakers = [
         "Clerk Meridian",
         JUDGE_NAME,
-        "Advocate Auric",
-        "Counsel Sable",
-        "Auditor Prism",
         JUDGE_NAME,
-        "Advocate Auric",
-        "Counsel Sable",
         "Nemotron Jury",
         *list(JUROR_PERSONAS),
         JUDGE_NAME,
@@ -67,7 +63,10 @@ def test_cached_cases_emit_sequential_speaker_order():
     for case_id in ["socrates", "barnaby"]:
         events = run_trial(TrialRequest(case_id=case_id), model_runner=fake_model_runner)
-        assert [event.turns[0].agent for event in events] == expected_speakers
         assert [event.phase for event in events].count("deliberation") == 7
         assert events[0].turns[0].input
         assert "SYSTEM:" in events[0].turns[0].input
@@ -81,12 +80,12 @@ def test_no_event_contains_both_lawyers_speaking_together():
     for event in events:
         agents = {turn.agent for turn in event.turns}
-        assert not {"Advocate Auric", "Counsel Sable"}.issubset(agents)
 def test_juror_vote_events_have_fixed_personas_and_evidence():
     events = run_trial(TrialRequest(case_id="socrates"), model_runner=fake_model_runner)
-    juror_events = [event for event in events if event.turns[0].agent in JUROR_PERSONAS]
     assert len(juror_events) == 6
     for event in juror_events:
@@ -94,6 +93,7 @@ def test_juror_vote_events_have_fixed_personas_and_evidence():
         assert vote.juror == event.turns[0].agent
         assert vote.persona == JUROR_PERSONAS[vote.juror]
         assert vote.vote in {"liable", "not_liable", "uncertain"}
         assert vote.reason
         assert vote.evidence_ids
@@ -102,6 +102,95 @@ def test_juror_vote_events_have_fixed_personas_and_evidence():
     assert [vote.juror for vote in final.votes] == list(JUROR_PERSONAS)
 def test_jury_contract_uses_public_history_personas():
     assert JUDGE_NAME == "Marcus Aurelius"
     assert JUROR_PERSONAS == {
@@ -114,6 +203,94 @@ def test_jury_contract_uses_public_history_personas():
     }
 def test_required_model_failure_stops_trial_without_canned_dialogue():
     def failing_runner(**kwargs):
         return ModelResult(
@@ -136,7 +313,7 @@ def test_required_model_failure_stops_trial_without_canned_dialogue():
 def test_invalid_jury_output_stops_trial_without_fallback_votes():
     def invalid_jury_runner(**kwargs):
         result = fake_model_runner(**kwargs)
-        if kwargs["role"] == "juror vote generator":
             result.text = "the jury refuses structured output"
         return result

 import pytest
+from sovereign_bench.engine import JUDGE_NAME, JUROR_PERSONAS, RequiredModelError, run_trial, stream_trial
+from sovereign_bench.llm import ModelCall, ModelResult, build_role_messages, clean_model_text
+from sovereign_bench.models import CasePacket, EvidenceItem, TrialRequest
+def _juror_json(kwargs, vote: str = "liable") -> str:
+    evidence_ids = re.findall(r"^([A-Z]+-[A-Z]\d+):", kwargs["evidence_summary"], flags=re.M)
+    evidence_id = (evidence_ids or ["SOC-E1"])[0]
     return json.dumps(
         {
+            "juror": kwargs["agent"],
+            "persona": kwargs["persona"],
+            "vote": vote,
+            "reason": f"{kwargs['agent']} applies {kwargs['persona']} to exhibit {evidence_id}.",
+            "evidence_ids": [evidence_id],
         }
     )
 def fake_model_runner(**kwargs):
     text = (
+        _juror_json(kwargs, vote="liable" if list(JUROR_PERSONAS).index(kwargs["agent"]) < 4 else "not_liable")
+        if kwargs["role"] == "juror"
         else f"{kwargs['agent']} responds to: {kwargs['task']}"
     )
     prompt = (
         f"SYSTEM:\nFake live model for tests.\n\nUSER:\n"
+        f"Agent: {kwargs['agent']}\nRole: {kwargs['role']}\n"
+        f"Persona: {kwargs.get('persona', '')}\nObjective: {kwargs.get('objective', '')}\n"
+        f"History: {kwargs.get('trial_history', '')}\nTask: {kwargs['task']}\n\nASSISTANT:\n"
     )
     return ModelResult(
         text=text,
     expected_speakers = [
         "Clerk Meridian",
         JUDGE_NAME,
+        "Mike OSS",
+        "Harvey Vector",
         JUDGE_NAME,
+        "Mike OSS",
+        "Harvey Vector",
         "Nemotron Jury",
         *list(JUROR_PERSONAS),
         JUDGE_NAME,
     for case_id in ["socrates", "barnaby"]:
         events = run_trial(TrialRequest(case_id=case_id), model_runner=fake_model_runner)
+        assert [event.turns[0].agent for event in events if event.turns] == expected_speakers
+        evidence_event = next(event for event in events if event.phase == "evidence")
+        assert evidence_event.title == "The Evidence Record"
+        assert evidence_event.turns == []
         assert [event.phase for event in events].count("deliberation") == 7
         assert events[0].turns[0].input
         assert "SYSTEM:" in events[0].turns[0].input
     for event in events:
         agents = {turn.agent for turn in event.turns}
+        assert not {"Mike OSS", "Harvey Vector"}.issubset(agents)
 def test_juror_vote_events_have_fixed_personas_and_evidence():
     events = run_trial(TrialRequest(case_id="socrates"), model_runner=fake_model_runner)
+    juror_events = [event for event in events if event.turns and event.turns[0].agent in JUROR_PERSONAS]
     assert len(juror_events) == 6
     for event in juror_events:
         assert vote.juror == event.turns[0].agent
         assert vote.persona == JUROR_PERSONAS[vote.juror]
         assert vote.vote in {"liable", "not_liable", "uncertain"}
+        assert event.turns[0].content.startswith("I vote ")
         assert vote.reason
         assert vote.evidence_ids
     assert [vote.juror for vote in final.votes] == list(JUROR_PERSONAS)
+def test_jurors_are_called_independently_with_personas_and_trial_history():
+    calls = []
+    def recording_runner(**kwargs):
+        calls.append(kwargs.copy())
+        return fake_model_runner(**kwargs)
+    run_trial(TrialRequest(case_id="socrates"), model_runner=recording_runner)
+    juror_calls = [call for call in calls if call["role"] == "juror"]
+    assert [call["agent"] for call in juror_calls] == list(JUROR_PERSONAS)
+    assert len(juror_calls) == 6
+    for call in juror_calls:
+        assert call["persona"] == JUROR_PERSONAS[call["agent"]]
+        assert "Claimant Opening" in call["trial_history"]
+        assert "Respondent Opening" in call["trial_history"]
+        assert "The Evidence Record" in call["trial_history"]
+        assert "historical worldview" in call["objective"]
+def test_lawyers_and_judge_receive_trial_history_and_objectives():
+    calls = []
+    def recording_runner(**kwargs):
+        calls.append(kwargs.copy())
+        return fake_model_runner(**kwargs)
+    run_trial(TrialRequest(case_id="socrates"), model_runner=recording_runner)
+    claimant_answer = next(call for call in calls if call["agent"] == "Mike OSS" and "hinge question" in call["task"])
+    respondent_answer = next(call for call in calls if call["agent"] == "Harvey Vector" and "hinge question" in call["task"])
+    verdict_call = next(call for call in calls if call["role"] == "verdict writer")
+    assert "The Hinge Question" in claimant_answer["trial_history"]
+    assert "The Hinge Question" in respondent_answer["trial_history"]
+    assert "most favorable to the claimant" in claimant_answer["objective"]
+    assert "most favorable to the respondent" in respondent_answer["objective"]
+    assert all(name in verdict_call["trial_history"] for name in JUROR_PERSONAS)
+    assert "do not override the jury" in verdict_call["objective"]
+def test_custom_case_context_and_evidence_reach_lawyer_prompts():
+    custom = CasePacket(
+        id="custom",
+        title="Custom Trial",
+        subtitle="Entered by user.",
+        claimant="Claimant",
+        respondent="Respondent",
+        charge="Whether the custom record favors the claimant.",
+        setting="A custom courtroom.",
+        context="A bicycle disappeared after a disputed garage visit.",
+        claimant_claim="The claimant says the visit explains the missing bicycle.",
+        respondent_claim="The respondent says the timing and evidence are ambiguous.",
+        source_note="Custom test packet.",
+        evidence=[
+            EvidenceItem(
+                id="CUS-F1",
+                title="Garage Text",
+                source="Custom",
+                excerpt="The respondent asked to enter the garage.",
+                supports="claimant",
+                reliability=0.65,
+                note="Supports access.",
+            ),
+            EvidenceItem(
+                id="CUS-A1",
+                title="Neighbor Sighting",
+                source="Custom",
+                excerpt="A neighbor saw the bicycle later that day.",
+                supports="respondent",
+                reliability=0.65,
+                note="Supports alternative timing.",
+            ),
+        ],
+    )
+    calls = []
+    def recording_runner(**kwargs):
+        calls.append(kwargs.copy())
+        return fake_model_runner(**kwargs)
+    run_trial(TrialRequest(case_id="custom", custom_case=custom), model_runner=recording_runner)
+    claimant_opening = next(call for call in calls if call["agent"] == "Mike OSS" and call["role"] == "claimant advocate")
+    assert "A bicycle disappeared" in claimant_opening["case_summary"]
+    assert "CUS-F1" in claimant_opening["evidence_summary"]
+    assert "CUS-A1" in claimant_opening["evidence_summary"]
 def test_jury_contract_uses_public_history_personas():
     assert JUDGE_NAME == "Marcus Aurelius"
     assert JUROR_PERSONAS == {
     }
+def test_role_prompt_requires_first_person_in_character_speech():
+    messages = build_role_messages(
+        agent="Harvey Vector",
+        role="respondent advocate",
+        case_summary="A short case summary.",
+        evidence_summary="SOC-E1: A record excerpt.",
+        task="Answer the bench for the respondent.",
+    )
+    system = messages[0]["content"]
+    user = messages[1]["content"]
+    assert "Stay fully in character as the assigned Agent and Role." in system
+    assert "Output only the words this character says aloud in court." in system
+    assert "Do not narrate about yourself in the third person." in system
+    assert "Use the case facts and evidence provided below" in system
+    assert "Speak as Harvey Vector." in user
+    assert "Give only the in-scene court line" in user
+    assert "SOC-E1" in user
+def test_juror_vote_prompt_uses_persona_history_and_json_contract():
+    messages = build_role_messages(
+        agent="Karl Marx",
+        role="juror",
+        case_summary="A short case summary.",
+        evidence_summary="SOC-E1: A record excerpt.",
+        trial_history="Mike OSS argued from SOC-E1.",
+        persona=JUROR_PERSONAS["Karl Marx"],
+        objective="Vote as Karl Marx would after watching the trial.",
+        task="Return one juror vote as JSON.",
+    )
+    system = messages[0]["content"]
+    user = messages[1]["content"]
+    assert "Output only the words this character says aloud in court." not in messages[0]["content"]
+    assert "You are an individual juror." in system
+    assert JUROR_PERSONAS["Karl Marx"] in user
+    assert "Mike OSS argued from SOC-E1." in user
+    assert "Return only the requested JSON object." in user
+def test_model_cleaner_extracts_final_speech_after_analysis_channel():
+    text = clean_model_text(
+        "analysis\nI should reason about the case first.\n\nfinal\nI stand for the respondent, and SOC-E1 leaves doubt."
+    )
+    assert text == "I stand for the respondent, and SOC-E1 leaves doubt."
+    assert "analysis" not in text.lower()
+def test_model_cleaner_rejects_visible_analysis_without_final_speech():
+    def analysis_runner(**kwargs):
+        return ModelResult(
+            text="analysis: I should think through the case before answering.",
+            input_text="SYSTEM:\nanalysis leak",
+            call=ModelCall(
+                model=kwargs["model"],
+                provider=kwargs.get("provider", "test"),
+                ok=True,
+                latency_ms=1,
+                prompt_hash="test-prompt",
+            ),
+        )
+    with pytest.raises(RequiredModelError):
+        next(stream_trial(TrialRequest(case_id="socrates"), model_runner=analysis_runner))
+def test_model_cleaner_removes_instruction_echo_when_dialogue_remains():
+    text = clean_model_text(
+        "I will now announce the case as requested, while maintaining the theatrical but clear tone required. "
+        "I will speak as Clerk Meridian in first person, starting with a pronoun.\n\n"
+        "I call The Polis v. Socrates before this court."
+    )
+    assert text == "I call The Polis v. Socrates before this court."
+def test_model_cleaner_rejects_instruction_echo_without_dialogue():
+    with pytest.raises(Exception, match="echoed instructions"):
+        clean_model_text(
+            "I will now announce the case as requested, while maintaining the theatrical but clear tone required. "
+            "I will speak as Clerk Meridian in first person, starting with a pronoun."
+        )
 def test_required_model_failure_stops_trial_without_canned_dialogue():
     def failing_runner(**kwargs):
         return ModelResult(
 def test_invalid_jury_output_stops_trial_without_fallback_votes():
     def invalid_jury_runner(**kwargs):
         result = fake_model_runner(**kwargs)
+        if kwargs["role"] == "juror":
             result.text = "the jury refuses structured output"
         return result
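
Review note on the juror contract: the juror tests above expect each vote to arrive as a JSON object with `juror`, `persona`, `vote`, `reason`, and `evidence_ids`, and expect free-text output to stop the trial rather than fall back to canned votes. A minimal sketch of such a strict parser follows. `parse_juror_vote` and `InvalidJurorVote` are hypothetical names for illustration only and are not taken from `sovereign_bench/engine.py`; the allowed vote values are copied from the test assertions.

```python
import json

# Vote values accepted by the tests in this diff.
ALLOWED_VOTES = {"liable", "not_liable", "uncertain"}


class InvalidJurorVote(ValueError):
    """Hypothetical error type used only in this sketch."""


def parse_juror_vote(text: str, expected_juror: str) -> dict:
    """Parse one juror reply, raising instead of inventing a fallback vote."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise InvalidJurorVote(f"{expected_juror} did not return JSON") from exc
    if not isinstance(data, dict):
        raise InvalidJurorVote(f"{expected_juror} did not return a JSON object")
    if data.get("juror") != expected_juror:
        raise InvalidJurorVote(f"vote is not attributed to {expected_juror}")
    vote = data.get("vote")
    if not isinstance(vote, str) or vote not in ALLOWED_VOTES:
        raise InvalidJurorVote(f"unsupported vote value: {vote!r}")
    # The tests require a non-empty reason and at least one cited exhibit.
    if not data.get("reason") or not data.get("evidence_ids"):
        raise InvalidJurorVote("vote is missing a reason or evidence ids")
    return data
```

Raising on the first violated field keeps the failure message specific, which matches the "stop the trial" behavior the tests describe better than silently coercing a bad vote to `uncertain`.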

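Review note on the model cleaner: the `clean_model_text` tests in `tests/test_engine.py` above expect text after a `final` channel marker to be returned as the spoken line, and expect visible `analysis` text with no final speech to be rejected. The sketch below illustrates only that channel-splitting behavior under stated assumptions. `extract_final_speech` is a hypothetical helper, not the function in `sovereign_bench/llm.py`, and it does not attempt the instruction-echo filtering that the other cleaner tests cover.

```python
import re

# A line consisting only of "final" (optionally "final:") marks the spoken part.
_FINAL_MARKER = re.compile(r"^final:?[ \t]*$", flags=re.IGNORECASE | re.MULTILINE)


def extract_final_speech(raw: str) -> str:
    """Return the text after the last 'final' marker, or reject leaked analysis."""
    text = raw.strip()
    matches = list(_FINAL_MARKER.finditer(text))
    if matches:
        speech = text[matches[-1].end():].strip()
        if speech:
            return speech
    # No usable final section: refuse output that is visibly analysis.
    if text.lower().startswith("analysis"):
        raise ValueError("model returned analysis without a final speech")
    return text
```

For example, `extract_final_speech("analysis\nthinking\n\nfinal\nI stand for the respondent.")` yields only the last sentence, while a reply that starts with `analysis:` and has no `final` line raises `ValueError`.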
tests/test_ui_rendering.py CHANGED

@@ -1,10 +1,11 @@
 import inspect
 from pathlib import Path
 from PIL import Image
 import app
-from sovereign_bench.models import AgentTurn, EvidenceItem, JurorVote, TrialEvent
 OLD_CARD_CLASSES = [
@@ -71,6 +72,32 @@ def _speaker_event(agent: str, phase: str = "questions") -> TrialEvent:
     )
 def test_lower_tab_renderers_emit_plain_text_classes():
     event = _event_with_lower_tab_data()
     html = "\n".join(
@@ -101,6 +128,12 @@ def test_download_controls_are_not_wired_into_app():
     assert "Download agent trace" not in source
 def test_courtroom_splits_six_jurors_between_side_benches():
     html = app.render_court([_event_with_lower_tab_data()], started=True)
@@ -131,10 +164,15 @@ def test_courtroom_renders_historical_judge_and_juror_assets():
     assert "Marcus Aurelius" in html
     assert "assets/characters/marcus-aurelius.png" in html
     for name, image in app.JUROR_IMAGES.items():
         assert name in html
         assert image in html
     assert html.count("class='juror-portrait'") == 6
 def test_courtroom_renders_foreground_fences_and_judge_table_above_characters():
@@ -146,6 +184,82 @@ def test_courtroom_renders_foreground_fences_and_judge_table_above_characters():
     assert ".foreground-props {\n  position: absolute;\n  inset: 0;\n  z-index: 13;" in app.CSS
     assert ".puppet {\n  --skin: #c99257;" in app.CSS
     assert "z-index: 8;" in app.CSS
 def test_foreground_prop_assets_have_real_transparency():
@@ -161,13 +275,67 @@ def test_foreground_prop_assets_have_real_transparency():
 def test_latest_speaker_sets_stage_class_and_speech_bubble():
-    html = app.render_court([_speaker_event("Advocate Auric", phase="claims")], started=True)
     assert "speaker-auric" in html
-    assert "class='speech-bubble'" in html
-    assert "Advocate Auric has the visible floor." in html
     assert "puppet auric active walking" in html
     assert "puppet sable active" not in html
 def test_individual_juror_can_be_active_speaker():
@@ -199,7 +367,19 @@ def test_individual_juror_can_be_active_speaker():
     assert "speaker-karl-marx" in html
     assert "<a class='juror active'" in html
     assert "Liable. E1 carries the record." in html
 def test_lawyer_movement_css_is_speaker_specific_not_phase_wide():
@@ -209,27 +389,106 @@ def test_lawyer_movement_css_is_speaker_specific_not_phase_wide():
     assert ".phase-opening .puppet.sable" not in app.CSS
-def test_closed_book_is_smaller_and_key_characters_are_lowered():
-    assert ".episode-book.closed {\n  top: 61%;\n  width: min(163px, 20vw);" in app.CSS
-    assert ".puppet.judge {\n  left: 50%;\n  top: 56%;" in app.CSS
     assert ".puppet.auric {\n  left: 24%;\n  top: 87%;" in app.CSS
-    assert ".speaker-auric .puppet.auric {\n  left: 43%;\n  top: 91%;" in app.CSS
-    assert ".puppet.auditor {\n  left: 71%;\n  top: 80%;" in app.CSS
-    assert ".episode-book.closed {\n    top: 750px;\n    width: 140px;" in app.CSS
-    assert ".puppet.judge {\n    top: 717px;" in app.CSS
     assert ".puppet.auric {\n    left: 20%;\n    top: 970px;" in app.CSS
-    assert ".puppet.auditor {\n    left: 78%;\n    top: 860px;" in app.CSS
 def test_run_ui_yields_five_outputs_without_download_status(monkeypatch):
     event = _event_with_lower_tab_data()
     monkeypatch.setattr(app, "get_events", lambda request: iter([event]))
-    outputs = list(app.run_ui("Trial of Socrates", "", "", "swift", True))
     assert outputs
     assert all(len(output) == 5 for output in outputs)
-    assert outputs[1][-1] == "Step 1: Jury weighs the record"
     assert outputs[-1][-1] == "Verdict sealed."
     assert "download" not in outputs[-1][-1].lower()
@@ -241,12 +500,48 @@ def test_run_ui_stops_with_model_unavailable_error(monkeypatch):
     monkeypatch.setattr(app, "get_events", broken_events)
-    outputs = list(app.run_ui("Trial of Socrates", "", "", "swift", True))
     assert outputs[-1][-1] == "Model response required. Trial stopped: Marcus Aurelius unavailable: offline"
     assert "Claimant score" not in outputs[-1][0]
 def test_court_renders_sound_toggle():
     html = app.render_court([])

 import inspect
+import json
 from pathlib import Path
 from PIL import Image
 import app
+from sovereign_bench.models import AgentTurn, EvidenceItem, JurorVote, TrialEvent, Verdict
 OLD_CARD_CLASSES = [
     )
+def _verdict_event(finding: str = "liable") -> TrialEvent:
+    return TrialEvent(
+        phase="verdict",
+        title="The Court Announces Judgment",
+        body="Judgment is announced.",
+        verdict=Verdict(
+            finding=finding,
+            decree="The court enters the final judgment.",
+            rationale="The jury majority decides the record.",
+            evidence_ids=["E1"],
+            uncertainty="Some uncertainty remains.",
+            remedy="Record the judgment.",
+        ),
+        turns=[
+            AgentTurn(
+                agent=app.JUDGE_NAME,
+                role="verdict writer",
+                content="The judgment of the court is guilty.",
+                model="test-model",
+                confidence=0.9,
+                input="SYSTEM:\nAnnounce verdict.",
+            )
+        ],
+    )
 def test_lower_tab_renderers_emit_plain_text_classes():
     event = _event_with_lower_tab_data()
     html = "\n".join(
     assert "Download agent trace" not in source
+def test_case_dropdown_only_exposes_demo_and_custom_cases():
+    assert list(app.CASE_OPTIONS) == ["Trial of Socrates", "Greg Heffley vs Mom", "Custom"]
+    assert "The People v. Barnaby Buttons" not in app.CASE_OPTIONS
+    assert "Live Search Tribunal" not in app.CASE_OPTIONS
 def test_courtroom_splits_six_jurors_between_side_benches():
     html = app.render_court([_event_with_lower_tab_data()], started=True)
     assert "Marcus Aurelius" in html
     assert "assets/characters/marcus-aurelius.png" in html
+    assert "<img class='puppet-portrait' src='/gradio_api/file=assets/characters/marcus-aurelius.png'" in html
+    assert ".puppet.judge::before,\n.puppet.judge::after {\n  display: none;\n}" in app.CSS
+    assert ".puppet.judge .mouth {\n  display: none;\n}" in app.CSS
     for name, image in app.JUROR_IMAGES.items():
         assert name in html
         assert image in html
     assert html.count("class='juror-portrait'") == 6
+    assert "class='juror-face'" not in html
+    assert "class='juror-body'" not in html
 def test_courtroom_renders_foreground_fences_and_judge_table_above_characters():
     assert ".foreground-props {\n  position: absolute;\n  inset: 0;\n  z-index: 13;" in app.CSS
     assert ".puppet {\n  --skin: #c99257;" in app.CSS
     assert "z-index: 8;" in app.CSS
+    assert ".puppet.clerk {\n  left: 43%;\n  top: 66%;\n  z-index: 14;" in app.CSS
+def test_trial_progress_defaults_to_pretrial_and_renders_all_stages():
+    html = app.render_court([])
+    assert "class='trial-progress'" in html
+    assert "data-phase='pretrial' aria-current='step'" in html
+    for _key, label in app.TRIAL_PROGRESS_STAGES:
+        assert label in html
+def test_trial_progress_marks_questions_current():
+    html = app.render_court([_speaker_event("Mike OSS", phase="questions")], started=True)
+    assert "class='trial-progress-segment current' data-phase='questions' aria-current='step'" in html
+    assert "data-phase='evidence'" in html
+def test_trial_progress_marks_deliberation_current():
+    html = app.render_court([_event_with_lower_tab_data()], started=True)
+    assert "class='trial-progress-segment current' data-phase='deliberation' aria-current='step'" in html
+    assert "class='trial-progress-segment complete' data-phase='questions'" in html
+def test_trial_progress_marks_verdict_current_and_complete():
+    html = app.render_court([_speaker_event(app.JUDGE_NAME, phase="verdict")], started=True)
+    assert "class='trial-progress-segment current complete' data-phase='verdict' aria-current='step'" in html
+    assert "class='trial-progress-segment complete' data-phase='deliberation'" in html
+def test_verdict_popup_renders_only_when_final_verdict_is_revealed():
+    event = _verdict_event("liable")
+    announcement = app.render_court([event], started=True)
+    sealed = app.render_court([event], started=True, show_verdict_popup=True)
+    assert "class='speech-bubble active-dialogue speaker-judge'" in announcement
+    assert "class='verdict-popup'" not in announcement
+    assert "class='speech-bubble active-dialogue speaker-judge'" in sealed
+    assert "class='verdict-popup'" in sealed
+    assert "data-finding='liable'" in sealed
+    assert "Verdict: Guilty" in sealed
+def test_run_ui_reveals_verdict_popup_after_judge_speech(monkeypatch):
+    event = _verdict_event("not_liable")
+    monkeypatch.setattr(app, "get_events", lambda request: iter([event]))
+    monkeypatch.setattr(app, "_reading_duration", lambda text: 0)
+    outputs = list(app.run_ui("Trial of Socrates", "", "", "", "swift", True))
+    assert "class='speech-bubble active-dialogue speaker-judge'" in outputs[1][0]
+    assert "class='verdict-popup'" not in outputs[1][0]
+    assert outputs[-1][-1] == "Verdict sealed."
+    assert "class='verdict-popup'" in outputs[-1][0]
+    assert "Verdict: Not Guilty" in outputs[-1][0]
+def test_trial_progress_ignores_unknown_phase_without_extra_segment():
+    html = app.render_court([_speaker_event("Clerk Meridian", phase="appeal")], started=True)
+    assert "class='trial-progress'" in html
+    assert html.count("class='trial-progress-segment") == len(app.TRIAL_PROGRESS_STAGES)
+    assert "aria-current='step'" not in html
+    assert "class='trial-progress-segment' data-phase='appeal'" not in html
+def test_trial_progress_css_is_fixed_and_translucent_theme_matched():
+    assert ".trial-progress {\n  position: fixed;\n  top: 0;" in app.CSS
+    assert "background: rgba(23, 13, 8, .58);" in app.CSS
+    assert "backdrop-filter: blur(8px);" in app.CSS
+    assert "background: #ffd675;" in app.CSS
+    assert ".trial-progress-abbrev {\n    display: inline;" in app.CSS
 def test_foreground_prop_assets_have_real_transparency():
 def test_latest_speaker_sets_stage_class_and_speech_bubble():
+    html = app.render_court([_speaker_event("Mike OSS", phase="claims")], started=True)
     assert "speaker-auric" in html
+    assert "class='speech-bubble active-dialogue speaker-auric'" in html
+    assert "data-speaker='Mike OSS'" in html
+    assert "<strong>Mike OSS</strong>" in html
+    assert "test speaker" in html
+    assert "Mike OSS has the visible floor." in html
     assert "puppet auric active walking" in html
     assert "puppet sable active" not in html
+    assert html.count("class='speech-bubble") == 1
+    assert html.find("class='foreground-props'") < html.find("class='speech-bubble active-dialogue")
+    assert ".speech-bubble.active-dialogue,\n.speech-bubble.active-dialogue * {\n  color: #141413 !important;\n}" in app.CSS
+    assert "border: 2px solid #141413;" in app.CSS
+    assert "font-size: 12px;" in app.CSS
+def test_speech_bubble_uses_full_turn_content_not_event_body():
+    long_text = " ".join(["The record speaks plainly"] * 18) + " with a final visible phrase."
+    event = TrialEvent(
+        phase="questions",
+        title="Counsel answers",
+        body="Narration only, not spoken dialogue.",
+        turns=[
+            AgentTurn(
+                agent="Harvey Vector",
+                role="respondent advocate",
+                content=long_text,
+                model="test-model",
+                confidence=0.9,
+            )
+        ],
+    )
+    html = app.render_court([event], started=True)
+    bubble = html[html.index("<div class='speech-bubble") : html.index("<div class='gallery-benches")]
+    assert "with a final visible phrase." in bubble
+    assert "Narration only" not in bubble
+    assert "..." not in bubble
+def test_pending_speaker_renders_single_preparing_bubble():
+    pending = app.SpeakerCue(
+        name="Harvey Vector",
+        role="respondent advocate",
+        text="Harvey Vector is preparing a response.",
+        pending=True,
+    )
+    html = app.render_court([], started=True, pending_speaker=pending)
+    assert "class='speech-bubble active-dialogue speaker-sable pending'" in html
+    assert "data-pending='true'" in html
+    assert "Harvey Vector is preparing a response." in html
+    assert "puppet sable active walking" in html
+    assert html.count("class='speech-bubble") == 1
+def test_reading_duration_scales_with_words_and_caps():
+    assert app._reading_duration("short line") == app.MIN_READ_SECONDS
+    assert app._reading_duration("word " * 18) > app.MIN_READ_SECONDS
+    assert app._reading_duration("word " * 200) == app.MAX_READ_SECONDS
 def test_individual_juror_can_be_active_speaker():
     assert "speaker-karl-marx" in html
     assert "<a class='juror active'" in html
+    assert "class='speech-bubble active-dialogue speaker-karl-marx juror-dialogue'" in html
     assert "Liable. E1 carries the record." in html
+    assert html.count("class='speech-bubble") == 1
+def test_juror_speech_bubbles_anchor_above_side_benches():
+    assert ".speech-bubble.active-dialogue.juror-dialogue {\n  top: 42%;" in app.CSS
+    assert ".speech-bubble.active-dialogue.speaker-karl-marx,\n.speech-bubble.active-dialogue.speaker-john-stuart-mill,\n.speech-bubble.active-dialogue.speaker-confucius {\n  left: 1.5%;" in app.CSS
+    assert ".speech-bubble.active-dialogue.speaker-cleopatra-vii,\n.speech-bubble.active-dialogue.speaker-niccolo-machiavelli,\n.speech-bubble.active-dialogue.speaker-jensen-huang {\n  right: 1.5%;" in app.CSS
+    assert "--bubble-tail-x: 19%;" in app.CSS
+    assert "--bubble-tail-x: 81%;" in app.CSS
+    assert ".speech-bubble.active-dialogue.juror-dialogue,\n  .speech-bubble.active-dialogue.speaker-karl-marx" in app.CSS
+    assert "top: 500px;" in app.CSS
 def test_lawyer_movement_css_is_speaker_specific_not_phase_wide():
     assert ".phase-opening .puppet.sable" not in app.CSS
+def test_closed_book_and_key_characters_align_with_judge_table():
+    assert ".episode-book {\n  position: absolute;\n  left: 50%;\n  top: 122px;\n  z-index: 14;" in app.CSS
+    assert "width: min(980px, calc(100% - 32px));" in app.CSS
+    assert ".episode-book.closed {\n  top: 50%;\n  width: min(163px, 20vw);" in app.CSS
+    assert ".foreground-fence {\n  bottom: -6.5%;\n  width: 47%;" in app.CSS
+    assert ".judge-table-foreground {\n  left: 50%;\n  top: 20%;\n  z-index: 1;\n  width: 39.1%;" in app.CSS
+    assert ".puppet.judge {\n  left: 50%;\n  top: calc(40% + 156px);" in app.CSS
     assert ".puppet.auric {\n  left: 24%;\n  top: 87%;" in app.CSS
+    assert ".speaker-auric .puppet.auric {\n  left: 43%;\n  top: 87%;" in app.CSS
+    assert ".puppet.sable {\n  left: 75%;\n  top: 87%;" in app.CSS
+    assert ".speaker-sable .puppet.sable {\n  left: 75%;\n  top: 87%;" in app.CSS
+    assert ".puppet.clerk {\n  left: 43%;\n  top: 66%;" in app.CSS
+    assert ".puppet.auditor" not in app.CSS
+    assert ".episode-book.closed {\n    top: 640px;\n    width: 140px;" in app.CSS
+    assert ".episode-book {\n    top: 218px;\n    width: min(680px, calc(100% - 20px));" in app.CSS
+    assert ".foreground-fence {\n    bottom: -66px;\n    width: 64%;" in app.CSS
+    assert ".judge-table-foreground {\n    top: 213px;\n    width: 646px;" in app.CSS
     assert ".puppet.auric {\n    left: 20%;\n    top: 970px;" in app.CSS
+    assert ".puppet.sable {\n    left: 80%;\n    top: 970px;" in app.CSS
+    assert ".speaker-sable .puppet.sable {\n    left: 80%;\n    top: 970px;" in app.CSS
+    assert ".puppet.judge {\n    top: 576px;" not in app.CSS
+    assert ".puppet.sable {\n    left: 80%;\n    top: 640px;" not in app.CSS
+    assert ".speaker-sable .puppet.sable {\n    left: 80%;\n    top: 640px;" not in app.CSS
+    assert ".puppet.clerk {\n    left: 35%;\n    top: 880px;" in app.CSS
+    assert ".speech-bubble.active-dialogue.speaker-auditor" not in app.CSS
+def test_open_docket_book_renders_text_above_book_art():
+    html = app.render_court([])
+    assert "class='episode-book'" in html
+    assert "class='book-open-content'" in html
+    assert "Trial details" in html
+    assert "Evidence" in html
+def test_greg_case_preview_uses_cached_context_and_evidence_columns():
+    html = app.render_case_preview("Greg Heffley vs Mom")
+    assert "Greg Heffley v. Mom" in html
+    assert "diary" in html
+    assert "Evidence for Greg Heffley" in html
+    assert "Evidence for Susan Heffley" in html
+def test_custom_case_preview_renders_fillable_book_fields():
+    html = app.render_case_preview("Custom")
+    assert "episode-book custom-book" in html
+    assert "book-context-field" in html
+    assert html.count("book-claimant-field") == 3
+    assert html.count("book-respondent-field") == 3
+def test_custom_payload_builds_trial_request_packet(monkeypatch):
+    captured = {}
+    def fake_events(request):
+        captured["request"] = request
+        return iter([_event_with_lower_tab_data()])
+    monkeypatch.setattr(app, "get_events", fake_events)
+    monkeypatch.setattr(app, "_reading_duration", lambda text: 0)
+    payload = json.dumps(
+        {
+            "context": "A missing bicycle is traced to a disputed garage visit.",
+            "claimant_evidence": ["Garage text", "", "Scuffed tire mark"],
+            "respondent_evidence": ["Neighbor saw bike later", "", ""],
+        }
+    )
+    outputs = list(app.run_ui("Custom", "", "", payload, "swift", True))
+    assert outputs[-1][-1] == "Verdict sealed."
+    request = captured["request"]
+    assert request.case_id == "custom"
+    assert request.custom_case is not None
+    assert request.custom_case.context.startswith("A missing bicycle")
+    assert [item.supports for item in request.custom_case.evidence] == ["claimant", "claimant", "respondent"]
+def test_custom_payload_requires_context_and_both_evidence_sides():
+    payload = json.dumps({"context": "", "claimant_evidence": ["Only one side"], "respondent_evidence": []})
+    outputs = list(app.run_ui("Custom", "", "", payload, "swift", True))
+    assert outputs[-1][-1] == "Custom requires a trial details paragraph."
 def test_run_ui_yields_five_outputs_without_download_status(monkeypatch):
     event = _event_with_lower_tab_data()
     monkeypatch.setattr(app, "get_events", lambda request: iter([event]))
+    monkeypatch.setattr(app, "_reading_duration", lambda text: 0)
+    outputs = list(app.run_ui("Trial of Socrates", "", "", "", "swift", True))
     assert outputs
     assert all(len(output) == 5 for output in outputs)
+    assert outputs[0][-1] == "Clerk Meridian is preparing their response."
+    assert outputs[1][-1] == "Step 1: Nemotron Jury - Jury weighs the record"
     assert outputs[-1][-1] == "Verdict sealed."
     assert "download" not in outputs[-1][-1].lower()
     monkeypatch.setattr(app, "get_events", broken_events)
+    outputs = list(app.run_ui("Trial of Socrates", "", "", "", "swift", True))
     assert outputs[-1][-1] == "Model response required. Trial stopped: Marcus Aurelius unavailable: offline"
     assert "Claimant score" not in outputs[-1][0]
+def test_remote_events_uses_default_modal_endpoint_without_local_token(monkeypatch):
+    captured = {}
+    class FakeResponse:
+        def __enter__(self):
+            return self
+        def __exit__(self, exc_type, exc, traceback):
+            return False
+        def raise_for_status(self):
+            return None
+        def iter_lines(self):
+            event = _speaker_event("Clerk Meridian", phase="intake")
+            yield json.dumps(event.model_dump())
+    def fake_stream(method, endpoint, json, timeout):
+        captured["method"] = method
+        captured["endpoint"] = endpoint
+        captured["payload"] = json
+        captured["timeout"] = timeout
+        return FakeResponse()
+    monkeypatch.delenv("MODAL_TRIAL_URL", raising=False)
+    monkeypatch.delenv("HF_TOKEN", raising=False)
+    monkeypatch.setattr(app.httpx, "stream", fake_stream)
+    event = next(app.get_events(app.TrialRequest(case_id="socrates"), delay=0.0))
+    assert captured["method"] == "POST"
+    assert captured["endpoint"] == app.DEFAULT_MODAL_TRIAL_URL
+    assert captured["timeout"] == 900.0
+    assert event.turns[0].agent == "Clerk Meridian"
 def test_court_renders_sound_toggle():
     html = app.render_court([])