Update Judge-GPT code and README
#3
by AliIqbal05 - opened
- README.md +123 -55
- app.py +858 -160
- modal_app.py +22 -3
- sovereign_bench/cases.py +157 -24
- sovereign_bench/engine.py +243 -179
- sovereign_bench/llm.py +92 -5
- sovereign_bench/models.py +2 -0
- tests/test_cases.py +9 -1
- tests/test_engine.py +205 -28
- tests/test_ui_rendering.py +310 -15
README.md
CHANGED
|
@@ -9,97 +9,165 @@ app_file: app.py
|
|
| 9 |
pinned: false
|
| 10 |
license: mit
|
| 11 |
short_description: AI-native miniature trials under 32B.
|
| 12 |
---
|
| 13 |
|
| 14 |
# Judge-GPT
|
| 15 |
|
| 16 |
-
Judge-GPT is a cinematic Gradio
|
| 17 |
|
| 18 |
-
The
|
| 19 |
|
| 20 |
-
|
| 21 |
-
- `openbmb/AgentCPM-Explore` for clerk/stage/verdict style.
|
| 22 |
-
- `nvidia/Nemotron-Orchestrator-8B` for juror and evidence-auditor review.
|
| 23 |
|
| 24 |
-
|
| 25 |
|
| 26 |
-
## What
|
| 27 |
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
- Retain decree and agent-trace export helpers in `sovereign_bench/export.py` for future UI restoration.
|
| 36 |
|
| 37 |
-
##
|
| 38 |
|
| 39 |
-
-
|
| 40 |
-
-
|
| 41 |
-
-
|
| 42 |
-
-
|
| 43 |
-
-
|
| 44 |
-
-
|
| 45 |
|
| 46 |
-
## Run
|
| 47 |
|
| 48 |
```powershell
|
| 49 |
python -m pip install -r requirements.txt
|
| 50 |
python app.py
|
| 51 |
```
|
| 52 |
|
| 53 |
-
|
| 54 |
|
| 55 |
-
|
|
|
|
|
|
|
| 56 |
|
| 57 |
-
|
| 58 |
|
| 59 |
```powershell
|
| 60 |
python -m modal deploy modal_app.py
|
| 61 |
```
|
| 62 |
|
| 63 |
-
|
| 64 |
|
| 65 |
-
|
|
|
|
|
|
|
| 66 |
|
| 67 |
-
|
| 68 |
|
| 69 |
-
|
| 70 |
-
|
| 71 |
-
|
| 72 |
|
| 73 |
-
##
|
| 74 |
|
| 75 |
-
|
| 76 |
|
| 77 |
-
|
| 78 |
-
-
|
| 79 |
-
|
| 80 |
|
| 81 |
-
|
| 82 |
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
| 87 |
-
|
| 88 |
|
| 89 |
-
##
|
| 90 |
|
| 91 |
-
|
| 92 |
-
-
|
| 93 |
-
|
| 94 |
-
- `sovereign_bench/retrieval.py`: live search packet construction.
|
| 95 |
-
- `sovereign_bench/models.py`: Pydantic schemas for cases, evidence, events, turns, votes, and verdicts.
|
| 96 |
-
- `sovereign_bench/cases.py`: cached demo case packets.
|
| 97 |
-
- `sovereign_bench/export.py`: dormant decree and trace writers.
|
| 98 |
-
- `modal_app.py`: Modal deployment and GPU-backed streaming endpoint.
|
| 99 |
-
- `tests/`: engine, case, and rendering regression coverage.
|
| 100 |
|
| 101 |
-
|
| 102 |
|
| 103 |
```powershell
|
| 104 |
-
python -m
|
| 105 |
```
|
| 9 |
pinned: false
|
| 10 |
license: mit
|
| 11 |
short_description: AI-native miniature trials under 32B.
|
| 12 |
+
tags:
|
| 13 |
+
- track:wood
|
| 14 |
+
- sponsor:openai
|
| 15 |
+
- sponsor:nvidia
|
| 16 |
+
- sponsor:modal
|
| 17 |
+
- achievement:offbrand
|
| 18 |
+
- achievement:fieldnotes
|
| 19 |
---
|
| 20 |
|
| 21 |
# Judge-GPT
|
| 22 |
|
| 23 |
+
Judge-GPT is a cinematic Gradio courtroom for the Build Small Hackathon's Thousand Token Wood track. It turns a compact evidence packet into a two-minute AI-native trial: a clerk opens the docket, two lawyers argue opposite sides, Marcus Aurelius presides, six fixed-perspective jurors vote, and the court seals a verdict.
|
| 24 |
|
| 25 |
+
The point is not legal advice. It is a small-model theater for structured disagreement: evidence is visible, roles are constrained, hidden reasoning is stripped, and every trial leaves a trace of which agent said what.
|
| 26 |
|
| 27 |
+
## Submission Links
|
|
|
|
|
|
|
| 28 |
|
| 29 |
+
- Hugging Face Space: https://huggingface.co/spaces/build-small-hackathon/JudgeGPT
|
| 30 |
+
- Demo video: https://drive.google.com/drive/folders/10pWJ7NVCsnVV7wOlqm4MGWg4Kmh4rMY2?usp=sharing
|
| 31 |
+
- Social post: TODO paste final public social post URL
|
| 32 |
+
- GitHub repo: https://github.com/aliiqbal24/BuildSmallfinal
|
| 33 |
+
- Field guide validator: https://build-small-hackathon-field-guide.hf.space/submit
|
| 34 |
|
| 35 |
+
## What Judges Should Try
|
| 36 |
|
| 37 |
+
1. Open the Space and keep the default `Trial of Socrates`.
|
| 38 |
+
2. Click `Begin Trial`.
|
| 39 |
+
3. Watch the courtroom progress from intake to verdict.
|
| 40 |
+
4. Hover the judge, clerk, lawyers, and jurors to inspect model/agent threads.
|
| 41 |
+
5. Open the `Evidence Drawer` and `Juror Panel` tabs after the verdict.
|
| 42 |
+
6. Try `Greg Heffley vs Mom` for a lighter family-court case.
|
| 43 |
+
7. Try `Custom` to write a short dispute and up to three pieces of evidence per side directly into the docket book.
|
|
|
|
| 44 |
|
| 45 |
+
## Why It Fits Build Small
|
| 46 |
|
| 47 |
+
- **Thousand Token Wood:** the app is whimsical, theatrical, and AI-native rather than a generic chatbot.
|
| 48 |
+
- **Best Use of Codex:** Codex was used throughout implementation, debugging, UI iteration, tests, and commit prep in the connected GitHub repo.
|
| 49 |
+
- **Nemotron Hardware Prize:** Nemotron is a core runtime model for the jury and juror vote generation.
|
| 50 |
+
- **Best Use of Modal:** the Gradio Space delegates live model inference to a Modal GPU streaming endpoint.
|
| 51 |
+
- **Off-Brand:** the UI pushes past stock Gradio with a custom courtroom, animated puppets, docket book, evidence props, audio cues, and verdict staging.
|
| 52 |
+
- **Field Notes:** this README documents the build idea, model choices, runtime architecture, limitations, and submission checklist.
|
| 53 |
+
|
| 54 |
+
## Small-Model Budget
|
| 55 |
+
|
| 56 |
+
Every named model is under the 32B parameter cap.
|
| 57 |
+
|
| 58 |
+
| Role | Model | Budgeted size | Used for |
|
| 59 |
+
| --- | --- | ---: | --- |
|
| 60 |
+
| Presiding advocate | `openai/gpt-oss-20b` | 20B | Judge, claimant lawyer, respondent lawyer, verdict voice |
|
| 61 |
+
| Clerk of style | `openbmb/AgentCPM-Explore` | 4B | Clerk/stage voice |
|
| 62 |
+
| Jury ring | `nvidia/Nemotron-Orchestrator-8B` | 8B | Jury panel and six juror votes |
|
| 63 |
+
|
| 64 |
+
Displayed aggregate budget: 32B. The app does not use a model above 32B.
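
The budget arithmetic above can be checked with a minimal sketch. The sizes are the "Budgeted size" values from the table; the dictionary and function names are illustrative assumptions and do not come from `sovereign_bench`.

```python
# Budgeted sizes in billions of parameters, copied from the table above.
MODEL_BUDGET_B = {
    "openai/gpt-oss-20b": 20,
    "openbmb/AgentCPM-Explore": 4,
    "nvidia/Nemotron-Orchestrator-8B": 8,
}

# Hackathon cap in billions of parameters, as stated in the README.
PARAMETER_CAP_B = 32


def aggregate_budget(models: dict[str, int]) -> int:
    """Sum the budgeted sizes of all named models."""
    return sum(models.values())


def models_over_cap(models: dict[str, int], cap: int = PARAMETER_CAP_B) -> list[str]:
    """Return the names of models whose budgeted size exceeds the cap."""
    return [name for name, size in models.items() if size > cap]


if __name__ == "__main__":
    print("aggregate:", aggregate_budget(MODEL_BUDGET_B))
    print("over cap:", models_over_cap(MODEL_BUDGET_B))
```

With the table's values, 20 + 4 + 8 gives the displayed aggregate of 32B, and no single model exceeds the cap.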
|
| 65 |
+
|
| 66 |
+
## How It Works
|
| 67 |
+
|
| 68 |
+
Judge-GPT runs a deterministic courtroom sequence over a `CasePacket`:
|
| 69 |
+
|
| 70 |
+
1. Clerk opens the docket.
|
| 71 |
+
2. Judge frames the dispute.
|
| 72 |
+
3. Mike OSS argues for the claimant.
|
| 73 |
+
4. Harvey Vector argues for the respondent.
|
| 74 |
+
5. The evidence record is displayed without adding a third lawyer.
|
| 75 |
+
6. The judge asks a hinge question.
|
| 76 |
+
7. Each lawyer answers from their side.
|
| 77 |
+
8. Nemotron Jury retires the panel.
|
| 78 |
+
9. Six named jurors vote from distinct worldviews.
|
| 79 |
+
10. The judge announces the final verdict.
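
The ten steps above can be sketched as a fixed, ordered list. This is only an illustration of the sequence as the README describes it, not the code in `sovereign_bench/engine.py`: the `Step` class, the `"Clerk"` and `"Court record"` speaker labels, and the placeholder juror names are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    phase: int     # step number from the README list
    speaker: str   # who is on stage for this step
    action: str    # what happens


# Names taken from the README; the juror names are placeholders (assumption).
JUDGE = "Marcus Aurelius"
LAWYERS = ("Mike OSS", "Harvey Vector")
PLACEHOLDER_JURORS = tuple(f"Juror {i}" for i in range(1, 7))


def trial_steps(jurors: tuple[str, ...] = PLACEHOLDER_JURORS) -> list[Step]:
    """Build the deterministic courtroom sequence in README order."""
    steps = [
        Step(1, "Clerk", "opens the docket"),
        Step(2, JUDGE, "frames the dispute"),
        Step(3, LAWYERS[0], "argues for the claimant"),
        Step(4, LAWYERS[1], "argues for the respondent"),
        Step(5, "Court record", "displays the evidence record"),
        Step(6, JUDGE, "asks a hinge question"),
    ]
    # Step 7: each lawyer answers from their side.
    steps += [Step(7, lawyer, "answers the hinge question") for lawyer in LAWYERS]
    steps.append(Step(8, "Nemotron Jury", "retires the panel"))
    # Step 9: one vote per juror.
    steps += [Step(9, juror, "votes") for juror in jurors]
    steps.append(Step(10, JUDGE, "announces the final verdict"))
    return steps


if __name__ == "__main__":
    for step in trial_steps():
        print(f"{step.phase:>2}. {step.speaker} {step.action}")
```

Because the order is fixed in advance, only the text of each turn depends on the models; the sequence of speakers does not.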
|
| 80 |
+
|
| 81 |
+
The shipped demo cases are:
|
| 82 |
+
|
| 83 |
+
- `The Polis v. Socrates`
|
| 84 |
+
- `Greg Heffley v. Mom`
|
| 85 |
+
- `Custom`, built from the docket-book fields in the UI
|
| 86 |
+
|
| 87 |
+
## Runtime Architecture
|
| 88 |
+
|
| 89 |
+
- `app.py` renders the Gradio UI, courtroom HTML/CSS, audio hooks, case preview book, and live event stream.
|
| 90 |
+
- `sovereign_bench/engine.py` orchestrates trial phases, model calls, evidence events, jury votes, verdict assembly, and trace metadata.
|
| 91 |
+
- `sovereign_bench/llm.py` builds role prompts, calls Hugging Face-compatible chat models, and rejects hidden reasoning or instruction echoes.
|
| 92 |
+
- `sovereign_bench/cases.py` contains the cached demo case packets.
|
| 93 |
+
- `modal_app.py` hosts the GPU-backed streaming endpoint used by the Space.
|
| 94 |
+
- `tests/` contains engine, case, and rendering regression tests.
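
The `sovereign_bench/llm.py` entry above mentions rejecting hidden reasoning and instruction echoes. One way such a filter could look is sketched below. The `<think>` tag convention, the marker phrases, and every name in the block are assumptions, since this diff excerpt does not show the actual implementation.

```python
import re

# Assumption: hidden reasoning arrives wrapped in <think>...</think> tags.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)

# Assumption: phrases that suggest the model echoed its own instructions.
_ECHO_MARKERS = ("system prompt", "as an ai language model", "my instructions")


class RejectedReply(ValueError):
    """Raised when a model reply should not be shown in the courtroom."""


def visible_reply(raw: str) -> str:
    """Strip hidden reasoning and reject replies that leak instructions."""
    text = _THINK_BLOCK.sub("", raw).strip()
    if "<think>" in text.lower():
        # An opening tag with no close means the reasoning was cut off.
        raise RejectedReply("unterminated hidden-reasoning block")
    if not text:
        raise RejectedReply("reply contained no visible text")
    lowered = text.lower()
    for marker in _ECHO_MARKERS:
        if marker in lowered:
            raise RejectedReply(f"reply echoes instructions: {marker!r}")
    return text
```

Raising instead of silently repairing the reply matches the limitation stated later in the README: failures stop the trial instead of substituting fake dialogue.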
|
| 95 |
+
|
| 96 |
+
The Gradio app uses `MODAL_TRIAL_URL` when it is set; otherwise it falls back to the built-in deployed Modal endpoint. The Modal app reads the Hugging Face token from a Modal secret named `huggingface`, so no real credentials are committed.
|
| 97 |
|
| 98 |
+
## Run Locally
|
| 99 |
|
| 100 |
```powershell
|
| 101 |
python -m pip install -r requirements.txt
|
| 102 |
python app.py
|
| 103 |
```
|
| 104 |
|
| 105 |
+
Open:
|
| 106 |
|
| 107 |
+
```text
|
| 108 |
+
http://127.0.0.1:7860
|
| 109 |
+
```
|
| 110 |
|
| 111 |
+
## Deploy Modal Backend
|
| 112 |
|
| 113 |
```powershell
|
| 114 |
python -m modal deploy modal_app.py
|
| 115 |
```
|
| 116 |
|
| 117 |
+
After deployment, pre-warm every configured courtroom model in the deployed `sovereign-bench` app so the first trial does not wait for all GPU containers to cold start. Run this after each deploy because deployments reset Modal autoscaler overrides:
|
| 118 |
|
| 119 |
+
```powershell
|
| 120 |
+
python -m modal run modal_app.py::warm_models
|
| 121 |
+
```
|
| 122 |
|
| 123 |
+
If the endpoint changes, set the Hugging Face Space variable:
|
| 124 |
|
| 125 |
+
```text
|
| 126 |
+
MODAL_TRIAL_URL=https://your-modal-endpoint.example
|
| 127 |
+
```
|
| 128 |
|
| 129 |
+
## Deploy Hugging Face Space
|
| 130 |
|
| 131 |
+
Create or upload this repo as a Gradio Space inside the official Build Small org:
|
| 132 |
|
| 133 |
+
```text
|
| 134 |
+
build-small-hackathon/<your-space-name>
|
| 135 |
+
```
|
| 136 |
|
| 137 |
+
Space settings:
|
| 138 |
|
| 139 |
+
- SDK: Gradio
|
| 140 |
+
- App file: `app.py`
|
| 141 |
+
- Python requirements: `requirements.txt`
|
| 142 |
+
- Optional variable: `MODAL_TRIAL_URL`
|
| 143 |
+
- No Space secret is required if using the hosted Modal endpoint.
|
| 144 |
|
| 145 |
+
## Verification
|
| 146 |
|
| 147 |
+
```powershell
|
| 148 |
+
python -m pytest
|
| 149 |
+
```
|
| 150 |
|
| 151 |
+
Focused checks used during final prep:
|
| 152 |
|
| 153 |
```powershell
|
| 154 |
+
python -m pytest tests/test_engine.py tests/test_ui_rendering.py
|
| 155 |
```
|
| 156 |
+
|
| 157 |
+
## Limitations
|
| 158 |
+
|
| 159 |
+
- Judge-GPT is not legal advice and should not be used for real legal decisions.
|
| 160 |
+
- The demo packets are compact, staged evidence packets, not exhaustive source research.
|
| 161 |
+
- Model, Modal, or retrieval failures stop the current trial instead of substituting fake dialogue.
|
| 162 |
+
- Trial results are not persisted across sessions.
|
| 163 |
+
- Custom trials require a short case context and evidence from both sides.
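
The custom-trial requirements (a short case context, evidence from both sides, and at most three pieces of evidence per side) could be checked with a sketch like the one below. The function name, argument names, and error messages are assumptions for illustration; the actual validation in the app is not shown in this excerpt.

```python
MAX_EVIDENCE_PER_SIDE = 3  # "up to three pieces of evidence per side"


def validate_custom_case(context: str, claimant: list[str], respondent: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the case can proceed."""
    problems = []
    if not context.strip():
        problems.append("case context is required")
    for side, items in (("claimant", claimant), ("respondent", respondent)):
        # Ignore blank fields left empty in the docket book.
        filled = [item for item in items if item.strip()]
        if not filled:
            problems.append(f"{side} needs at least one piece of evidence")
        elif len(filled) > MAX_EVIDENCE_PER_SIDE:
            problems.append(f"{side} has more than {MAX_EVIDENCE_PER_SIDE} pieces of evidence")
    return problems


if __name__ == "__main__":
    print(validate_custom_case("Who ate the pie?", ["Crumbs on the plate"], ["Was at school"]))
```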
|
| 164 |
+
|
| 165 |
+
## Final Submission Checklist
|
| 166 |
+
|
| 167 |
+
- [ ] Push the repo to the Build Small Hugging Face org as a Gradio Space.
|
| 168 |
+
- [ ] Confirm the Space launches and can complete `Trial of Socrates`.
|
| 169 |
+
- [ ] Record a short demo video showing the trial flow and verdict.
|
| 170 |
+
- [ ] Confirm the `Demo video` link above points to the final public URL.
|
| 171 |
+
- [ ] Publish one social post about the app.
|
| 172 |
+
- [ ] Replace the `Social post` TODO above with the final public URL.
|
| 173 |
+
- [ ] Run the README through the Build Small validator.
|
app.py
CHANGED
|
@@ -2,13 +2,18 @@ from __future__ import annotations
|
|
| 2 |
|
| 3 |
import json
|
| 4 |
import os
|
| 5 |
from collections.abc import Iterable
|
|
|
|
| 6 |
|
| 7 |
import gradio as gr
|
| 8 |
import httpx
|
| 9 |
|
| 10 |
from sovereign_bench.engine import JUDGE_NAME, JUROR_PERSONAS, stream_trial
|
| 11 |
-
from sovereign_bench.
|
|
|
|
| 12 |
|
| 13 |
|
| 14 |
def _load_env_file() -> None:
|
|
@@ -28,10 +33,16 @@ _load_env_file()
|
|
| 28 |
|
| 29 |
CASE_OPTIONS = {
|
| 30 |
"Trial of Socrates": "socrates",
|
| 31 |
-
"
|
| 32 |
-
"
|
| 33 |
}
|
| 34 |
|
| 35 |
PHASE_GLYPHS = {
|
| 36 |
"pretrial": "00",
|
| 37 |
"intake": "01",
|
|
@@ -44,6 +55,24 @@ PHASE_GLYPHS = {
|
|
| 44 |
"appeal": "08",
|
| 45 |
}
|
| 46 |
|
| 47 |
AUDIO_PATHS = {
|
| 48 |
"score": "/gradio_api/file=assets/audio/courtroom.ogg",
|
| 49 |
"judgement": "/gradio_api/file=assets/audio/Judgement.ogg",
|
|
@@ -102,9 +131,9 @@ body,
|
|
| 102 |
.docket-book-controls {
|
| 103 |
position: fixed;
|
| 104 |
left: 50%;
|
| 105 |
-
top:
|
| 106 |
z-index: 9999;
|
| 107 |
-
width: min(
|
| 108 |
max-width: none;
|
| 109 |
margin: 0;
|
| 110 |
padding: 0;
|
|
@@ -202,21 +231,6 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 202 |
line-height: 1.25;
|
| 203 |
}
|
| 204 |
|
| 205 |
-
.trial-options {
|
| 206 |
-
max-width: 1120px;
|
| 207 |
-
margin: 0 auto 14px;
|
| 208 |
-
border: 1px solid rgba(255, 226, 154, .18);
|
| 209 |
-
border-radius: 6px;
|
| 210 |
-
background: rgba(18, 9, 5, .78);
|
| 211 |
-
color: #f5dfb5;
|
| 212 |
-
}
|
| 213 |
-
|
| 214 |
-
.trial-options label,
|
| 215 |
-
.trial-options span,
|
| 216 |
-
.trial-options .prose {
|
| 217 |
-
color: #f5dfb5 !important;
|
| 218 |
-
}
|
| 219 |
-
|
| 220 |
.court-episode-stage {
|
| 221 |
--spot-x: 50%;
|
| 222 |
--spot-y: 36%;
|
|
@@ -250,6 +264,70 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 250 |
z-index: 4;
|
| 251 |
}
|
| 252 |
|
| 253 |
.episode-room {
|
| 254 |
position: absolute;
|
| 255 |
inset: 0;
|
|
@@ -388,9 +466,9 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 388 |
.episode-book {
|
| 389 |
position: absolute;
|
| 390 |
left: 50%;
|
| 391 |
-
top:
|
| 392 |
-
z-index:
|
| 393 |
-
width: min(
|
| 394 |
aspect-ratio: 3 / 2;
|
| 395 |
transform: translateX(-50%) rotateX(0) rotateZ(-1deg);
|
| 396 |
transform-origin: center bottom;
|
|
@@ -400,6 +478,10 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 400 |
transition: top .85s ease, width .85s ease, transform .85s ease, filter .85s ease, opacity .85s ease;
|
| 401 |
}
|
| 402 |
|
| 403 |
.book-art {
|
| 404 |
position: absolute;
|
| 405 |
inset: 0;
|
|
@@ -416,8 +498,8 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 416 |
}
|
| 417 |
|
| 418 |
.episode-book.closed {
|
| 419 |
-
top:
|
| 420 |
-
width: min(
|
| 421 |
transform: translateX(-50%) rotateX(56deg) rotateZ(1deg);
|
| 422 |
opacity: .92;
|
| 423 |
filter: drop-shadow(0 18px 18px rgba(0, 0, 0, .45));
|
|
@@ -438,35 +520,91 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 438 |
|
| 439 |
.book-open-content {
|
| 440 |
position: absolute;
|
| 441 |
-
inset:
|
| 442 |
z-index: 2;
|
| 443 |
display: grid;
|
| 444 |
grid-template-columns: 1fr 1fr;
|
| 445 |
-
gap:
|
| 446 |
-
padding: 0
|
| 447 |
transition: opacity .35s ease;
|
| 448 |
}
|
| 449 |
|
| 450 |
.book-open-content h2 {
|
| 451 |
-
margin: 0 0
|
| 452 |
color: #4c2a12;
|
| 453 |
-
font-size:
|
| 454 |
letter-spacing: 0;
|
| 455 |
}
|
| 456 |
|
| 457 |
.book-open-content p,
|
| 458 |
.book-entry {
|
| 459 |
color: #3c2615;
|
| 460 |
-
font-size:
|
| 461 |
-
line-height: 1.
|
| 462 |
}
|
| 463 |
|
| 464 |
.book-entry {
|
| 465 |
-
margin:
|
| 466 |
padding-left: 12px;
|
| 467 |
border-left: 3px solid rgba(111, 61, 23, .36);
|
| 468 |
}
|
| 469 |
| 470 |
.judge-dais {
|
| 471 |
position: absolute;
|
| 472 |
left: 50%;
|
|
@@ -536,11 +674,11 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 536 |
}
|
| 537 |
|
| 538 |
.jury-benches.left {
|
| 539 |
-
left:
|
| 540 |
}
|
| 541 |
|
| 542 |
.jury-benches.right {
|
| 543 |
-
right:
|
| 544 |
}
|
| 545 |
|
| 546 |
.jury-benches.left .jury-row {
|
|
@@ -594,7 +732,7 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 594 |
}
|
| 595 |
|
| 596 |
.foreground-fence {
|
| 597 |
-
bottom: -
|
| 598 |
width: 47%;
|
| 599 |
}
|
| 600 |
|
|
@@ -610,9 +748,9 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 610 |
|
| 611 |
.judge-table-foreground {
|
| 612 |
left: 50%;
|
| 613 |
-
top:
|
| 614 |
z-index: 1;
|
| 615 |
-
width:
|
| 616 |
transform: translateX(-50%);
|
| 617 |
}
|
| 618 |
|
|
@@ -650,7 +788,7 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 650 |
|
| 651 |
.puppet.judge {
|
| 652 |
left: 50%;
|
| 653 |
-
top:
|
| 654 |
--skin: #c38a55;
|
| 655 |
--robe: #1b1b20;
|
| 656 |
--accent: #79242a;
|
|
@@ -660,7 +798,8 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 660 |
|
| 661 |
.puppet.clerk {
|
| 662 |
left: 43%;
|
| 663 |
-
top:
|
|
|
|
| 664 |
--skin: #b77b52;
|
| 665 |
--robe: #365548;
|
| 666 |
--accent: #2f6f5e;
|
|
@@ -668,7 +807,7 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 668 |
|
| 669 |
.puppet.auric {
|
| 670 |
left: 24%;
|
| 671 |
-
top:
|
| 672 |
--skin: #c9975d;
|
| 673 |
--robe: #5b2719;
|
| 674 |
--accent: #a45c25;
|
|
@@ -676,28 +815,20 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 676 |
|
| 677 |
.speaker-auric .puppet.auric {
|
| 678 |
left: 43%;
|
| 679 |
-
top:
|
| 680 |
}
|
| 681 |
|
| 682 |
.puppet.sable {
|
| 683 |
left: 75%;
|
| 684 |
-
top:
|
| 685 |
--skin: #a86d4a;
|
| 686 |
--robe: #1d3045;
|
| 687 |
--accent: #254f7a;
|
| 688 |
}
|
| 689 |
|
| 690 |
.speaker-sable .puppet.sable {
|
| 691 |
-
left:
|
| 692 |
-
top:
|
| 693 |
-
}
|
| 694 |
-
|
| 695 |
-
.puppet.auditor {
|
| 696 |
-
left: 71%;
|
| 697 |
-
top: 55%;
|
| 698 |
-
--skin: #c6a65b;
|
| 699 |
-
--robe: #4b3d1b;
|
| 700 |
-
--accent: #8d6b1f;
|
| 701 |
}
|
| 702 |
|
| 703 |
.puppet-portrait {
|
|
@@ -713,10 +844,6 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 713 |
pointer-events: none;
|
| 714 |
}
|
| 715 |
|
| 716 |
-
.phase-evidence .puppet.auditor {
|
| 717 |
-
animation: evidence-focus 1.35s ease-in-out infinite;
|
| 718 |
-
}
|
| 719 |
-
|
| 720 |
.puppet::before {
|
| 721 |
content: "";
|
| 722 |
position: absolute;
|
|
@@ -749,6 +876,11 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 749 |
linear-gradient(180deg, var(--accent), var(--robe) 52%, #130a07);
|
| 750 |
}
|
| 751 |
|
|
| 752 |
.puppet .mouth {
|
| 753 |
position: absolute;
|
| 754 |
left: 50%;
|
|
@@ -761,42 +893,169 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 761 |
border-radius: 0 0 18px 18px;
|
| 762 |
}
|
| 763 |
|
| 764 |
.puppet.active .mouth,
|
| 765 |
.puppet.walking .mouth {
|
| 766 |
animation: speak-mouth .5s ease-in-out infinite;
|
| 767 |
}
|
| 768 |
|
| 769 |
-
.speech-bubble {
|
| 770 |
position: absolute;
|
| 771 |
left: 50%;
|
| 772 |
-
|
| 773 |
-
|
| 774 |
-
|
| 775 |
-
|
| 776 |
-
|
| 777 |
-
|
| 778 |
-
|
| 779 |
-
|
| 780 |
-
|
| 781 |
-
|
| 782 |
-
|
|
|
|
|
|
|
| 783 |
font-size: 12px;
|
| 784 |
-
font-weight:
|
| 785 |
-
line-height: 1.
|
| 786 |
pointer-events: none;
|
| 787 |
}
|
| 788 |
|
| 789 |
-
.speech-bubble
|
|
| 790 |
content: "";
|
| 791 |
position: absolute;
|
| 792 |
-
left: 50%;
|
| 793 |
-
|
| 794 |
-
width: 14px;
|
| 795 |
-
height: 14px;
|
| 796 |
transform: translateX(-50%) rotate(45deg);
|
| 797 |
-
|
| 798 |
-
|
| 799 |
-
|
| 800 |
}
|
| 801 |
|
| 802 |
.tooltip {
|
|
@@ -931,11 +1190,6 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 931 |
animation: juror-react .82s ease-in-out infinite alternate;
|
| 932 |
}
|
| 933 |
|
| 934 |
-
.juror .speech-bubble {
|
| 935 |
-
bottom: calc(100% + 6px);
|
| 936 |
-
width: 230px;
|
| 937 |
-
}
|
| 938 |
-
|
| 939 |
.juror-face {
|
| 940 |
position: absolute;
|
| 941 |
left: 50%;
|
|
@@ -1195,14 +1449,43 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 1195 |
100% { transform: rotate(-18deg) translateY(0); }
|
| 1196 |
}
|
| 1197 |
|
| 1198 |
@media (max-width: 820px) {
|
| 1199 |
.docket-book-controls {
|
| 1200 |
position: fixed;
|
| 1201 |
-
top:
|
| 1202 |
width: calc(100vw - 52px);
|
| 1203 |
transform: translateX(-50%) rotate(-1deg);
|
| 1204 |
}
|
| 1205 |
|
| 1206 |
.court-episode-stage {
|
| 1207 |
height: 1280px;
|
| 1208 |
min-height: 1280px;
|
|
@@ -1225,21 +1508,64 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 1225 |
max-width: calc(100% - 32px);
|
| 1226 |
}
|
| 1227 |
|
| 1228 |
.episode-book {
|
| 1229 |
-
top:
|
| 1230 |
width: min(680px, calc(100% - 20px));
|
| 1231 |
}
|
| 1232 |
|
| 1233 |
.episode-book.closed {
|
| 1234 |
-
top:
|
| 1235 |
-
width:
|
| 1236 |
}
|
| 1237 |
|
| 1238 |
.book-open-content {
|
| 1239 |
grid-template-columns: 1fr;
|
| 1240 |
gap: 10px;
|
| 1241 |
-
inset:
|
| 1242 |
-
padding: 0
|
| 1243 |
}
|
| 1244 |
|
| 1245 |
.book-open-content h2 {
|
|
@@ -1257,6 +1583,25 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 1257 |
margin: 5px 0;
|
| 1258 |
}
|
| 1259 |
|
| 1260 |
.judge-dais {
|
| 1261 |
top: 390px;
|
| 1262 |
width: 280px;
|
|
@@ -1278,32 +1623,27 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 1278 |
|
| 1279 |
.puppet.auric {
|
| 1280 |
left: 20%;
|
| 1281 |
-
top:
|
| 1282 |
}
|
| 1283 |
|
| 1284 |
.puppet.sable {
|
| 1285 |
left: 80%;
|
| 1286 |
-
top:
|
| 1287 |
}
|
| 1288 |
|
| 1289 |
.speaker-auric .puppet.auric {
|
| 1290 |
left: 42%;
|
| 1291 |
-
top:
|
| 1292 |
}
|
| 1293 |
|
| 1294 |
.speaker-sable .puppet.sable {
|
| 1295 |
-
left:
|
| 1296 |
-
top:
|
| 1297 |
}
|
| 1298 |
|
| 1299 |
.puppet.clerk {
|
| 1300 |
left: 35%;
|
| 1301 |
-
top:
|
| 1302 |
-
}
|
| 1303 |
-
|
| 1304 |
-
.puppet.auditor {
|
| 1305 |
-
left: 78%;
|
| 1306 |
-
top: 540px;
|
| 1307 |
}
|
| 1308 |
|
| 1309 |
.witness-area {
|
|
@@ -1319,15 +1659,15 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 1319 |
}
|
| 1320 |
|
| 1321 |
.jury-benches.left {
|
| 1322 |
-
left: 5%;
|
| 1323 |
}
|
| 1324 |
|
| 1325 |
.jury-benches.right {
|
| 1326 |
-
right: 5%;
|
| 1327 |
}
|
| 1328 |
|
| 1329 |
.foreground-fence {
|
| 1330 |
-
bottom: -
|
| 1331 |
width: 64%;
|
| 1332 |
}
|
| 1333 |
|
|
@@ -1340,8 +1680,8 @@ body.trial-has-started .docket-book-controls .docket-book-controls {
|
|
| 1340 |
}
|
| 1341 |
|
| 1342 |
.judge-table-foreground {
|
| 1343 |
-
top:
|
| 1344 |
-
width:
|
| 1345 |
}
|
| 1346 |
|
| 1347 |
.evidence-props {
|
|
@@ -1530,12 +1870,28 @@ APP_HEAD = f"""
|
|
| 1530 |
"""
|
| 1531 |
|
| 1532 |
START_JS = """
|
| 1533 |
-
(case_label, search_query, hypothetical, speed, mind_layer) => {
|
| 1534 |
document.body.classList.add('trial-has-started');
|
| 1535 |
if (window.SovereignCourtAudio) {
|
| 1536 |
window.SovereignCourtAudio.begin();
|
| 1537 |
}
|
| 1538 |
-
return [case_label, search_query, hypothetical, speed, mind_layer];
|
| 1539 |
}
|
| 1540 |
"""
|
| 1541 |
|
|
@@ -1553,24 +1909,18 @@ CHARACTERS = {
|
|
| 1553 |
"role": "Court clerk",
|
| 1554 |
"model": "AgentCPM-Explore",
|
| 1555 |
},
|
| 1556 |
-
"
|
| 1557 |
"class": "auric",
|
| 1558 |
-
"name": "
|
| 1559 |
"role": "Claimant advocate",
|
| 1560 |
"model": "gpt-oss-20b",
|
| 1561 |
},
|
| 1562 |
-
"
|
| 1563 |
"class": "sable",
|
| 1564 |
-
"name": "
|
| 1565 |
"role": "Respondent advocate",
|
| 1566 |
"model": "gpt-oss-20b",
|
| 1567 |
},
|
| 1568 |
-
"Auditor Prism": {
|
| 1569 |
-
"class": "auditor",
|
| 1570 |
-
"name": "Auditor Prism",
|
| 1571 |
-
"role": "Evidence auditor",
|
| 1572 |
-
"model": "Nemotron-Orchestrator-8B",
|
| 1573 |
-
},
|
| 1574 |
"Nemotron Jury": {
|
| 1575 |
"class": "jury",
|
| 1576 |
"name": "Nemotron Jury",
|
|
@@ -1597,13 +1947,37 @@ JUROR_IMAGES = {
|
|
| 1597 |
"Jensen Huang": "/gradio_api/file=assets/characters/jensen-huang.png",
|
| 1598 |
}
|
| 1599 |
|
| 1600 |
PHASE_AGENTS = {
|
| 1601 |
"pretrial": ["Clerk Meridian"],
|
| 1602 |
}
|
| 1603 |
|
| 1604 |
|
| 1605 |
def _remote_events(request: TrialRequest) -> Iterable[TrialEvent] | None:
|
| 1606 |
-
endpoint = os.getenv("MODAL_TRIAL_URL",
|
| 1607 |
if not endpoint:
|
| 1608 |
return None
|
| 1609 |
|
|
@@ -1617,13 +1991,13 @@ def _remote_events(request: TrialRequest) -> Iterable[TrialEvent] | None:
|
|
| 1617 |
return iterator()
|
| 1618 |
|
| 1619 |
|
| 1620 |
-
def get_events(request: TrialRequest) -> Iterable[TrialEvent]:
|
| 1621 |
remote = _remote_events(request)
|
| 1622 |
if remote is not None:
|
| 1623 |
yield from remote
|
| 1624 |
return
|
| 1625 |
-
|
| 1626 |
-
yield from stream_trial(request, delay=
|
| 1627 |
|
| 1628 |
|
| 1629 |
def _escape(value: str) -> str:
|
|
@@ -1663,6 +2037,26 @@ def _active_speaker_for(event: TrialEvent | None) -> str:
|
|
| 1663 |
return event.turns[0].agent
|
| 1664 |
|
| 1665 |
|
| 1666 |
def _speaker_class_for(speaker: str) -> str:
|
| 1667 |
if not speaker:
|
| 1668 |
return ""
|
|
@@ -1680,6 +2074,61 @@ def _latest_turn_text(event: TrialEvent | None, agent: str) -> str:
|
|
| 1680 |
return _short_text(turn.content, 210)
|
| 1681 |
|
| 1682 |
|
| 1683 |
def _thread_id(name: str) -> str:
|
| 1684 |
return "ai-thread-" + "".join(ch.lower() if ch.isalnum() else "-" for ch in name).strip("-")
|
| 1685 |
|
|
@@ -1767,17 +2216,51 @@ def _thread_modal(name: str, role: str, model: str, turns: list[dict[str, str]])
|
|
| 1767 |
)
|
| 1768 |
|
| 1769 |
|
| 1770 |
def _puppet(agent: str, active_agents: set[str], phase: str, events: list[TrialEvent], latest: TrialEvent | None) -> str:
|
| 1771 |
meta = CHARACTERS[agent]
|
| 1772 |
active = " active" if agent in active_agents else ""
|
| 1773 |
-
walking = " walking" if agent in {"
|
| 1774 |
-
small = " small" if agent
|
| 1775 |
turns = _thread_for_character(events, agent)
|
| 1776 |
-
bubble = ""
|
| 1777 |
-
if agent in active_agents:
|
| 1778 |
-
speech = _latest_turn_text(latest, agent)
|
| 1779 |
-
if speech:
|
| 1780 |
-
bubble = f"<span class='speech-bubble'>{_escape(speech)}</span>"
|
| 1781 |
portrait = ""
|
| 1782 |
if meta.get("image"):
|
| 1783 |
portrait = (
|
|
@@ -1788,7 +2271,6 @@ def _puppet(agent: str, active_agents: set[str], phase: str, events: list[TrialE
|
|
| 1788 |
f"<a class='puppet {meta['class']}{active}{walking}{small}' href='#{_escape(_thread_id(agent))}' aria-label='Open {_escape(agent)} model thread'>"
|
| 1789 |
f"{portrait}"
|
| 1790 |
"<span class='mouth'></span>"
|
| 1791 |
-
f"{bubble}"
|
| 1792 |
f"{_tooltip(meta['name'], meta['role'], meta['model'], turns)}"
|
| 1793 |
"</a>"
|
| 1794 |
)
|
|
@@ -1799,36 +2281,103 @@ def _juror(name: str, active: bool, events: list[TrialEvent] | None = None, late
|
|
| 1799 |
image = JUROR_IMAGES.get(name, "")
|
| 1800 |
active_cls = " active" if active else ""
|
| 1801 |
turns = _thread_for_character(events or [], name)
|
| 1802 |
-
bubble = ""
|
| 1803 |
-
if active:
|
| 1804 |
-
vote = next((vote for vote in (latest.votes if latest else []) if vote.juror == name), None)
|
| 1805 |
-
speech = _latest_turn_text(latest, name)
|
| 1806 |
-
if vote:
|
| 1807 |
-
speech = f"{vote.vote.replace('_', ' ').title()}. {vote.reason}"
|
| 1808 |
-
if speech:
|
| 1809 |
-
bubble = f"<span class='speech-bubble'>{_escape(_short_text(speech, 190))}</span>"
|
| 1810 |
portrait = (
|
| 1811 |
f"<img class='juror-portrait' src='{_escape(image)}' alt='{_escape(name)} bust' "
|
| 1812 |
"onerror=\"this.style.display='none'\">"
|
| 1813 |
if image
|
| 1814 |
else ""
|
| 1815 |
)
|
|
|
|
| 1816 |
return (
|
| 1817 |
f"<a class='juror{active_cls}' href='#{_escape(_thread_id(name))}' style='--face: {face}' aria-label='Open {_escape(name)} model thread'>"
|
| 1818 |
f"{portrait}"
|
| 1819 |
-
"
|
| 1820 |
-
f"{bubble}"
|
| 1821 |
f"{_tooltip(name, 'HF-style juror', 'Nemotron panel', turns)}"
|
| 1822 |
"</a>"
|
| 1823 |
)
|
| 1824 |
|
| 1825 |
|
| 1826 |
-
def
|
| 1827 |
closed = "" if open_book else " closed"
|
| 1828 |
return (
|
| 1829 |
-
f"<div class='episode-book{closed}'>"
|
| 1830 |
"<img class='book-art open-art' src='/gradio_api/file=assets/book/docket-book-open.png' alt='Open docket book'>"
|
| 1831 |
"<img class='book-art closed-art' src='/gradio_api/file=assets/book/docket-book-closed.png' alt='Closed docket book'>"
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1832 |
"</div>"
|
| 1833 |
)
|
| 1834 |
|
|
@@ -1871,6 +2420,36 @@ def _foreground_props() -> str:
|
|
| 1871 |
)
|
| 1872 |
|
| 1873 |
|
| 1874 |
def _courtroom_juror_names(votes: list) -> list[str]:
|
| 1875 |
names = list(JUROR_FACES)
|
| 1876 |
names.extend(vote.juror for vote in votes if vote.juror not in names)
|
|
@@ -1887,12 +2466,20 @@ def _latest_votes(events: list[TrialEvent]) -> list:
|
|
| 1887 |
return ordered
|
| 1888 |
|
| 1889 |
|
| 1890 |
-
def render_court(
|
| 1891 |
latest = events[-1] if events else None
|
| 1892 |
phase = latest.phase if latest else "pretrial"
|
| 1893 |
title, subtitle = _latest_packet_title(events)
|
| 1894 |
-
|
| 1895 |
-
active_speaker = _active_speaker_for(latest)
|
|
|
|
| 1896 |
speaker_cls = _speaker_class_for(active_speaker)
|
| 1897 |
caption_phase, caption_title, caption_body = _caption(latest, phase)
|
| 1898 |
latest_votes = _latest_votes(events)
|
|
@@ -1901,7 +2488,7 @@ def render_court(events: list[TrialEvent], started: bool = False) -> str:
|
|
| 1901 |
book_open = not started and not events
|
| 1902 |
puppets = "".join(
|
| 1903 |
_puppet(agent, active_agents, phase, events, latest)
|
| 1904 |
-
for agent in [JUDGE_NAME, "Clerk Meridian", "
|
| 1905 |
)
|
| 1906 |
left_jurors = "".join(_juror(name, name == active_speaker, events, latest) for name in juror_names[:3])
|
| 1907 |
right_jurors = "".join(_juror(name, name == active_speaker, events, latest) for name in juror_names[3:6])
|
|
@@ -1915,6 +2502,7 @@ def render_court(events: list[TrialEvent], started: bool = False) -> str:
|
|
| 1915 |
)
|
| 1916 |
return (
|
| 1917 |
f"<section id='court-stage' class='court-episode-stage phase-{_escape(phase)}{_escape(speaker_cls)}{started_cls}' data-phase='{_escape(phase)}'>"
|
|
|
|
| 1918 |
"<div class='episode-room'></div>"
|
| 1919 |
"<div class='audio-deck' aria-hidden='true'>"
|
| 1920 |
+ "".join(f"<audio preload='auto' src='{_escape(src)}'></audio>" for src in AUDIO_PATHS.values())
|
|
@@ -1926,7 +2514,7 @@ def render_court(events: list[TrialEvent], started: bool = False) -> str:
|
|
| 1926 |
f"<h1>{_escape(title)}</h1>"
|
| 1927 |
f"<p>{_escape(subtitle)}</p></div>"
|
| 1928 |
f"<div class='decree-ribbon'>Step {len(events) if events else 0}: {caption_title}<br>Hover characters for agent and model details</div>"
|
| 1929 |
-
f"{_book(book_open)}"
|
| 1930 |
f"<div class='judge-dais'><div class='prop-label'>{_escape(JUDGE_NAME)}</div><div class='bench-front'></div><span class='gavel'></span></div>"
|
| 1931 |
"<div class='counsel-table left'><div class='prop-label'>Claimant Table</div></div>"
|
| 1932 |
"<div class='counsel-table right'><div class='prop-label'>Respondent Table</div></div>"
|
|
@@ -1939,6 +2527,8 @@ def render_court(events: list[TrialEvent], started: bool = False) -> str:
|
|
| 1939 |
f"{puppets}"
|
| 1940 |
f"{evidence_props}"
|
| 1941 |
f"{_foreground_props()}"
|
|
|
|
|
|
|
| 1942 |
"<div class='gallery-benches'><div></div><div></div><div></div><div></div><div></div><div></div></div>"
|
| 1943 |
"<div class='trial-caption'>"
|
| 1944 |
f"<div class='caption-phase'>Live Trial Feed / {_escape(caption_phase)}</div>"
|
|
@@ -2009,33 +2599,137 @@ def render_mind(events: list[TrialEvent], enabled: bool) -> str:
|
|
| 2009 |
return f"<pre class='mind-text'>{_escape(json.dumps(compact, indent=2))}</pre>"
|
| 2010 |
|
| 2011 |
|
| 2012 |
-
def
|
| 2013 |
request = TrialRequest(
|
| 2014 |
-
case_id=
|
| 2015 |
search_query=search_query or "",
|
| 2016 |
hypothetical=hypothetical or "",
|
|
|
|
| 2017 |
speed=speed or "swift",
|
| 2018 |
mind_layer=bool(mind_layer),
|
| 2019 |
)
|
| 2020 |
events: list[TrialEvent] = []
|
|
|
|
|
|
|
| 2021 |
yield (
|
| 2022 |
-
render_court(events, started=True),
|
| 2023 |
render_evidence(events),
|
| 2024 |
render_jurors(events),
|
| 2025 |
render_mind(events, mind_layer),
|
| 2026 |
-
|
| 2027 |
)
|
| 2028 |
try:
|
| 2029 |
-
|
| 2030 |
events.append(event)
|
| 2031 |
-
status = f"Step {len(events)}: {event.title}"
|
| 2032 |
yield (
|
| 2033 |
render_court(events, started=True),
|
| 2034 |
render_evidence(events),
|
| 2035 |
render_jurors(events),
|
| 2036 |
render_mind(events, mind_layer),
|
| 2037 |
-
|
| 2038 |
)
|
| 2039 |
except Exception as exc:
|
| 2040 |
yield (
|
| 2041 |
render_court(events, started=True),
|
|
@@ -2046,7 +2740,7 @@ def run_ui(case_label: str, search_query: str, hypothetical: str, speed: str, mi
|
|
| 2046 |
)
|
| 2047 |
return
|
| 2048 |
yield (
|
| 2049 |
-
render_court(events, started=True),
|
| 2050 |
render_evidence(events),
|
| 2051 |
render_jurors(events),
|
| 2052 |
render_mind(events, mind_layer),
|
|
@@ -2067,13 +2761,12 @@ def build_app() -> gr.Blocks:
|
|
| 2067 |
)
|
| 2068 |
start = gr.Button("Begin Trial", variant="primary", scale=1)
|
| 2069 |
status = gr.Markdown("Ready.", elem_classes=["book-status"])
|
| 2070 |
-
courtroom = gr.HTML(
|
| 2071 |
search = gr.State("")
|
|
|
|
|
|
|
| 2072 |
speed = gr.State("swift")
|
| 2073 |
mind = gr.State(True)
|
| 2074 |
-
with gr.Accordion("Advanced trial options", open=False, elem_classes=["trial-options"]):
|
| 2075 |
-
with gr.Row():
|
| 2076 |
-
hypo = gr.Textbox(label="Hypothetical sidebar", lines=1)
|
| 2077 |
with gr.Row(elem_classes=["drawer-shell"]):
|
| 2078 |
with gr.Column(scale=1):
|
| 2079 |
with gr.Tab("Evidence Drawer"):
|
|
@@ -2081,9 +2774,14 @@ def build_app() -> gr.Blocks:
|
|
| 2081 |
with gr.Tab("Juror Panel"):
|
| 2082 |
jurors = gr.HTML(render_jurors([]))
|
| 2083 |
mind_html = gr.HTML(render_mind([], True), visible=False)
|
| 2084 |
start.click(
|
| 2085 |
run_ui,
|
| 2086 |
-
inputs=[case, search, hypo, speed, mind],
|
| 2087 |
outputs=[courtroom, evidence, jurors, mind_html, status],
|
| 2088 |
js=START_JS,
|
| 2089 |
)
|
|
|
|
| 2 |
|
| 3 |
import json
|
| 4 |
import os
|
| 5 |
+
import queue
|
| 6 |
+
import threading
|
| 7 |
+
import time
|
| 8 |
from collections.abc import Iterable
|
| 9 |
+
from dataclasses import dataclass
|
| 10 |
|
| 11 |
import gradio as gr
|
| 12 |
import httpx
|
| 13 |
|
| 14 |
from sovereign_bench.engine import JUDGE_NAME, JUROR_PERSONAS, stream_trial
|
| 15 |
+
from sovereign_bench.cases import CASES, get_case
|
| 16 |
+
from sovereign_bench.models import CasePacket, EvidenceItem, TrialEvent, TrialRequest
|
| 17 |
|
| 18 |
|
| 19 |
def _load_env_file() -> None:
|
|
|
|
| 33 |
|
| 34 |
CASE_OPTIONS = {
|
| 35 |
"Trial of Socrates": "socrates",
|
| 36 |
+
"Greg Heffley vs Mom": "greg",
|
| 37 |
+
"Custom": "custom",
|
| 38 |
}
|
| 39 |
|
| 40 |
+
DEFAULT_MODAL_TRIAL_URL = "https://ali-j-iqbal24--trial-stream.modal.run"
|
| 41 |
+
MIN_READ_SECONDS = 2.2
|
| 42 |
+
WORDS_PER_SECOND = 3.2
|
| 43 |
+
READ_BUFFER_SECONDS = 0.8
|
| 44 |
+
MAX_READ_SECONDS = 8.5
|
| 45 |
+
|
| 46 |
PHASE_GLYPHS = {
|
| 47 |
"pretrial": "00",
|
| 48 |
"intake": "01",
|
|
|
|
| 55 |
"appeal": "08",
|
| 56 |
}
|
| 57 |
|
| 58 |
+
TRIAL_PROGRESS_STAGES = (
|
| 59 |
+
("pretrial", "Pretrial"),
|
| 60 |
+
("intake", "Intake"),
|
| 61 |
+
("claims", "Claims"),
|
| 62 |
+
("opening", "Opening"),
|
| 63 |
+
("evidence", "Evidence"),
|
| 64 |
+
("questions", "Questions"),
|
| 65 |
+
("deliberation", "Deliberation"),
|
| 66 |
+
("verdict", "Verdict"),
|
| 67 |
+
)
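The function that turns `TRIAL_PROGRESS_STAGES` into the `.trial-progress` bar is not visible in this part of the diff. The following is only a minimal sketch of how the `complete` and `current` classes styled in the CSS might be assigned; `render_progress` is a hypothetical name, and the markup is an assumption rather than the PR's actual output.

```python
from html import escape

# Copied from the diff: (phase key, display label) pairs, in trial order.
TRIAL_PROGRESS_STAGES = (
    ("pretrial", "Pretrial"),
    ("intake", "Intake"),
    ("claims", "Claims"),
    ("opening", "Opening"),
    ("evidence", "Evidence"),
    ("questions", "Questions"),
    ("deliberation", "Deliberation"),
    ("verdict", "Verdict"),
)


def render_progress(phase: str) -> str:
    """Hypothetical helper: mark earlier stages complete and the active one current."""
    keys = [key for key, _ in TRIAL_PROGRESS_STAGES]
    # Phases outside the tuple (for example "appeal") highlight nothing.
    current = keys.index(phase) if phase in keys else -1
    segments = []
    for index, (_, label) in enumerate(TRIAL_PROGRESS_STAGES):
        if current >= 0 and index < current:
            state = " complete"
        elif index == current:
            state = " current"
        else:
            state = ""
        segments.append(
            f"<span class='trial-progress-segment{state}'>{escape(label)}</span>"
        )
    return "<div class='trial-progress'>" + "".join(segments) + "</div>"
```

The eight entries line up with the `repeat(8, ...)` grid columns declared for `.trial-progress`, so each stage gets one column.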
|
| 68 |
+
|
| 69 |
+
VERDICT_LABELS = {
|
| 70 |
+
"liable": "Guilty",
|
| 71 |
+
"not_liable": "Not Guilty",
|
| 72 |
+
"mixed": "Mixed",
|
| 73 |
+
"uncertain": "Uncertain",
|
| 74 |
+
}
|
| 75 |
+
|
| 76 |
AUDIO_PATHS = {
|
| 77 |
"score": "/gradio_api/file=assets/audio/courtroom.ogg",
|
| 78 |
"judgement": "/gradio_api/file=assets/audio/Judgement.ogg",
|
|
|
|
| 131 |
.docket-book-controls {
|
| 132 |
position: fixed;
|
| 133 |
left: 50%;
|
| 134 |
+
top: 72px;
|
| 135 |
z-index: 9999;
|
| 136 |
+
width: min(760px, calc(100vw - 160px));
|
| 137 |
max-width: none;
|
| 138 |
margin: 0;
|
| 139 |
padding: 0;
|
|
|
|
| 231 |
line-height: 1.25;
|
| 232 |
}
|
| 233 |
|
| 234 |
.court-episode-stage {
|
| 235 |
--spot-x: 50%;
|
| 236 |
--spot-y: 36%;
|
|
|
|
| 264 |
z-index: 4;
|
| 265 |
}
|
| 266 |
|
| 267 |
+
.trial-progress {
|
| 268 |
+
position: fixed;
|
| 269 |
+
top: 0;
|
| 270 |
+
left: 0;
|
| 271 |
+
right: 0;
|
| 272 |
+
z-index: 70;
|
| 273 |
+
display: grid;
|
| 274 |
+
grid-template-columns: repeat(8, minmax(0, 1fr));
|
| 275 |
+
gap: 1px;
|
| 276 |
+
padding: 3px clamp(10px, 2vw, 24px) 4px;
|
| 277 |
+
border-bottom: 1px solid rgba(217, 176, 96, .2);
|
| 278 |
+
background: rgba(23, 13, 8, .58);
|
| 279 |
+
backdrop-filter: blur(8px);
|
| 280 |
+
box-shadow: 0 8px 18px rgba(8, 4, 2, .22);
|
| 281 |
+
pointer-events: none;
|
| 282 |
+
}
|
| 283 |
+
|
| 284 |
+
.trial-progress-segment {
|
| 285 |
+
position: relative;
|
| 286 |
+
min-width: 0;
|
| 287 |
+
padding-top: 5px;
|
| 288 |
+
overflow: hidden;
|
| 289 |
+
color: rgba(244, 213, 143, .38);
|
| 290 |
+
font: 800 10px/1 ui-monospace, SFMono-Regular, Consolas, monospace;
|
| 291 |
+
letter-spacing: .04em;
|
| 292 |
+
text-align: center;
|
| 293 |
+
text-transform: uppercase;
|
| 294 |
+
white-space: nowrap;
|
| 295 |
+
}
|
| 296 |
+
|
| 297 |
+
.trial-progress-segment::before {
|
| 298 |
+
content: "";
|
| 299 |
+
position: absolute;
|
| 300 |
+
left: 3px;
|
| 301 |
+
right: 3px;
|
| 302 |
+
top: 0;
|
| 303 |
+
height: 2px;
|
| 304 |
+
border-radius: 999px;
|
| 305 |
+
background: rgba(217, 176, 96, .18);
|
| 306 |
+
}
|
| 307 |
+
|
| 308 |
+
.trial-progress-segment.complete {
|
| 309 |
+
color: rgba(217, 176, 96, .68);
|
| 310 |
+
}
|
| 311 |
+
|
| 312 |
+
.trial-progress-segment.complete::before {
|
| 313 |
+
background: rgba(217, 176, 96, .48);
|
| 314 |
+
}
|
| 315 |
+
|
| 316 |
+
.trial-progress-segment.current {
|
| 317 |
+
color: #ffe6a6;
|
| 318 |
+
text-shadow: 0 0 10px rgba(255, 211, 116, .52);
|
| 319 |
+
}
|
| 320 |
+
|
| 321 |
+
.trial-progress-segment.current::before {
|
| 322 |
+
height: 3px;
|
| 323 |
+
background: #ffd675;
|
| 324 |
+
box-shadow: 0 0 12px rgba(255, 214, 117, .68);
|
| 325 |
+
}
|
| 326 |
+
|
| 327 |
+
.trial-progress-abbrev {
|
| 328 |
+
display: none;
|
| 329 |
+
}
|
| 330 |
+
|
| 331 |
.episode-room {
|
| 332 |
position: absolute;
|
| 333 |
inset: 0;
|
|
|
|
| 466 |
.episode-book {
|
| 467 |
position: absolute;
|
| 468 |
left: 50%;
|
| 469 |
+
top: 122px;
|
| 470 |
+
z-index: 14;
|
| 471 |
+
width: min(980px, calc(100% - 32px));
|
| 472 |
aspect-ratio: 3 / 2;
|
| 473 |
transform: translateX(-50%) rotateX(0) rotateZ(-1deg);
|
| 474 |
transform-origin: center bottom;
|
|
|
|
| 478 |
transition: top .85s ease, width .85s ease, transform .85s ease, filter .85s ease, opacity .85s ease;
|
| 479 |
}
|
| 480 |
|
| 481 |
+
.episode-book.custom-book {
|
| 482 |
+
pointer-events: auto;
|
| 483 |
+
}
|
| 484 |
+
|
| 485 |
.book-art {
|
| 486 |
position: absolute;
|
| 487 |
inset: 0;
|
|
|
|
| 498 |
}
|
| 499 |
|
| 500 |
.episode-book.closed {
|
| 501 |
+
top: 50%;
|
| 502 |
+
width: min(163px, 20vw);
|
| 503 |
transform: translateX(-50%) rotateX(56deg) rotateZ(1deg);
|
| 504 |
opacity: .92;
|
| 505 |
filter: drop-shadow(0 18px 18px rgba(0, 0, 0, .45));
|
|
|
|
| 520 |
|
| 521 |
.book-open-content {
|
| 522 |
position: absolute;
|
| 523 |
+
inset: 15% 9% 12%;
|
| 524 |
z-index: 2;
|
| 525 |
display: grid;
|
| 526 |
grid-template-columns: 1fr 1fr;
|
| 527 |
+
gap: 82px;
|
| 528 |
+
padding: 0 20px;
|
| 529 |
transition: opacity .35s ease;
|
| 530 |
}
|
| 531 |
|
| 532 |
.book-open-content h2 {
|
| 533 |
+
margin: 0 0 8px;
|
| 534 |
color: #4c2a12;
|
| 535 |
+
font-size: 28px;
|
| 536 |
letter-spacing: 0;
|
| 537 |
}
|
| 538 |
|
| 539 |
.book-open-content p,
|
| 540 |
.book-entry {
|
| 541 |
color: #3c2615;
|
| 542 |
+
font-size: 14px;
|
| 543 |
+
line-height: 1.28;
|
| 544 |
}
|
| 545 |
|
| 546 |
.book-entry {
|
| 547 |
+
margin: 8px 0;
|
| 548 |
padding-left: 12px;
|
| 549 |
border-left: 3px solid rgba(111, 61, 23, .36);
|
| 550 |
}
|
| 551 |
|
| 552 |
+
.book-context {
|
| 553 |
+
margin-top: 8px;
|
| 554 |
+
}
|
| 555 |
+
|
| 556 |
+
.book-case-title {
|
| 557 |
+
margin: 0 0 6px;
|
| 558 |
+
color: #4c2a12;
|
| 559 |
+
font-weight: 800;
|
| 560 |
+
}
|
| 561 |
+
|
| 562 |
+
.book-evidence-columns {
|
| 563 |
+
display: grid;
|
| 564 |
+
grid-template-columns: 1fr 1fr;
|
| 565 |
+
gap: 12px;
|
| 566 |
+
}
|
| 567 |
+
|
| 568 |
+
.book-evidence-column h3 {
|
| 569 |
+
margin: 0 0 6px;
|
| 570 |
+
color: #4c2a12;
|
| 571 |
+
font-size: 15px;
|
| 572 |
+
line-height: 1.12;
|
| 573 |
+
}
|
| 574 |
+
|
| 575 |
+
.book-evidence-list {
|
| 576 |
+
margin: 0;
|
| 577 |
+
padding: 0;
|
| 578 |
+
list-style: none;
|
| 579 |
+
}
|
| 580 |
+
|
| 581 |
+
.book-evidence-list li {
|
| 582 |
+
margin: 0 0 6px;
|
| 583 |
+
padding-left: 9px;
|
| 584 |
+
border-left: 2px solid rgba(111, 61, 23, .32);
|
| 585 |
+
color: #3c2615;
|
| 586 |
+
font-size: 12px;
|
| 587 |
+
line-height: 1.2;
|
| 588 |
+
}
|
| 589 |
+
|
| 590 |
+
.book-field {
|
| 591 |
+
width: 100%;
|
| 592 |
+
min-height: 42px;
|
| 593 |
+
resize: none;
|
| 594 |
+
border: 1px solid rgba(90, 50, 20, .34);
|
| 595 |
+
border-radius: 4px;
|
| 596 |
+
background: rgba(255, 247, 224, .7);
|
| 597 |
+
color: #2b1b10;
|
| 598 |
+
font: 12px/1.22 Georgia, "Times New Roman", serif;
|
| 599 |
+
box-shadow: inset 0 1px 2px rgba(59, 29, 10, .16);
|
| 600 |
+
pointer-events: auto;
|
| 601 |
+
}
|
| 602 |
+
|
| 603 |
+
.book-context-field {
|
| 604 |
+
min-height: 138px;
|
| 605 |
+
font-size: 13px;
|
| 606 |
+
}
|
| 607 |
+
|
| 608 |
.judge-dais {
|
| 609 |
position: absolute;
|
| 610 |
left: 50%;
|
|
|
|
| 674 |
}
|
| 675 |
|
| 676 |
.jury-benches.left {
|
| 677 |
+
left: 1%;
|
| 678 |
}
|
| 679 |
|
| 680 |
.jury-benches.right {
|
| 681 |
+
right: 1%;
|
| 682 |
}
|
| 683 |
|
| 684 |
.jury-benches.left .jury-row {
|
|
|
|
| 732 |
}
|
| 733 |
|
| 734 |
.foreground-fence {
|
| 735 |
+
bottom: -6.5%;
|
| 736 |
width: 47%;
|
| 737 |
}
|
| 738 |
|
|
|
|
| 748 |
|
| 749 |
.judge-table-foreground {
|
| 750 |
left: 50%;
|
| 751 |
+
top: 20%;
|
| 752 |
z-index: 1;
|
| 753 |
+
width: 39.1%;
|
| 754 |
transform: translateX(-50%);
|
| 755 |
}
|
| 756 |
|
|
|
|
| 788 |
|
| 789 |
.puppet.judge {
|
| 790 |
left: 50%;
|
| 791 |
+
top: calc(40% + 156px);
|
| 792 |
--skin: #c38a55;
|
| 793 |
--robe: #1b1b20;
|
| 794 |
--accent: #79242a;
|
|
|
|
| 798 |
|
| 799 |
.puppet.clerk {
|
| 800 |
left: 43%;
|
| 801 |
+
top: 66%;
|
| 802 |
+
z-index: 14;
|
| 803 |
--skin: #b77b52;
|
| 804 |
--robe: #365548;
|
| 805 |
--accent: #2f6f5e;
|
|
|
|
| 807 |
|
| 808 |
.puppet.auric {
|
| 809 |
left: 24%;
|
| 810 |
+
top: 87%;
|
| 811 |
--skin: #c9975d;
|
| 812 |
--robe: #5b2719;
|
| 813 |
--accent: #a45c25;
|
|
|
|
| 815 |
|
| 816 |
.speaker-auric .puppet.auric {
|
| 817 |
left: 43%;
|
| 818 |
+
top: 87%;
|
| 819 |
}
|
| 820 |
|
| 821 |
.puppet.sable {
|
| 822 |
left: 75%;
|
| 823 |
+
top: 87%;
|
| 824 |
--skin: #a86d4a;
|
| 825 |
--robe: #1d3045;
|
| 826 |
--accent: #254f7a;
|
| 827 |
}
|
| 828 |
|
| 829 |
.speaker-sable .puppet.sable {
|
| 830 |
+
left: 75%;
|
| 831 |
+
top: 87%;
|
| 832 |
}
|
| 833 |
|
| 834 |
.puppet-portrait {
|
|
|
|
| 844 |
pointer-events: none;
|
| 845 |
}
|
| 846 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 847 |
.puppet::before {
|
| 848 |
content: "";
|
| 849 |
position: absolute;
|
|
|
|
| 876 |
linear-gradient(180deg, var(--accent), var(--robe) 52%, #130a07);
|
| 877 |
}
|
| 878 |
|
| 879 |
+
.puppet.judge::before,
|
| 880 |
+
.puppet.judge::after {
|
| 881 |
+
display: none;
|
| 882 |
+
}
|
| 883 |
+
|
| 884 |
.puppet .mouth {
|
| 885 |
position: absolute;
|
| 886 |
left: 50%;
|
|
|
|
| 893 |
border-radius: 0 0 18px 18px;
|
| 894 |
}
|
| 895 |
|
| 896 |
+
.puppet.judge .mouth {
|
| 897 |
+
display: none;
|
| 898 |
+
}
|
| 899 |
+
|
| 900 |
.puppet.active .mouth,
|
| 901 |
.puppet.walking .mouth {
|
| 902 |
animation: speak-mouth .5s ease-in-out infinite;
|
| 903 |
}
|
| 904 |
|
| 905 |
+
.speech-bubble.active-dialogue {
|
| 906 |
position: absolute;
|
| 907 |
left: 50%;
|
| 908 |
+
top: 43%;
|
| 909 |
+
bottom: auto;
|
| 910 |
+
z-index: 30;
|
| 911 |
+
width: min(500px, calc(100vw - 44px));
|
| 912 |
+
max-height: 34vh;
|
| 913 |
+
overflow: visible;
|
| 914 |
+
transform: translate(-50%, -100%);
|
| 915 |
+
padding: 10px 13px 11px;
|
| 916 |
+
border: 2px solid #141413;
|
| 917 |
+
border-radius: 20px;
|
| 918 |
+
background: rgba(255, 253, 247, .97);
|
| 919 |
+
color: #141413 !important;
|
| 920 |
+
box-shadow: 0 12px 24px rgba(0, 0, 0, .32);
|
| 921 |
font-size: 12px;
|
| 922 |
+
font-weight: 650;
|
| 923 |
+
line-height: 1.32;
|
| 924 |
pointer-events: none;
|
| 925 |
}
|
| 926 |
|
| 927 |
+
.speech-bubble.active-dialogue,
|
| 928 |
+
.speech-bubble.active-dialogue * {
|
| 929 |
+
color: #141413 !important;
|
| 930 |
+
}
|
| 931 |
+
|
| 932 |
+
.speech-bubble.active-dialogue::before,
|
| 933 |
+
.speech-bubble.active-dialogue::after {
|
| 934 |
content: "";
|
| 935 |
position: absolute;
|
| 936 |
+
left: var(--bubble-tail-x, 50%);
|
| 937 |
+
display: block;
|
|
|
|
|
|
|
| 938 |
transform: translateX(-50%) rotate(45deg);
|
| 939 |
+
}
|
| 940 |
+
|
| 941 |
+
.speech-bubble.active-dialogue::before {
|
| 942 |
+
bottom: -13px;
|
| 943 |
+
width: 22px;
|
| 944 |
+
height: 22px;
|
| 945 |
+
background: #141413;
|
| 946 |
+
border-radius: 0 0 5px 0;
|
| 947 |
+
}
|
| 948 |
+
|
| 949 |
+
.speech-bubble.active-dialogue::after {
|
| 950 |
+
bottom: -9px;
|
| 951 |
+
width: 16px;
|
| 952 |
+
height: 16px;
|
| 953 |
+
transform: translateX(-50%) rotate(45deg);
|
| 954 |
+
background: rgba(255, 253, 247, .97);
|
| 955 |
+
border-radius: 0 0 3px 0;
|
| 956 |
+
}
|
| 957 |
+
|
| 958 |
+
.speech-bubble.active-dialogue.pending {
|
| 959 |
+
opacity: .82;
|
| 960 |
+
}
|
| 961 |
+
|
| 962 |
+
.dialogue-meta {
|
| 963 |
+
display: flex;
|
| 964 |
+
align-items: baseline;
|
| 965 |
+
gap: 6px;
|
| 966 |
+
margin-bottom: 5px;
|
| 967 |
+
font: 800 9px/1.2 ui-monospace, SFMono-Regular, Consolas, monospace;
|
| 968 |
+
text-transform: uppercase;
|
| 969 |
+
}
|
| 970 |
+
|
| 971 |
+
.dialogue-meta strong {
|
| 972 |
+
font-size: 10px;
|
| 973 |
+
}
|
| 974 |
+
|
| 975 |
+
.dialogue-text {
|
| 976 |
+
max-height: calc(34vh - 42px);
|
| 977 |
+
overflow: auto;
|
| 978 |
+
white-space: pre-wrap;
|
| 979 |
+
}
|
| 980 |
+
|
| 981 |
+
.speech-bubble.active-dialogue.speaker-clerk { left: 43%; top: 62%; }
|
| 982 |
+
.speech-bubble.active-dialogue.speaker-judge { left: 50%; top: 43%; }
|
| 983 |
+
.speech-bubble.active-dialogue.speaker-auric { left: 43%; top: 78%; }
|
| 984 |
+
.speech-bubble.active-dialogue.speaker-sable { left: 75%; top: 78%; }
|
| 985 |
+
.speech-bubble.active-dialogue.juror-dialogue { left: 50%; top: 57%; }
|
| 986 |
+
.speech-bubble.active-dialogue.juror-dialogue {
|
| 987 |
+
top: 42%;
|
| 988 |
+
width: min(340px, calc(50vw - 24px));
|
| 989 |
+
}
|
| 990 |
+
|
| 991 |
+
.speech-bubble.active-dialogue.speaker-karl-marx,
|
| 992 |
+
.speech-bubble.active-dialogue.speaker-john-stuart-mill,
|
| 993 |
+
.speech-bubble.active-dialogue.speaker-confucius {
|
| 994 |
+
left: 1.5%;
|
| 995 |
+
transform: translateY(-100%);
|
| 996 |
+
}
|
| 997 |
+
|
| 998 |
+
.speech-bubble.active-dialogue.speaker-cleopatra-vii,
|
| 999 |
+
.speech-bubble.active-dialogue.speaker-niccolo-machiavelli,
|
| 1000 |
+
.speech-bubble.active-dialogue.speaker-jensen-huang {
|
| 1001 |
+
right: 1.5%;
|
| 1002 |
+
left: auto;
|
| 1003 |
+
transform: translateY(-100%);
|
| 1004 |
+
}
|
| 1005 |
+
|
| 1006 |
+
.speech-bubble.active-dialogue.speaker-karl-marx,
|
| 1007 |
+
.speech-bubble.active-dialogue.speaker-cleopatra-vii {
|
| 1008 |
+
--bubble-tail-x: 19%;
|
| 1009 |
+
}
|
| 1010 |
+
|
| 1011 |
+
.speech-bubble.active-dialogue.speaker-john-stuart-mill,
|
| 1012 |
+
.speech-bubble.active-dialogue.speaker-niccolo-machiavelli {
|
| 1013 |
+
--bubble-tail-x: 50%;
|
| 1014 |
+
}
|
| 1015 |
+
|
| 1016 |
+
.speech-bubble.active-dialogue.speaker-confucius,
|
| 1017 |
+
.speech-bubble.active-dialogue.speaker-jensen-huang {
|
| 1018 |
+
--bubble-tail-x: 81%;
|
| 1019 |
+
}
|
| 1020 |
+
|
| 1021 |
+
.verdict-popup {
|
| 1022 |
+
position: absolute;
|
| 1023 |
+
left: 50%;
|
| 1024 |
+
top: 54%;
|
| 1025 |
+
z-index: 42;
|
| 1026 |
+
width: min(460px, calc(100vw - 44px));
|
| 1027 |
+
transform: translate(-50%, -50%);
|
| 1028 |
+
padding: 18px 22px 20px;
|
| 1029 |
+
border: 2px solid rgba(255, 235, 178, .94);
|
| 1030 |
+
border-radius: 8px;
|
| 1031 |
+
background: rgba(20, 12, 7, .95);
|
| 1032 |
+
color: #fff4d6;
|
| 1033 |
+
text-align: center;
|
| 1034 |
+
box-shadow: 0 28px 58px rgba(0, 0, 0, .5);
|
| 1035 |
+
animation: verdict-pop .34s ease-out both;
|
| 1036 |
+
}
|
| 1037 |
+
|
| 1038 |
+
.verdict-popup-kicker {
|
| 1039 |
+
display: block;
|
| 1040 |
+
margin-bottom: 7px;
|
| 1041 |
+
color: #d9b060;
|
| 1042 |
+
font: 800 11px/1 ui-monospace, SFMono-Regular, Consolas, monospace;
|
| 1043 |
+
letter-spacing: 0;
|
| 1044 |
+
text-transform: uppercase;
|
| 1045 |
+
}
|
| 1046 |
+
|
| 1047 |
+
.verdict-popup-finding {
|
| 1048 |
+
display: block;
|
| 1049 |
+
color: #fff8e6;
|
| 1050 |
+
font: 900 clamp(28px, 5vw, 48px)/1.02 Georgia, serif;
|
| 1051 |
+
}
|
| 1052 |
+
|
| 1053 |
+
.verdict-popup-decree {
|
| 1054 |
+
margin: 10px auto 0;
|
| 1055 |
+
max-width: 38ch;
|
| 1056 |
+
color: rgba(255, 244, 214, .86);
|
| 1057 |
+
font-size: 13px;
|
| 1058 |
+
line-height: 1.35;
|
| 1059 |
}
|
| 1060 |
|
| 1061 |
.tooltip {
|
|
|
|
| 1190 |
animation: juror-react .82s ease-in-out infinite alternate;
|
| 1191 |
}
|
| 1192 |
|
| 1193 |
.juror-face {
|
| 1194 |
position: absolute;
|
| 1195 |
left: 50%;
|
|
|
|
| 1449 |
100% { transform: rotate(-18deg) translateY(0); }
|
| 1450 |
}
|
| 1451 |
|
| 1452 |
+
@keyframes verdict-pop {
|
| 1453 |
+
0% {
|
| 1454 |
+
opacity: 0;
|
| 1455 |
+
transform: translate(-50%, -46%) scale(.94);
|
| 1456 |
+
}
|
| 1457 |
+
100% {
|
| 1458 |
+
opacity: 1;
|
| 1459 |
+
transform: translate(-50%, -50%) scale(1);
|
| 1460 |
+
}
|
| 1461 |
+
}
|
| 1462 |
+
|
| 1463 |
@media (max-width: 820px) {
|
| 1464 |
.docket-book-controls {
|
| 1465 |
position: fixed;
|
| 1466 |
+
top: 130px;
|
| 1467 |
width: calc(100vw - 52px);
|
| 1468 |
transform: translateX(-50%) rotate(-1deg);
|
| 1469 |
}
|
| 1470 |
|
| 1471 |
+
.trial-progress {
|
| 1472 |
+
grid-template-columns: repeat(8, minmax(24px, 1fr));
|
| 1473 |
+
padding: 2px 5px 3px;
|
| 1474 |
+
}
|
| 1475 |
+
|
| 1476 |
+
.trial-progress-segment {
|
| 1477 |
+
font-size: 9px;
|
| 1478 |
+
letter-spacing: 0;
|
| 1479 |
+
}
|
| 1480 |
+
|
| 1481 |
+
.trial-progress-label {
|
| 1482 |
+
display: none;
|
| 1483 |
+
}
|
| 1484 |
+
|
| 1485 |
+
.trial-progress-abbrev {
|
| 1486 |
+
display: inline;
|
| 1487 |
+
}
|
| 1488 |
+
|
| 1489 |
.court-episode-stage {
|
| 1490 |
height: 1280px;
|
| 1491 |
min-height: 1280px;
|
|
|
|
| 1508 |
max-width: calc(100% - 32px);
|
| 1509 |
}
|
| 1510 |
|
| 1511 |
+
.speech-bubble.active-dialogue,
|
| 1512 |
+
.speech-bubble.active-dialogue.speaker-clerk,
|
| 1513 |
+
.speech-bubble.active-dialogue.speaker-judge,
|
| 1514 |
+
.speech-bubble.active-dialogue.speaker-auric,
|
| 1515 |
+
.speech-bubble.active-dialogue.speaker-sable,
|
| 1516 |
+
.speech-bubble.active-dialogue.juror-dialogue {
|
| 1517 |
+
left: 50%;
|
| 1518 |
+
top: 218px;
|
| 1519 |
+
width: calc(100% - 28px);
|
| 1520 |
+
max-height: 260px;
|
| 1521 |
+
transform: translateX(-50%);
|
| 1522 |
+
}
|
| 1523 |
+
|
| 1524 |
+
.speech-bubble.active-dialogue::after {
|
| 1525 |
+
display: none;
|
| 1526 |
+
}
|
| 1527 |
+
|
| 1528 |
+
.speech-bubble.active-dialogue.juror-dialogue,
|
| 1529 |
+
.speech-bubble.active-dialogue.speaker-karl-marx,
|
| 1530 |
+
.speech-bubble.active-dialogue.speaker-john-stuart-mill,
|
| 1531 |
+
.speech-bubble.active-dialogue.speaker-confucius,
|
| 1532 |
+
.speech-bubble.active-dialogue.speaker-cleopatra-vii,
|
| 1533 |
+
.speech-bubble.active-dialogue.speaker-niccolo-machiavelli,
|
| 1534 |
+
.speech-bubble.active-dialogue.speaker-jensen-huang {
|
| 1535 |
+
top: 500px;
|
| 1536 |
+
width: min(320px, calc(100vw - 28px));
|
| 1537 |
+
transform: translateY(-100%);
|
| 1538 |
+
}
|
| 1539 |
+
|
| 1540 |
+
.speech-bubble.active-dialogue.speaker-karl-marx,
|
| 1541 |
+
.speech-bubble.active-dialogue.speaker-john-stuart-mill,
|
| 1542 |
+
.speech-bubble.active-dialogue.speaker-confucius {
|
| 1543 |
+
left: 14px;
|
| 1544 |
+
right: auto;
|
| 1545 |
+
}
|
| 1546 |
+
|
| 1547 |
+
.speech-bubble.active-dialogue.speaker-cleopatra-vii,
|
| 1548 |
+
.speech-bubble.active-dialogue.speaker-niccolo-machiavelli,
|
| 1549 |
+
.speech-bubble.active-dialogue.speaker-jensen-huang {
|
| 1550 |
+
right: 14px;
|
| 1551 |
+
left: auto;
|
| 1552 |
+
}
|
| 1553 |
+
|
| 1554 |
.episode-book {
|
| 1555 |
+
top: 218px;
|
| 1556 |
width: min(680px, calc(100% - 20px));
|
| 1557 |
}
|
| 1558 |
|
| 1559 |
.episode-book.closed {
|
| 1560 |
+
top: 640px;
|
| 1561 |
+
width: 140px;
|
| 1562 |
}
|
| 1563 |
|
| 1564 |
.book-open-content {
|
| 1565 |
grid-template-columns: 1fr;
|
| 1566 |
gap: 10px;
|
| 1567 |
+
inset: 15% 11% 13%;
|
| 1568 |
+
padding: 0 16px;
|
| 1569 |
}
|
| 1570 |
|
| 1571 |
.book-open-content h2 {
|
|
|
|
| 1583 |
margin: 5px 0;
|
| 1584 |
}
|
| 1585 |
|
| 1586 |
+
.book-evidence-columns {
|
| 1587 |
+
grid-template-columns: 1fr 1fr;
|
| 1588 |
+
gap: 8px;
|
| 1589 |
+
}
|
| 1590 |
+
|
| 1591 |
+
.book-evidence-list li {
|
| 1592 |
+
font-size: 10px;
|
| 1593 |
+
line-height: 1.12;
|
| 1594 |
+
}
|
| 1595 |
+
|
| 1596 |
+
.book-field {
|
| 1597 |
+
min-height: 34px;
|
| 1598 |
+
font-size: 10px;
|
| 1599 |
+
}
|
| 1600 |
+
|
| 1601 |
+
.book-context-field {
|
| 1602 |
+
min-height: 84px;
|
| 1603 |
+
}
|
| 1604 |
+
|
| 1605 |
.judge-dais {
|
| 1606 |
top: 390px;
|
| 1607 |
width: 280px;
|
|
|
|
| 1623 |
|
| 1624 |
.puppet.auric {
|
| 1625 |
left: 20%;
|
| 1626 |
+
top: 970px;
|
| 1627 |
}
|
| 1628 |
|
| 1629 |
.puppet.sable {
|
| 1630 |
left: 80%;
|
| 1631 |
+
top: 970px;
|
| 1632 |
}
|
| 1633 |
|
| 1634 |
.speaker-auric .puppet.auric {
|
| 1635 |
left: 42%;
|
| 1636 |
+
top: 970px;
|
| 1637 |
}
|
| 1638 |
|
| 1639 |
.speaker-sable .puppet.sable {
|
| 1640 |
+
left: 80%;
|
| 1641 |
+
top: 970px;
|
| 1642 |
}
|
| 1643 |
|
| 1644 |
.puppet.clerk {
|
| 1645 |
left: 35%;
|
| 1646 |
+
top: 880px;
|
| 1647 |
}
|
| 1648 |
|
| 1649 |
.witness-area {
|
|
|
|
| 1659 |
}
|
| 1660 |
|
| 1661 |
.jury-benches.left {
|
| 1662 |
+
left: .5%;
|
| 1663 |
}
|
| 1664 |
|
| 1665 |
.jury-benches.right {
|
| 1666 |
+
right: .5%;
|
| 1667 |
}
|
| 1668 |
|
| 1669 |
.foreground-fence {
|
| 1670 |
+
bottom: -66px;
|
| 1671 |
width: 64%;
|
| 1672 |
}
|
| 1673 |
|
|
|
|
| 1680 |
}
|
| 1681 |
|
| 1682 |
.judge-table-foreground {
|
| 1683 |
+
top: 213px;
|
| 1684 |
+
width: 646px;
|
| 1685 |
}
|
| 1686 |
|
| 1687 |
.evidence-props {
|
|
|
|
| 1870 |
"""
|
| 1871 |
|
| 1872 |
START_JS = """
|
| 1873 |
+
(case_label, search_query, hypothetical, custom_payload, speed, mind_layer) => {
|
| 1874 |
+
const book = document.querySelector('.episode-book.custom-book');
|
| 1875 |
+
const collect = (selector) => Array.from(document.querySelectorAll(selector)).map((node) => node.value || '');
|
| 1876 |
+
const payload = book ? JSON.stringify({
|
| 1877 |
+
context: document.querySelector('.book-context-field')?.value || '',
|
| 1878 |
+
claimant_evidence: collect('.book-claimant-field'),
|
| 1879 |
+
respondent_evidence: collect('.book-respondent-field')
|
| 1880 |
+
}) : (custom_payload || '');
|
| 1881 |
+
if (book) {
|
| 1882 |
+
const data = JSON.parse(payload);
|
| 1883 |
+
const hasContext = data.context.trim().length > 0;
|
| 1884 |
+
const hasClaimant = data.claimant_evidence.some((value) => value.trim().length > 0);
|
| 1885 |
+
const hasRespondent = data.respondent_evidence.some((value) => value.trim().length > 0);
|
| 1886 |
+
if (!hasContext || !hasClaimant || !hasRespondent) {
|
| 1887 |
+
return [case_label, search_query, hypothetical, payload, speed, mind_layer];
|
| 1888 |
+
}
|
| 1889 |
+
}
|
| 1890 |
document.body.classList.add('trial-has-started');
|
| 1891 |
if (window.SovereignCourtAudio) {
|
| 1892 |
window.SovereignCourtAudio.begin();
|
| 1893 |
}
|
| 1894 |
+
return [case_label, search_query, hypothetical, payload, speed, mind_layer];
|
| 1895 |
}
|
| 1896 |
"""
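`START_JS` serialises the custom-case fields into a JSON string with the keys `context`, `claimant_evidence`, and `respondent_evidence`, and skips the start animation when any of the three is empty. The Python side that consumes this payload is not shown in this hunk. A server-side check mirroring the same rule could look roughly like the sketch below; `parse_custom_payload` is a hypothetical helper, not a function from the PR.

```python
import json
from typing import Optional


def _non_empty(values: object) -> list:
    """Keep only non-blank strings from a list-like JSON value."""
    if not isinstance(values, list):
        return []
    return [str(value).strip() for value in values if str(value).strip()]


def parse_custom_payload(raw: str) -> Optional[dict]:
    """Hypothetical validator: return cleaned fields, or None if incomplete."""
    try:
        data = json.loads(raw or "")
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    context = str(data.get("context", "")).strip()
    claimant = _non_empty(data.get("claimant_evidence"))
    respondent = _non_empty(data.get("respondent_evidence"))
    # Same rule as the JS guard: all three parts must have content.
    if not context or not claimant or not respondent:
        return None
    return {
        "context": context,
        "claimant_evidence": claimant,
        "respondent_evidence": respondent,
    }
```

Repeating the check on the server avoids trusting the browser alone, since the JS guard only decides whether the start animation and audio begin.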
|
| 1897 |
|
|
|
|
| 1909 |
"role": "Court clerk",
|
| 1910 |
"model": "AgentCPM-Explore",
|
| 1911 |
},
|
| 1912 |
+
"Mike OSS": {
|
| 1913 |
"class": "auric",
|
| 1914 |
+
"name": "Mike OSS",
|
| 1915 |
"role": "Claimant advocate",
|
| 1916 |
"model": "gpt-oss-20b",
|
| 1917 |
},
|
| 1918 |
+
"Harvey Vector": {
|
| 1919 |
"class": "sable",
|
| 1920 |
+
"name": "Harvey Vector",
|
| 1921 |
"role": "Respondent advocate",
|
| 1922 |
"model": "gpt-oss-20b",
|
| 1923 |
},
|
| 1924 |
"Nemotron Jury": {
|
| 1925 |
"class": "jury",
|
| 1926 |
"name": "Nemotron Jury",
|
|
|
|
| 1947 |
"Jensen Huang": "/gradio_api/file=assets/characters/jensen-huang.png",
|
| 1948 |
}
|
| 1949 |
|
| 1950 |
+
TRIAL_TURN_ORDER = (
|
| 1951 |
+
"Clerk Meridian",
|
| 1952 |
+
JUDGE_NAME,
|
| 1953 |
+
"Mike OSS",
|
| 1954 |
+
"Harvey Vector",
|
| 1955 |
+
JUDGE_NAME,
|
| 1956 |
+
"Mike OSS",
|
| 1957 |
+
"Harvey Vector",
|
| 1958 |
+
"Nemotron Jury",
|
| 1959 |
+
*JUROR_PERSONAS.keys(),
|
| 1960 |
+
JUDGE_NAME,
|
| 1961 |
+
)
|
| 1962 |
+
|
| 1963 |
PHASE_AGENTS = {
|
| 1964 |
"pretrial": ["Clerk Meridian"],
|
| 1965 |
}
|
| 1966 |
|
| 1967 |
|
| 1968 |
+
@dataclass(frozen=True)
|
| 1969 |
+
class SpeakerCue:
|
| 1970 |
+
name: str
|
| 1971 |
+
role: str
|
| 1972 |
+
text: str
|
| 1973 |
+
pending: bool = False
|
| 1974 |
+
|
| 1975 |
+
|
| 1976 |
+
_EVENT_STREAM_DONE = object()
|
| 1977 |
+
|
| 1978 |
+
|
| 1979 |
def _remote_events(request: TrialRequest) -> Iterable[TrialEvent] | None:
|
| 1980 |
+
endpoint = os.getenv("MODAL_TRIAL_URL", DEFAULT_MODAL_TRIAL_URL).strip()
|
| 1981 |
if not endpoint:
|
| 1982 |
return None
|
| 1983 |
|
|
|
|
| 1991 |
return iterator()
|
| 1992 |
|
| 1993 |
|
| 1994 |
+
def get_events(request: TrialRequest, delay: float | None = None) -> Iterable[TrialEvent]:
|
| 1995 |
remote = _remote_events(request)
|
| 1996 |
if remote is not None:
|
| 1997 |
yield from remote
|
| 1998 |
return
|
| 1999 |
+
stream_delay = {"swift": 1.4, "measured": 2.4, "ceremonial": 3.4}[request.speed] if delay is None else delay
|
| 2000 |
+
yield from stream_trial(request, delay=stream_delay)
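The delay selection in `get_events` can be read in isolation as the sketch below. The mapping values are taken from the diff; `resolve_delay` and its `fallback` parameter are hypothetical additions for illustration, since the code in the diff indexes the dictionary directly and would raise `KeyError` for an unknown speed.

```python
from typing import Optional

# Values copied from the diff: seconds between streamed events per speed.
SPEED_DELAYS = {"swift": 1.4, "measured": 2.4, "ceremonial": 3.4}


def resolve_delay(speed: str, delay: Optional[float] = None, fallback: float = 1.4) -> float:
    """Return the explicit delay if one was given, else the per-speed default."""
    if delay is not None:
        # `is not None` matters: an explicit 0.0 must not fall through.
        return delay
    # Assumption: unknown speeds fall back instead of raising KeyError.
    return SPEED_DELAYS.get(speed, fallback)
```

The `is None` comparison is what lets a caller pass `delay=0.0` to disable pacing inside the stream; a plain truthiness test would silently replace `0.0` with the per-speed default.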
|
| 2001 |
|
| 2002 |
|
| 2003 |
def _escape(value: str) -> str:
|
|
|
|
| 2037 |
return event.turns[0].agent
|
| 2038 |
|
| 2039 |
|
| 2040 |
+
def _role_for_speaker(name: str, event: TrialEvent | None = None) -> str:
|
| 2041 |
+
if event is not None:
|
| 2042 |
+
turn = next((turn for turn in event.turns if turn.agent == name), None)
|
| 2043 |
+
if turn is not None:
|
| 2044 |
+
return turn.role
|
| 2045 |
+
if name in CHARACTERS:
|
| 2046 |
+
return CHARACTERS[name]["role"]
|
| 2047 |
+
if name in JUROR_FACES:
|
| 2048 |
+
return "juror"
|
| 2049 |
+
return "speaker"
|
| 2050 |
+
|
| 2051 |
+
|
| 2052 |
+
def _expected_next_speaker(events: list[TrialEvent]) -> SpeakerCue | None:
|
| 2053 |
+
if len(events) >= len(TRIAL_TURN_ORDER):
|
| 2054 |
+
return None
|
| 2055 |
+
name = TRIAL_TURN_ORDER[len(events)]
|
| 2056 |
+
role = _role_for_speaker(name)
|
| 2057 |
+
return SpeakerCue(name=name, role=role, text=f"{name} is preparing a response.", pending=True)
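`_expected_next_speaker` uses the number of events received so far as an index into `TRIAL_TURN_ORDER`. A self-contained sketch of that lookup follows; the shortened turn order and the literal `"Judge"` are placeholders, because the real `JUDGE_NAME` and `JUROR_PERSONAS` values live in `sovereign_bench/engine.py` and are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpeakerCue:
    name: str
    role: str
    text: str
    pending: bool = False


# Placeholder order for illustration only; the PR builds the real tuple
# from JUDGE_NAME and JUROR_PERSONAS.
TURN_ORDER = ("Clerk Meridian", "Judge", "Mike OSS", "Harvey Vector")


def expected_next_speaker(events_seen: int) -> Optional[SpeakerCue]:
    """Return a pending cue for the next scripted speaker, if any remain."""
    if events_seen >= len(TURN_ORDER):
        return None
    name = TURN_ORDER[events_seen]
    return SpeakerCue(
        name=name,
        role="speaker",  # simplified; the PR resolves the role per character
        text=f"{name} is preparing a response.",
        pending=True,
    )
```

This works only while each event corresponds to exactly one turn in the scripted order; if the engine emits extra or fewer events, the pending cue would name the wrong speaker.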
|
| 2058 |
+
|
| 2059 |
+
|
| 2060 |
def _speaker_class_for(speaker: str) -> str:
|
| 2061 |
if not speaker:
|
| 2062 |
return ""
|
|
|
|
| 2074 |
return _short_text(turn.content, 210)
|
| 2075 |
|
| 2076 |
|
| 2077 |
+
def _active_speaker_cue(event: TrialEvent | None, pending_speaker: SpeakerCue | None = None) -> SpeakerCue | None:
|
| 2078 |
+
if pending_speaker is not None:
|
| 2079 |
+
return pending_speaker
|
| 2080 |
+
if event is None or not event.turns:
|
| 2081 |
+
return None
|
| 2082 |
+
turn = event.turns[0]
|
| 2083 |
+
text = turn.content.strip()
|
| 2084 |
+
if not text:
|
| 2085 |
+
return None
|
| 2086 |
+
return SpeakerCue(name=turn.agent, role=turn.role, text=text)
|
| 2087 |
+
|
| 2088 |
+
|
| 2089 |
+
def _reading_duration(text: str) -> float:
|
| 2090 |
+
word_count = len(text.split())
|
| 2091 |
+
return min(MAX_READ_SECONDS, max(MIN_READ_SECONDS, (word_count / WORDS_PER_SECOND) + READ_BUFFER_SECONDS))
|
| 2092 |
+
|
| 2093 |
+
|
| 2094 |
+
def _event_dialogue_text(event: TrialEvent) -> str:
|
| 2095 |
+
if event.turns:
|
| 2096 |
+
return event.turns[0].content
|
| 2097 |
+
return event.body
|
| 2098 |
+
|
| 2099 |
+
|
| 2100 |
+
def _event_status(event: TrialEvent, step: int) -> str:
|
| 2101 |
+
if event.turns:
|
| 2102 |
+
return f"Step {step}: {event.turns[0].agent} - {event.title}"
|
| 2103 |
+
return f"Step {step}: {event.title}"
|
| 2104 |
+
|
| 2105 |
+
|
| 2106 |
+
def _pending_status(cue: SpeakerCue | None) -> str:
|
| 2107 |
+
if cue is None:
|
| 2108 |
+
return "The court is preparing the next turn."
|
| 2109 |
+
return f"{cue.name} is preparing their response."
|
| 2110 |
+
|
| 2111 |
+
|
| 2112 |
+
def _start_event_producer(request: TrialRequest) -> queue.Queue[object]:
|
| 2113 |
+
events: queue.Queue[object] = queue.Queue()
|
| 2114 |
+
|
| 2115 |
+
def produce() -> None:
|
| 2116 |
+
try:
|
| 2117 |
+
try:
|
| 2118 |
+
stream = get_events(request, delay=0.0)
|
| 2119 |
+
except TypeError:
|
| 2120 |
+
stream = get_events(request)
|
| 2121 |
+
for event in stream:
|
| 2122 |
+
events.put(event)
|
| 2123 |
+
except Exception as exc:
|
| 2124 |
+
events.put(exc)
|
| 2125 |
+
finally:
|
| 2126 |
+
events.put(_EVENT_STREAM_DONE)
|
| 2127 |
+
|
| 2128 |
+
threading.Thread(target=produce, name="trial-event-producer", daemon=True).start()
|
| 2129 |
+
return events
|
| 2130 |
+
|
| 2131 |
+
|
| 2132 |
def _thread_id(name: str) -> str:
|
| 2133 |
return "ai-thread-" + "".join(ch.lower() if ch.isalnum() else "-" for ch in name).strip("-")
|
| 2134 |
|
|
|
|
| 2216 |
)
|
| 2217 |
|
| 2218 |
|
| 2219 |
+
def _active_dialogue(cue: SpeakerCue | None) -> str:
|
| 2220 |
+
if cue is None:
|
| 2221 |
+
return ""
|
| 2222 |
+
speaker_cls = _speaker_class_for(cue.name).strip()
|
| 2223 |
+
classes = ["speech-bubble", "active-dialogue"]
|
| 2224 |
+
if speaker_cls:
|
| 2225 |
+
classes.append(speaker_cls)
|
| 2226 |
+
if cue.name in JUROR_FACES:
|
| 2227 |
+
classes.append("juror-dialogue")
|
| 2228 |
+
if cue.pending:
|
| 2229 |
+
classes.append("pending")
|
| 2230 |
+
pending_attr = " data-pending='true'" if cue.pending else ""
|
| 2231 |
+
return (
|
| 2232 |
+
f"<div class='{' '.join(classes)}' data-speaker='{_escape(cue.name)}'{pending_attr}>"
|
| 2233 |
+
"<div class='dialogue-meta'>"
|
| 2234 |
+
f"<strong>{_escape(cue.name)}</strong>"
|
| 2235 |
+
f"<span>{_escape(cue.role)}</span>"
|
| 2236 |
+
"</div>"
|
| 2237 |
+
f"<div class='dialogue-text'>{_escape(cue.text)}</div>"
|
| 2238 |
+
"</div>"
|
| 2239 |
+
)
|
| 2240 |
+
|
| 2241 |
+
|
| 2242 |
+
def _verdict_popup(events: list[TrialEvent], show: bool) -> str:
|
| 2243 |
+
if not show:
|
| 2244 |
+
return ""
|
| 2245 |
+
verdict = next((event.verdict for event in reversed(events) if event.verdict is not None), None)
|
| 2246 |
+
if verdict is None:
|
| 2247 |
+
return ""
|
| 2248 |
+
finding = VERDICT_LABELS.get(verdict.finding, verdict.finding.replace("_", " ").title())
|
| 2249 |
+
return (
|
| 2250 |
+
f"<div class='verdict-popup' role='alert' aria-live='assertive' data-finding='{_escape(verdict.finding)}'>"
|
| 2251 |
+
"<span class='verdict-popup-kicker'>Verdict</span>"
|
| 2252 |
+
f"<strong class='verdict-popup-finding'>Verdict: {_escape(finding)}</strong>"
|
| 2253 |
+
f"<p class='verdict-popup-decree'>{_escape(verdict.decree)}</p>"
|
| 2254 |
+
"</div>"
|
| 2255 |
+
)
|
| 2256 |
+
|
| 2257 |
+
|
| 2258 |
def _puppet(agent: str, active_agents: set[str], phase: str, events: list[TrialEvent], latest: TrialEvent | None) -> str:
|
| 2259 |
meta = CHARACTERS[agent]
|
| 2260 |
active = " active" if agent in active_agents else ""
|
| 2261 |
+
walking = " walking" if agent in {"Mike OSS", "Harvey Vector"} and agent in active_agents else ""
|
| 2262 |
+
small = " small" if agent == "Clerk Meridian" else ""
|
| 2263 |
turns = _thread_for_character(events, agent)
|
| 2264 |
portrait = ""
|
| 2265 |
if meta.get("image"):
|
| 2266 |
portrait = (
|
|
|
|
| 2271 |
f"<a class='puppet {meta['class']}{active}{walking}{small}' href='#{_escape(_thread_id(agent))}' aria-label='Open {_escape(agent)} model thread'>"
|
| 2272 |
f"{portrait}"
|
| 2273 |
"<span class='mouth'></span>"
|
|
|
|
| 2274 |
f"{_tooltip(meta['name'], meta['role'], meta['model'], turns)}"
|
| 2275 |
"</a>"
|
| 2276 |
)
|
|
|
|
| 2281 |
image = JUROR_IMAGES.get(name, "")
|
| 2282 |
active_cls = " active" if active else ""
|
| 2283 |
turns = _thread_for_character(events or [], name)
|
| 2284 |
portrait = (
|
| 2285 |
f"<img class='juror-portrait' src='{_escape(image)}' alt='{_escape(name)} bust' "
|
| 2286 |
"onerror=\"this.style.display='none'\">"
|
| 2287 |
if image
|
| 2288 |
else ""
|
| 2289 |
)
|
| 2290 |
+
fallback_art = "" if image else "<span class='juror-face'></span><span class='juror-body'></span>"
|
| 2291 |
return (
|
| 2292 |
f"<a class='juror{active_cls}' href='#{_escape(_thread_id(name))}' style='--face: {face}' aria-label='Open {_escape(name)} model thread'>"
|
| 2293 |
f"{portrait}"
|
| 2294 |
+
f"{fallback_art}"
|
|
|
|
| 2295 |
f"{_tooltip(name, 'HF-style juror', 'Nemotron panel', turns)}"
|
| 2296 |
"</a>"
|
| 2297 |
)
|
| 2298 |
|
| 2299 |
|
| 2300 |
+
def _packet_for_label(case_label: str) -> CasePacket:
|
| 2301 |
+
return get_case(CASE_OPTIONS.get(case_label, "socrates"))
|
| 2302 |
+
|
| 2303 |
+
|
| 2304 |
+
def _split_evidence(packet: CasePacket) -> tuple[list[EvidenceItem], list[EvidenceItem]]:
|
| 2305 |
+
claimant = [item for item in packet.evidence if item.supports == "claimant"]
|
| 2306 |
+
respondent = [item for item in packet.evidence if item.supports == "respondent"]
|
| 2307 |
+
if len(claimant) < 3:
|
| 2308 |
+
claimant.extend(item for item in packet.evidence if item.supports in {"mixed", "context"} and item not in claimant)
|
| 2309 |
+
if len(respondent) < 3:
|
| 2310 |
+
respondent.extend(item for item in packet.evidence if item.supports in {"mixed", "context"} and item not in respondent)
|
| 2311 |
+
return claimant[:3], respondent[:3]
|
| 2312 |
+
|
| 2313 |
+
|
| 2314 |
+
def _book_evidence_column(title: str, items: list[EvidenceItem]) -> str:
|
| 2315 |
+
entries = "".join(
|
| 2316 |
+
"<li>"
|
| 2317 |
+
f"<strong>{_escape(item.title)}</strong><br>"
|
| 2318 |
+
f"{_escape(item.note)}"
|
| 2319 |
+
"</li>"
|
| 2320 |
+
for item in items
|
| 2321 |
+
)
|
| 2322 |
+
return (
|
| 2323 |
+
"<section class='book-evidence-column'>"
|
| 2324 |
+
f"<h3>{_escape(title)}</h3>"
|
| 2325 |
+
f"<ul class='book-evidence-list'>{entries}</ul>"
|
| 2326 |
+
"</section>"
|
| 2327 |
+
)
|
| 2328 |
+
|
| 2329 |
+
|
| 2330 |
+
def _custom_evidence_fields(class_name: str, label: str) -> str:
|
| 2331 |
+
fields = "".join(
|
| 2332 |
+
f"<textarea class='book-field {class_name}' aria-label='{_escape(label)} {index}' "
|
| 2333 |
+
f"placeholder='{_escape(label)} {index}'></textarea>"
|
| 2334 |
+
for index in range(1, 4)
|
| 2335 |
+
)
|
| 2336 |
+
return f"<section class='book-evidence-column'><h3>{_escape(label)}</h3>{fields}</section>"
|
| 2337 |
+
|
| 2338 |
+
|
| 2339 |
+
def _book(open_book: bool, packet: CasePacket | None = None, custom_mode: bool = False) -> str:
|
| 2340 |
closed = "" if open_book else " closed"
|
| 2341 |
+
custom_class = " custom-book" if custom_mode and open_book else ""
|
| 2342 |
+
hidden_attr = "" if custom_mode and open_book else " aria-hidden='true'"
|
| 2343 |
+
packet = packet or get_case("socrates")
|
| 2344 |
+
if custom_mode and open_book:
|
| 2345 |
+
left_page = (
|
| 2346 |
+
"<section><h2>Trial details</h2>"
|
| 2347 |
+
"<textarea class='book-field book-context-field' aria-label='Custom trial details' "
|
| 2348 |
+
"placeholder='Write a short paragraph describing what happened and why the court is hearing it.'></textarea>"
|
| 2349 |
+
"</section>"
|
| 2350 |
+
)
|
| 2351 |
+
right_page = (
|
| 2352 |
+
"<section><h2>Evidence</h2><div class='book-evidence-columns'>"
|
| 2353 |
+
f"{_custom_evidence_fields('book-claimant-field', 'Evidence for Claimant')}"
|
| 2354 |
+
f"{_custom_evidence_fields('book-respondent-field', 'Evidence against Claimant')}"
|
| 2355 |
+
"</div></section>"
|
| 2356 |
+
)
|
| 2357 |
+
else:
|
| 2358 |
+
claimant_evidence, respondent_evidence = _split_evidence(packet)
|
| 2359 |
+
left_page = (
|
| 2360 |
+
"<section><h2>Trial details</h2>"
|
| 2361 |
+
f"<p class='book-case-title'>{_escape(packet.title)}</p>"
|
| 2362 |
+
f"<p class='book-context'>{_escape(packet.context or packet.setting)}</p>"
|
| 2363 |
+
f"<div class='book-entry'><strong>{_escape(packet.claimant)}</strong><br>{_escape(packet.claimant_claim)}</div>"
|
| 2364 |
+
f"<div class='book-entry'><strong>{_escape(packet.respondent)}</strong><br>{_escape(packet.respondent_claim)}</div>"
|
| 2365 |
+
"</section>"
|
| 2366 |
+
)
|
| 2367 |
+
right_page = (
|
| 2368 |
+
"<section><h2>Evidence</h2><div class='book-evidence-columns'>"
|
| 2369 |
+
f"{_book_evidence_column(f'Evidence for {packet.claimant}', claimant_evidence)}"
|
| 2370 |
+
f"{_book_evidence_column(f'Evidence for {packet.respondent}', respondent_evidence)}"
|
| 2371 |
+
"</div></section>"
|
| 2372 |
+
)
|
| 2373 |
return (
|
| 2374 |
+
f"<div class='episode-book{closed}{custom_class}'>"
|
| 2375 |
"<img class='book-art open-art' src='/gradio_api/file=assets/book/docket-book-open.png' alt='Open docket book'>"
|
| 2376 |
"<img class='book-art closed-art' src='/gradio_api/file=assets/book/docket-book-closed.png' alt='Closed docket book'>"
|
| 2377 |
+
f"<div class='book-open-content'{hidden_attr}>"
|
| 2378 |
+
f"{left_page}"
|
| 2379 |
+
f"{right_page}"
|
| 2380 |
+
"</div>"
|
| 2381 |
"</div>"
|
| 2382 |
)
|
| 2383 |
|
|
|
|
| 2420 |
)
|
| 2421 |
|
| 2422 |
|
| 2423 |
+
def _trial_progress(events: list[TrialEvent]) -> str:
|
| 2424 |
+
latest = events[-1] if events else None
|
| 2425 |
+
current_phase = latest.phase if latest else "pretrial"
|
| 2426 |
+
stage_keys = [key for key, _label in TRIAL_PROGRESS_STAGES]
|
| 2427 |
+
current_index = stage_keys.index(current_phase) if current_phase in stage_keys else None
|
| 2428 |
+
segments = []
|
| 2429 |
+
for index, (key, label) in enumerate(TRIAL_PROGRESS_STAGES):
|
| 2430 |
+
classes = ["trial-progress-segment"]
|
| 2431 |
+
attrs = [f"data-phase='{_escape(key)}'"]
|
| 2432 |
+
if current_index is not None and index < current_index:
|
| 2433 |
+
classes.append("complete")
|
| 2434 |
+
if current_index == index:
|
| 2435 |
+
classes.append("current")
|
| 2436 |
+
attrs.append("aria-current='step'")
|
| 2437 |
+
if key == "verdict":
|
| 2438 |
+
classes.append("complete")
|
| 2439 |
+
abbrev = label[:3]
|
| 2440 |
+
segments.append(
|
| 2441 |
+
f"<span class='{' '.join(classes)}' {' '.join(attrs)}>"
|
| 2442 |
+
f"<span class='trial-progress-label'>{_escape(label)}</span>"
|
| 2443 |
+
f"<span class='trial-progress-abbrev' aria-hidden='true'>{_escape(abbrev)}</span>"
|
| 2444 |
+
"</span>"
|
| 2445 |
+
)
|
| 2446 |
+
return (
|
| 2447 |
+
"<nav class='trial-progress' aria-label='Trial progress'>"
|
| 2448 |
+
+ "".join(segments)
|
| 2449 |
+
+ "</nav>"
|
| 2450 |
+
)
|
| 2451 |
+
|
| 2452 |
+
|
| 2453 |
def _courtroom_juror_names(votes: list) -> list[str]:
|
| 2454 |
names = list(JUROR_FACES)
|
| 2455 |
names.extend(vote.juror for vote in votes if vote.juror not in names)
|
|
|
|
| 2466 |
return ordered
|
| 2467 |
|
| 2468 |
|
| 2469 |
+
def render_court(
|
| 2470 |
+
events: list[TrialEvent],
|
| 2471 |
+
started: bool = False,
|
| 2472 |
+
pending_speaker: SpeakerCue | None = None,
|
| 2473 |
+
show_verdict_popup: bool = False,
|
| 2474 |
+
pretrial_case: CasePacket | None = None,
|
| 2475 |
+
custom_mode: bool = False,
|
| 2476 |
+
) -> str:
|
| 2477 |
latest = events[-1] if events else None
|
| 2478 |
phase = latest.phase if latest else "pretrial"
|
| 2479 |
title, subtitle = _latest_packet_title(events)
|
| 2480 |
+
active_cue = _active_speaker_cue(latest, pending_speaker)
|
| 2481 |
+
active_speaker = active_cue.name if active_cue is not None else _active_speaker_for(latest)
|
| 2482 |
+
active_agents = {active_speaker} if active_speaker else _active_agents_for(latest)
|
| 2483 |
speaker_cls = _speaker_class_for(active_speaker)
|
| 2484 |
caption_phase, caption_title, caption_body = _caption(latest, phase)
|
| 2485 |
latest_votes = _latest_votes(events)
|
|
|
|
| 2488 |
book_open = not started and not events
|
| 2489 |
puppets = "".join(
|
| 2490 |
_puppet(agent, active_agents, phase, events, latest)
|
| 2491 |
+
for agent in [JUDGE_NAME, "Clerk Meridian", "Mike OSS", "Harvey Vector"]
|
| 2492 |
)
|
| 2493 |
left_jurors = "".join(_juror(name, name == active_speaker, events, latest) for name in juror_names[:3])
|
| 2494 |
right_jurors = "".join(_juror(name, name == active_speaker, events, latest) for name in juror_names[3:6])
|
|
|
|
| 2502 |
)
|
| 2503 |
return (
|
| 2504 |
f"<section id='court-stage' class='court-episode-stage phase-{_escape(phase)}{_escape(speaker_cls)}{started_cls}' data-phase='{_escape(phase)}'>"
|
| 2505 |
+
f"{_trial_progress(events)}"
|
| 2506 |
"<div class='episode-room'></div>"
|
| 2507 |
"<div class='audio-deck' aria-hidden='true'>"
|
| 2508 |
+ "".join(f"<audio preload='auto' src='{_escape(src)}'></audio>" for src in AUDIO_PATHS.values())
|
|
|
|
| 2514 |
f"<h1>{_escape(title)}</h1>"
|
| 2515 |
f"<p>{_escape(subtitle)}</p></div>"
|
| 2516 |
f"<div class='decree-ribbon'>Step {len(events) if events else 0}: {caption_title}<br>Hover characters for agent and model details</div>"
|
| 2517 |
+
f"{_book(book_open, pretrial_case, custom_mode)}"
|
| 2518 |
f"<div class='judge-dais'><div class='prop-label'>{_escape(JUDGE_NAME)}</div><div class='bench-front'></div><span class='gavel'></span></div>"
|
| 2519 |
"<div class='counsel-table left'><div class='prop-label'>Claimant Table</div></div>"
|
| 2520 |
"<div class='counsel-table right'><div class='prop-label'>Respondent Table</div></div>"
|
|
|
|
| 2527 |
f"{puppets}"
|
| 2528 |
f"{evidence_props}"
|
| 2529 |
f"{_foreground_props()}"
|
| 2530 |
+
f"{_active_dialogue(active_cue)}"
|
| 2531 |
+
f"{_verdict_popup(events, show_verdict_popup)}"
|
| 2532 |
"<div class='gallery-benches'><div></div><div></div><div></div><div></div><div></div><div></div></div>"
|
| 2533 |
"<div class='trial-caption'>"
|
| 2534 |
f"<div class='caption-phase'>Live Trial Feed / {_escape(caption_phase)}</div>"
|
|
|
|
| 2599 |
return f"<pre class='mind-text'>{_escape(json.dumps(compact, indent=2))}</pre>"
|
| 2600 |
|
| 2601 |
|
| 2602 |
+
def _clean_custom_items(values: list[str]) -> list[str]:
|
| 2603 |
+
return [" ".join(value.split()) for value in values if " ".join(value.split())]
|
| 2604 |
+
|
| 2605 |
+
|
| 2606 |
+
def _custom_case_from_payload(payload: str) -> CasePacket:
|
| 2607 |
+
try:
|
| 2608 |
+
data = json.loads(payload or "{}")
|
| 2609 |
+
except json.JSONDecodeError as exc:
|
| 2610 |
+
raise ValueError("Custom case details could not be read from the docket book.") from exc
|
| 2611 |
+
context = " ".join(str(data.get("context", "")).split())
|
| 2612 |
+
claimant_items = _clean_custom_items([str(value) for value in data.get("claimant_evidence", [])])
|
| 2613 |
+
respondent_items = _clean_custom_items([str(value) for value in data.get("respondent_evidence", [])])
|
| 2614 |
+
if not context:
|
| 2615 |
+
raise ValueError("Custom requires a trial details paragraph.")
|
| 2616 |
+
if not claimant_items or not respondent_items:
|
| 2617 |
+
raise ValueError("Custom requires at least one evidence item for each side.")
|
| 2618 |
+
evidence = [
|
| 2619 |
+
EvidenceItem(
|
| 2620 |
+
id=f"CUS-F{index}",
|
| 2621 |
+
title=f"Claimant Evidence {index}",
|
| 2622 |
+
source="Custom docket entry",
|
| 2623 |
+
excerpt=item,
|
| 2624 |
+
supports="claimant",
|
| 2625 |
+
reliability=0.65,
|
| 2626 |
+
note=item,
|
| 2627 |
+
)
|
| 2628 |
+
for index, item in enumerate(claimant_items[:3], start=1)
|
| 2629 |
+
]
|
| 2630 |
+
evidence.extend(
|
| 2631 |
+
EvidenceItem(
|
| 2632 |
+
id=f"CUS-A{index}",
|
| 2633 |
+
title=f"Respondent Evidence {index}",
|
| 2634 |
+
source="Custom docket entry",
|
| 2635 |
+
excerpt=item,
|
| 2636 |
+
supports="respondent",
|
| 2637 |
+
reliability=0.65,
|
| 2638 |
+
note=item,
|
| 2639 |
+
)
|
| 2640 |
+
for index, item in enumerate(respondent_items[:3], start=1)
|
| 2641 |
+
)
|
| 2642 |
+
return CasePacket(
|
| 2643 |
+
id="custom",
|
| 2644 |
+
title="Custom Trial",
|
| 2645 |
+
subtitle="A custom docket assembled in the opening book.",
|
| 2646 |
+
claimant="Claimant",
|
| 2647 |
+
respondent="Respondent",
|
| 2648 |
+
charge="Whether the custom record supports the claimant or the respondent.",
|
| 2649 |
+
setting="A custom courtroom packet entered by the user.",
|
| 2650 |
+
context=context,
|
| 2651 |
+
claimant_claim="The claimant says the custom context and supporting evidence justify a favorable finding.",
|
| 2652 |
+
respondent_claim="The respondent says the custom context is incomplete, overread, or answered by contrary evidence.",
|
| 2653 |
+
source_note="Custom user-entered case packet from the docket book.",
|
| 2654 |
+
evidence=evidence,
|
| 2655 |
+
)
|
| 2656 |
+
|
| 2657 |
+
|
| 2658 |
+
def render_case_preview(case_label: str) -> str:
|
| 2659 |
+
case_id = CASE_OPTIONS.get(case_label, "socrates")
|
| 2660 |
+
return render_court(
|
| 2661 |
+
[],
|
| 2662 |
+
pretrial_case=get_case(case_id) if case_id != "custom" else None,
|
| 2663 |
+
custom_mode=case_id == "custom",
|
| 2664 |
+
)
|
| 2665 |
+
|
| 2666 |
+
|
| 2667 |
+
def run_ui(
|
| 2668 |
+
case_label: str,
|
| 2669 |
+
search_query: str,
|
| 2670 |
+
hypothetical: str,
|
| 2671 |
+
custom_payload: str,
|
| 2672 |
+
speed: str,
|
| 2673 |
+
mind_layer: bool,
|
| 2674 |
+
):
|
| 2675 |
+
case_id = CASE_OPTIONS.get(case_label, "socrates")
|
| 2676 |
+
try:
|
| 2677 |
+
custom_case = _custom_case_from_payload(custom_payload) if case_id == "custom" else None
|
| 2678 |
+
except ValueError as exc:
|
| 2679 |
+
yield (
|
| 2680 |
+
render_court([], pretrial_case=None, custom_mode=True),
|
| 2681 |
+
render_evidence([]),
|
| 2682 |
+
render_jurors([]),
|
| 2683 |
+
render_mind([], mind_layer),
|
| 2684 |
+
str(exc),
|
| 2685 |
+
)
|
| 2686 |
+
return
|
| 2687 |
request = TrialRequest(
|
| 2688 |
+
case_id=case_id,
|
| 2689 |
search_query=search_query or "",
|
| 2690 |
hypothetical=hypothetical or "",
|
| 2691 |
+
custom_case=custom_case,
|
| 2692 |
speed=speed or "swift",
|
| 2693 |
mind_layer=bool(mind_layer),
|
| 2694 |
)
|
| 2695 |
events: list[TrialEvent] = []
|
| 2696 |
+
produced_events = _start_event_producer(request)
|
| 2697 |
+
pending_speaker = _expected_next_speaker(events)
|
| 2698 |
yield (
|
| 2699 |
+
render_court(events, started=True, pending_speaker=pending_speaker),
|
| 2700 |
render_evidence(events),
|
| 2701 |
render_jurors(events),
|
| 2702 |
render_mind(events, mind_layer),
|
| 2703 |
+
_pending_status(pending_speaker),
|
| 2704 |
)
|
| 2705 |
try:
|
| 2706 |
+
while True:
|
| 2707 |
+
item = produced_events.get()
|
| 2708 |
+
if item is _EVENT_STREAM_DONE:
|
| 2709 |
+
break
|
| 2710 |
+
if isinstance(item, Exception):
|
| 2711 |
+
raise item
|
| 2712 |
+
event = item
|
| 2713 |
events.append(event)
|
|
|
|
| 2714 |
yield (
|
| 2715 |
render_court(events, started=True),
|
| 2716 |
render_evidence(events),
|
| 2717 |
render_jurors(events),
|
| 2718 |
render_mind(events, mind_layer),
|
| 2719 |
+
_event_status(event, len(events)),
|
| 2720 |
)
|
| 2721 |
+
duration = _reading_duration(_event_dialogue_text(event))
|
| 2722 |
+
if duration > 0:
|
| 2723 |
+
time.sleep(duration)
|
| 2724 |
+
pending_speaker = _expected_next_speaker(events)
|
| 2725 |
+
if pending_speaker is not None and produced_events.empty():
|
| 2726 |
+
yield (
|
| 2727 |
+
render_court(events, started=True, pending_speaker=pending_speaker),
|
| 2728 |
+
render_evidence(events),
|
| 2729 |
+
render_jurors(events),
|
| 2730 |
+
render_mind(events, mind_layer),
|
| 2731 |
+
_pending_status(pending_speaker),
|
| 2732 |
+
)
|
| 2733 |
except Exception as exc:
|
| 2734 |
yield (
|
| 2735 |
render_court(events, started=True),
|
|
|
|
| 2740 |
)
|
| 2741 |
return
|
| 2742 |
yield (
|
| 2743 |
+
render_court(events, started=True, show_verdict_popup=True),
|
| 2744 |
render_evidence(events),
|
| 2745 |
render_jurors(events),
|
| 2746 |
render_mind(events, mind_layer),
|
|
|
|
| 2761 |
)
|
| 2762 |
start = gr.Button("Begin Trial", variant="primary", scale=1)
|
| 2763 |
status = gr.Markdown("Ready.", elem_classes=["book-status"])
|
| 2764 |
+
courtroom = gr.HTML(render_case_preview("Trial of Socrates"), label="Live courtroom")
|
| 2765 |
search = gr.State("")
|
| 2766 |
+
hypo = gr.State("")
|
| 2767 |
+
custom_payload = gr.State("")
|
| 2768 |
speed = gr.State("swift")
|
| 2769 |
mind = gr.State(True)
|
| 2770 |
with gr.Row(elem_classes=["drawer-shell"]):
|
| 2771 |
with gr.Column(scale=1):
|
| 2772 |
with gr.Tab("Evidence Drawer"):
|
|
|
|
| 2774 |
with gr.Tab("Juror Panel"):
|
| 2775 |
jurors = gr.HTML(render_jurors([]))
|
| 2776 |
mind_html = gr.HTML(render_mind([], True), visible=False)
|
| 2777 |
+
case.change(
|
| 2778 |
+
render_case_preview,
|
| 2779 |
+
inputs=[case],
|
| 2780 |
+
outputs=[courtroom],
|
| 2781 |
+
)
|
| 2782 |
start.click(
|
| 2783 |
run_ui,
|
| 2784 |
+
inputs=[case, search, hypo, custom_payload, speed, mind],
|
| 2785 |
outputs=[courtroom, evidence, jurors, mind_html, status],
|
| 2786 |
js=START_JS,
|
| 2787 |
)
|
modal_app.py
CHANGED
|
@@ -3,7 +3,7 @@ import time
|
|
| 3 |
|
| 4 |
import modal
|
| 5 |
|
| 6 |
-
from sovereign_bench.engine import stream_trial_jsonl
|
| 7 |
from sovereign_bench.llm import (
|
| 8 |
ModelCall,
|
| 9 |
ModelResult,
|
|
@@ -12,10 +12,12 @@ from sovereign_bench.llm import (
|
|
| 12 |
)
|
| 13 |
from sovereign_bench.models import TrialRequest
|
| 14 |
|
| 15 |
-
|
|
|
|
| 16 |
GPU_NAME = "H100"
|
| 17 |
GPU_TIMEOUT_SECONDS = 20 * 60
|
| 18 |
HF_CACHE_DIR = "/root/.cache/huggingface"
|
|
|
|
| 19 |
|
| 20 |
image = (
|
| 21 |
modal.Image.debian_slim(python_version="3.12")
|
|
@@ -89,7 +91,8 @@ class VllmModel:
|
|
| 89 |
"role": "user",
|
| 90 |
"content": (
|
| 91 |
"Your previous response did not include visible courtroom dialogue. "
|
| 92 |
-
"Return only the final
|
|
|
|
| 93 |
),
|
| 94 |
}
|
| 95 |
]
|
|
@@ -115,6 +118,10 @@ class VllmModel:
|
|
| 115 |
"latency_ms": int((time.perf_counter() - started) * 1000),
|
| 116 |
}
|
| 117 |
|
| 118 |
|
| 119 |
def modal_gpu_enabled() -> bool:
|
| 120 |
return os.getenv("SOVEREIGN_DISABLE_MODAL_GPU", "").lower() not in {"1", "true", "yes"}
|
|
@@ -127,6 +134,9 @@ def modal_gpu_runner(**kwargs) -> ModelResult:
|
|
| 127 |
case_summary=kwargs["case_summary"],
|
| 128 |
task=kwargs["task"],
|
| 129 |
evidence_summary=kwargs["evidence_summary"],
|
| 130 |
)
|
| 131 |
requested_model = kwargs["model"]
|
| 132 |
prompt_hash = messages_hash(messages)
|
|
@@ -191,3 +201,12 @@ def trial_stream(payload: dict):
|
|
| 191 |
@app.local_entrypoint()
|
| 192 |
def main():
|
| 193 |
print(check_huggingface_connection.remote())
|
| 3 |
|
| 4 |
import modal
|
| 5 |
|
| 6 |
+
from sovereign_bench.engine import MODEL_BUDGET, stream_trial_jsonl
|
| 7 |
from sovereign_bench.llm import (
|
| 8 |
ModelCall,
|
| 9 |
ModelResult,
|
|
|
|
| 12 |
)
|
| 13 |
from sovereign_bench.models import TrialRequest
|
| 14 |
|
| 15 |
+
MODAL_APP_NAME = "sovereign-bench"
|
| 16 |
+
app = modal.App(MODAL_APP_NAME)
|
| 17 |
GPU_NAME = "H100"
|
| 18 |
GPU_TIMEOUT_SECONDS = 20 * 60
|
| 19 |
HF_CACHE_DIR = "/root/.cache/huggingface"
|
| 20 |
+
USED_MODEL_IDS = tuple(dict.fromkeys(model for _, model, _ in MODEL_BUDGET))
|
| 21 |
|
| 22 |
image = (
|
| 23 |
modal.Image.debian_slim(python_version="3.12")
|
|
|
|
| 91 |
"role": "user",
|
| 92 |
"content": (
|
| 93 |
"Your previous response did not include visible courtroom dialogue. "
|
| 94 |
+
"Return only the final answer now. Do not mention prompts, tasks, requirements, or that you are following instructions. "
|
| 95 |
+
"Do not include <think>, analysis, reasoning, markdown, narration, or notes. /no_think"
|
| 96 |
),
|
| 97 |
}
|
| 98 |
]
|
|
|
|
| 118 |
"latency_ms": int((time.perf_counter() - started) * 1000),
|
| 119 |
}
|
| 120 |
|
| 121 |
+
@modal.method()
|
| 122 |
+
def warm(self) -> dict:
|
| 123 |
+
return {"model": self.model_id, "status": "warm"}
|
| 124 |
+
|
| 125 |
|
| 126 |
def modal_gpu_enabled() -> bool:
|
| 127 |
return os.getenv("SOVEREIGN_DISABLE_MODAL_GPU", "").lower() not in {"1", "true", "yes"}
|
|
|
|
| 134 |
case_summary=kwargs["case_summary"],
|
| 135 |
task=kwargs["task"],
|
| 136 |
evidence_summary=kwargs["evidence_summary"],
|
| 137 |
+
trial_history=kwargs.get("trial_history", ""),
|
| 138 |
+
persona=kwargs.get("persona", ""),
|
| 139 |
+
objective=kwargs.get("objective", ""),
|
| 140 |
)
|
| 141 |
requested_model = kwargs["model"]
|
| 142 |
prompt_hash = messages_hash(messages)
|
|
|
|
| 201 |
@app.local_entrypoint()
|
| 202 |
def main():
|
| 203 |
print(check_huggingface_connection.remote())
|
| 204 |
+
|
| 205 |
+
|
| 206 |
+
@app.local_entrypoint()
|
| 207 |
+
def warm_models():
|
| 208 |
+
deployed_model = modal.Cls.from_name(MODAL_APP_NAME, "VllmModel")
|
| 209 |
+
for model_id in USED_MODEL_IDS:
|
| 210 |
+
model = deployed_model(model_id=model_id)
|
| 211 |
+
model.update_autoscaler(min_containers=1)
|
| 212 |
+
print(model.warm.remote())
|
sovereign_bench/cases.py
CHANGED
|
@@ -11,6 +11,11 @@ SOCRATES = CasePacket(
|
|
| 11 |
respondent="Socrates",
|
| 12 |
charge="Corrupting the youth and refusing the sanctioned gods of the city.",
|
| 13 |
setting="Athens, 399 BCE, reassembled inside a pocket tribunal.",
|
| 14 |
claimant_claim=(
|
| 15 |
"The city argues that Socrates trained young citizens to mock public authority "
|
| 16 |
"and placed private daimonion guidance above civic religion."
|
|
@@ -25,19 +30,7 @@ SOCRATES = CasePacket(
|
|
| 25 |
),
|
| 26 |
evidence=[
|
| 27 |
EvidenceItem(
|
| 28 |
-
id="SOC-
|
| 29 |
-
title="The Oracle Burden",
|
| 30 |
-
source="Plato, Apology tradition",
|
| 31 |
-
excerpt=(
|
| 32 |
-
"Socrates describes testing reputedly wise citizens after a Delphic oracle "
|
| 33 |
-
"report, creating public embarrassment but framing the act as duty."
|
| 34 |
-
),
|
| 35 |
-
supports="mixed",
|
| 36 |
-
reliability=0.78,
|
| 37 |
-
note="Shows both civic irritation and a claimed religious motivation.",
|
| 38 |
-
),
|
| 39 |
-
EvidenceItem(
|
| 40 |
-
id="SOC-E2",
|
| 41 |
title="Youthful Imitators",
|
| 42 |
source="Plato, Apology tradition",
|
| 43 |
excerpt=(
|
|
@@ -49,7 +42,31 @@ SOCRATES = CasePacket(
|
|
| 49 |
note="Supports social effect, but does not prove intentional corruption.",
|
| 50 |
),
|
| 51 |
EvidenceItem(
|
| 52 |
-
id="SOC-
|
| 53 |
title="No Fee, No School",
|
| 54 |
source="Ancient defense tradition",
|
| 55 |
excerpt=(
|
|
@@ -61,16 +78,127 @@ SOCRATES = CasePacket(
|
|
| 61 |
note="Weakens the claim that he operated a formal corrupting academy.",
|
| 62 |
),
|
| 63 |
EvidenceItem(
|
| 64 |
-
id="SOC-
|
| 65 |
-
title="
|
| 66 |
-
source="
|
| 67 |
excerpt=(
|
| 68 |
-
"Socrates
|
| 69 |
-
"
|
| 70 |
),
|
| 71 |
-
supports="
|
| 72 |
-
reliability=0.
|
| 73 |
-
note="
|
| 74 |
),
|
| 75 |
],
|
| 76 |
)
|
|
@@ -84,6 +212,11 @@ BARNABY = CasePacket(
|
|
| 84 |
respondent="Barnaby Buttons",
|
| 85 |
charge="Theft of the final mooncake and alteration of the communal snack ledger.",
|
| 86 |
setting="A fluorescent office kitchen at 4:47 p.m., under the humming republic of the fridge.",
|
| 87 |
claimant_claim=(
|
| 88 |
"Barnaby removed the final mooncake, changed the snack ledger from '1 mooncake' "
|
| 89 |
"to '0 mooncakes', and left the team dessertless."
|
|
@@ -92,7 +225,7 @@ BARNABY = CasePacket(
|
|
| 92 |
"Barnaby says the mooncake was already abandoned, the ledger pen skipped naturally, "
|
| 93 |
"and the crumbs came from an unrelated biscuit."
|
| 94 |
),
|
| 95 |
-
source_note="Cached original whimsical packet
|
| 96 |
evidence=[
|
| 97 |
EvidenceItem(
|
| 98 |
id="BTN-E1",
|
|
@@ -134,7 +267,7 @@ BARNABY = CasePacket(
|
|
| 134 |
)
|
| 135 |
|
| 136 |
|
| 137 |
-
CASES = {case.id: case for case in (SOCRATES, BARNABY)}
|
| 138 |
|
| 139 |
|
| 140 |
def get_case(case_id: str) -> CasePacket:
|
|
|
|
| 11 |
respondent="Socrates",
|
| 12 |
charge="Corrupting the youth and refusing the sanctioned gods of the city.",
|
| 13 |
setting="Athens, 399 BCE, reassembled inside a pocket tribunal.",
|
| 14 |
+
context=(
|
| 15 |
+
"Athens has brought Socrates back before a civic court after years of public questioning, "
|
| 16 |
+
"youthful imitators, and anxiety about private religious claims. The city says his method "
|
| 17 |
+
"weakened civic order; Socrates says he served the public by exposing false wisdom."
|
| 18 |
+
),
|
| 19 |
claimant_claim=(
|
| 20 |
"The city argues that Socrates trained young citizens to mock public authority "
|
| 21 |
"and placed private daimonion guidance above civic religion."
|
|
|
|
| 30 |
),
|
| 31 |
evidence=[
|
| 32 |
EvidenceItem(
|
| 33 |
+
id="SOC-F1",
|
| 34 |
title="Youthful Imitators",
|
| 35 |
source="Plato, Apology tradition",
|
| 36 |
excerpt=(
|
|
|
|
| 42 |
note="Supports social effect, but does not prove intentional corruption.",
|
| 43 |
),
|
| 44 |
EvidenceItem(
|
| 45 |
+
id="SOC-F2",
|
| 46 |
+
title="Public Embarrassment",
|
| 47 |
+
source="Ancient defense tradition",
|
| 48 |
+
excerpt=(
|
| 49 |
+
"Socrates describes testing reputedly wise citizens in public after hearing the "
|
| 50 |
+
"Delphic oracle report."
|
| 51 |
+
),
|
| 52 |
+
supports="claimant",
|
| 53 |
+
reliability=0.74,
|
| 54 |
+
note="Shows a repeated practice that made civic leaders look foolish.",
|
| 55 |
+
),
|
| 56 |
+
EvidenceItem(
|
| 57 |
+
id="SOC-F3",
|
| 58 |
+
title="The Daimonion Suspicion",
|
| 59 |
+
source="Ancient biographical tradition",
|
| 60 |
+
excerpt=(
|
| 61 |
+
"Socrates reports a private divine sign that restrains him from certain actions, "
|
| 62 |
+
"which civic accusers read as religious irregularity."
|
| 63 |
+
),
|
| 64 |
+
supports="claimant",
|
| 65 |
+
reliability=0.64,
|
| 66 |
+
note="Supports the impiety theory if private revelation is treated as civic defiance.",
|
| 67 |
+
),
|
| 68 |
+
EvidenceItem(
|
| 69 |
+
id="SOC-A1",
|
| 70 |
title="No Fee, No School",
|
| 71 |
source="Ancient defense tradition",
|
| 72 |
excerpt=(
|
|
|
|
| 78 |
note="Weakens the claim that he operated a formal corrupting academy.",
|
| 79 |
),
|
| 80 |
EvidenceItem(
|
| 81 |
+
id="SOC-A2",
|
| 82 |
+
title="Oracle as Duty",
|
| 83 |
+
source="Plato, Apology tradition",
|
| 84 |
excerpt=(
|
| 85 |
+
"Socrates frames his questioning as obedience to a divine puzzle rather than "
|
| 86 |
+
"contempt for religion."
|
| 87 |
),
|
| 88 |
+
supports="respondent",
|
| 89 |
+
reliability=0.78,
|
| 90 |
+
note="Turns the impiety charge into a competing account of piety.",
|
| 91 |
+
),
|
| 92 |
+
EvidenceItem(
|
| 93 |
+
id="SOC-A3",
|
| 94 |
+
title="Cross-Examination as Service",
|
| 95 |
+
source="Defense summary",
|
| 96 |
+
excerpt=(
|
| 97 |
+
"The defense treats uncomfortable questioning as civic improvement, not sabotage "
|
| 98 |
+
"or intentional corruption."
|
| 99 |
+
),
|
| 100 |
+
supports="respondent",
|
| 101 |
+
reliability=0.7,
|
| 102 |
+
note="Gives the jury a public-interest reason to tolerate Socrates.",
|
| 103 |
+
),
|
| 104 |
+
],
|
| 105 |
+
)
|
| 106 |
+
|
| 107 |
+
|
| 108 |
+
GREG = CasePacket(
|
| 109 |
+
id="greg",
|
| 110 |
+
title="Greg Heffley v. Mom",
|
| 111 |
+
subtitle="A family-court argument over a diary, embarrassment, and parental good intentions.",
|
| 112 |
+
claimant="Greg Heffley",
|
| 113 |
+
respondent="Susan Heffley",
|
| 114 |
+
charge="Whether Mom wrongfully saddled Greg with an embarrassing diary instead of a normal journal.",
|
| 115 |
+
setting="The Heffley house on the eve of another middle-school year.",
|
| 116 |
+
context=(
|
| 117 |
+
"Greg receives a book from his mom meant to help him record his thoughts, but he objects "
|
| 118 |
+
"that the word diary makes him look childish and vulnerable at school. Mom treats the book "
|
| 119 |
+
"as a harmless tool for reflection; Greg treats it as social evidence waiting to be used "
|
| 120 |
+
"against him."
|
| 121 |
+
),
|
| 122 |
+
claimant_claim=(
|
| 123 |
+
"Greg argues that Mom ignored the obvious social risk of handing a middle-school boy a diary "
|
| 124 |
+
"and failed to respect how easily classmates can turn an object into humiliation."
|
| 125 |
+
),
|
| 126 |
+
respondent_claim=(
|
| 127 |
+
"Mom answers that the writing book is a constructive outlet, that Greg can choose how to use it, "
|
| 128 |
+
"and that parental encouragement is not social sabotage."
|
| 129 |
+
),
|
| 130 |
+
source_note=(
|
| 131 |
+
"Cached demo packet using paraphrased context from the Diary of a Wimpy Kid setup. "
|
| 132 |
+
"No book text is quoted."
|
| 133 |
+
),
|
| 134 |
+
evidence=[
|
| 135 |
+
EvidenceItem(
|
| 136 |
+
id="GRG-F1",
|
| 137 |
+
title="The Label Problem",
|
| 138 |
+
source="Greg's objection",
|
| 139 |
+
excerpt=(
|
| 140 |
+
"Greg objects that diary is the wrong label for a middle-school boy and could be "
|
| 141 |
+
"used to mock him."
|
| 142 |
+
),
|
| 143 |
+
supports="claimant",
|
| 144 |
+
reliability=0.74,
|
| 145 |
+
note="Shows a foreseeable embarrassment risk from Greg's perspective.",
|
| 146 |
+
),
|
| 147 |
+
EvidenceItem(
|
| 148 |
+
id="GRG-F2",
|
| 149 |
+
title="Middle-School Audience",
|
| 150 |
+
source="School context",
|
| 151 |
+
excerpt=(
|
| 152 |
+
"Greg's social world rewards status and punishes anything classmates can frame "
|
| 153 |
+
"as childish."
|
| 154 |
+
),
|
| 155 |
+
supports="claimant",
|
| 156 |
+
reliability=0.7,
|
| 157 |
+
note="Makes the harm plausible even before anyone finds the book.",
|
| 158 |
+
),
|
| 159 |
+
EvidenceItem(
|
| 160 |
+
id="GRG-F3",
|
| 161 |
+
title="Ignored Preference",
|
| 162 |
+
source="Family exchange summary",
|
| 163 |
+
excerpt=(
|
| 164 |
+
"Greg wanted distance from the diary framing, but Mom treated the gift as settled."
|
| 165 |
+
),
|
| 166 |
+
supports="claimant",
|
| 167 |
+
reliability=0.66,
|
| 168 |
+
note="Supports Greg's autonomy argument, though parents often choose school supplies.",
|
| 169 |
+
),
|
| 170 |
+
EvidenceItem(
|
| 171 |
+
id="GRG-A1",
|
| 172 |
+
title="Private Writing Tool",
|
| 173 |
+
source="Mom's purpose",
|
| 174 |
+
excerpt=(
|
| 175 |
+
"Mom intended the book as a private place for Greg to record his thoughts and school year."
|
| 176 |
+
),
|
| 177 |
+
supports="respondent",
|
| 178 |
+
reliability=0.78,
|
| 179 |
+
note="Shows a constructive parental purpose rather than intent to embarrass.",
|
| 180 |
+
),
|
| 181 |
+
EvidenceItem(
|
| 182 |
+
id="GRG-A2",
|
| 183 |
+
title="Greg Controls Disclosure",
|
| 184 |
+
source="Household facts",
|
| 185 |
+
excerpt=(
|
| 186 |
+
"The book is not inherently public; Greg can keep it private and decide what to write."
|
| 187 |
+
),
|
| 188 |
+
supports="respondent",
|
| 189 |
+
reliability=0.68,
|
| 190 |
+
note="Weakens the claim that the gift itself creates inevitable harm.",
|
| 191 |
+
),
|
| 192 |
+
EvidenceItem(
|
| 193 |
+
id="GRG-A3",
|
| 194 |
+
title="Reflection Has Value",
|
| 195 |
+
source="Parenting rationale",
|
| 196 |
+
excerpt=(
|
| 197 |
+
"A journal can help a student process school, family, and growing-up pressures."
|
| 198 |
+
),
|
| 199 |
+
supports="respondent",
|
| 200 |
+
reliability=0.71,
|
| 201 |
+
note="Gives Mom a reasonable-benefit argument even if the branding is awkward.",
|
| 202 |
),
|
| 203 |
],
|
| 204 |
)
|
|
|
|
| 212 |
respondent="Barnaby Buttons",
|
| 213 |
charge="Theft of the final mooncake and alteration of the communal snack ledger.",
|
| 214 |
setting="A fluorescent office kitchen at 4:47 p.m., under the humming republic of the fridge.",
|
| 215 |
+
context=(
|
| 216 |
+
"An office breakroom has lost its final mooncake after a suspicious ledger update and "
|
| 217 |
+
"a trail of crumbs. The commonwealth blames Barnaby Buttons; Barnaby says the evidence "
|
| 218 |
+
"is ordinary office mess and coincidence."
|
| 219 |
+
),
|
| 220 |
claimant_claim=(
|
| 221 |
"Barnaby removed the final mooncake, changed the snack ledger from '1 mooncake' "
|
| 222 |
"to '0 mooncakes', and left the team dessertless."
|
|
|
|
| 225 |
"Barnaby says the mooncake was already abandoned, the ledger pen skipped naturally, "
|
| 226 |
"and the crumbs came from an unrelated biscuit."
|
| 227 |
),
|
| 228 |
+
source_note="Cached original whimsical packet kept for compatibility with older tests.",
|
| 229 |
evidence=[
|
| 230 |
EvidenceItem(
|
| 231 |
id="BTN-E1",
|
|
|
|
| 267 |
)
|
| 268 |
|
| 269 |
|
| 270 |
+
CASES = {case.id: case for case in (SOCRATES, GREG, BARNABY)}
|
| 271 |
|
| 272 |
|
| 273 |
def get_case(case_id: str) -> CasePacket:
|
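The `CASES` comprehension above keys each packet by its own `id`, so adding a packet to the tuple is enough to make it selectable. A minimal sketch of that registry pattern follows. It assumes a plain dictionary lookup: the body of `get_case` is not shown in this diff, `DemoPacket` is a hypothetical two-field stand-in for `CasePacket`, and the `OTHER` packet is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoPacket:
    """Hypothetical stand-in for CasePacket; the real model has many more fields."""

    id: str
    title: str


GREG = DemoPacket(id="greg", title="Greg Heffley v. Mom")
OTHER = DemoPacket(id="other", title="Placeholder v. Placeholder")  # invented for the sketch

# Same shape as the diff: the registry is derived from the packets themselves,
# so an id can never drift out of sync with its dictionary key.
CASES = {case.id: case for case in (GREG, OTHER)}


def get_case(case_id: str) -> DemoPacket:
    # Assumption: unknown ids raise. The real function may fall back to a default.
    try:
        return CASES[case_id]
    except KeyError:
        raise KeyError(f"unknown case id: {case_id!r}") from None
```

Deriving the keys from `case.id` avoids maintaining a second list of identifiers alongside the packets.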
sovereign_bench/engine.py
CHANGED
|
@@ -9,7 +9,7 @@ from collections.abc import Callable, Iterable
|
|
| 9 |
from pydantic import ValidationError
|
| 10 |
|
| 11 |
from .cases import get_case
|
| 12 |
-
from .llm import ModelCall, ModelResult, call_small_model
|
| 13 |
from .models import AgentTurn, CasePacket, JurorVote, TrialEvent, TrialRequest, Verdict
|
| 14 |
from .retrieval import build_live_case
|
| 15 |
|
|
@@ -20,11 +20,11 @@ OPENAI_PROVIDER = "auto"
|
|
| 20 |
OPENBMB_PROVIDER = "featherless-ai"
|
| 21 |
NEMOTRON_PROVIDER = "featherless-ai"
|
| 22 |
|
| 23 |
-
MODEL_BUDGET = [
|
| 24 |
-
("Presiding Advocate", GPT_OSS_MODEL, 20.0),
|
| 25 |
-
("Clerk of Style", OPENBMB_MODEL, 4.0),
|
| 26 |
-
("
|
| 27 |
-
]
|
| 28 |
TOTAL_PARAMS_B = sum(item[2] for item in MODEL_BUDGET)
|
| 29 |
|
| 30 |
JUDGE_NAME = "Marcus Aurelius"
|
|
@@ -59,12 +59,14 @@ def _turn(agent: str, role: str, result: ModelResult, model: str, confidence: fl
|
|
| 59 |
)
|
| 60 |
|
| 61 |
|
| 62 |
-
def _case_summary(packet: CasePacket) -> str:
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
f"
|
| 66 |
-
f"
|
| 67 |
-
|
| 68 |
|
| 69 |
|
| 70 |
def _evidence_summary(packet: CasePacket) -> str:
|
|
@@ -78,8 +80,12 @@ def _call_trace(calls: list[ModelCall]) -> list[dict]:
|
|
| 78 |
return [call.__dict__ for call in calls]
|
| 79 |
|
| 80 |
|
| 81 |
-
def resolve_case(request: TrialRequest) -> tuple[CasePacket, dict]:
|
| 82 |
-
if request.case_id == "
|
| 83 |
packet = build_live_case(request.search_query, request.hypothetical)
|
| 84 |
if packet:
|
| 85 |
return packet, {"mode": "live"}
|
|
@@ -99,12 +105,16 @@ def _required_role(model_runner: ModelRunner | None, model_calls: list[ModelCall
|
|
| 99 |
except Exception as exc:
|
| 100 |
raise RequiredModelError(f"{kwargs.get('agent', 'Model')} unavailable: {exc}") from exc
|
| 101 |
model_calls.append(result.call)
|
| 102 |
-
if not result.call.ok:
|
| 103 |
-
error = result.call.error or "model call did not complete"
|
| 104 |
-
raise RequiredModelError(f"{kwargs.get('agent', 'Model')} unavailable: {error}")
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
|
| 108 |
|
| 109 |
|
| 110 |
def _trace(packet: CasePacket, source_trace: dict, model_calls: list[ModelCall]) -> dict:
|
|
@@ -119,7 +129,7 @@ def _trace(packet: CasePacket, source_trace: dict, model_calls: list[ModelCall])
|
|
| 119 |
}
|
| 120 |
|
| 121 |
|
| 122 |
-
def _emit(
|
| 123 |
packet: CasePacket,
|
| 124 |
source_trace: dict,
|
| 125 |
model_calls: list[ModelCall],
|
|
@@ -129,10 +139,47 @@ def _emit(
|
|
| 129 |
event.trace = _trace(packet, source_trace, model_calls)
|
| 130 |
if delay > 0:
|
| 131 |
time.sleep(delay)
|
| 132 |
-
return event
|
| 133 |
-
|
| 134 |
-
|
| 135 |
-
def
| 136 |
stripped = text.strip()
|
| 137 |
if stripped.startswith("```"):
|
| 138 |
stripped = re.sub(r"^```(?:json)?\s*", "", stripped, flags=re.I)
|
|
@@ -146,41 +193,37 @@ def _extract_json(text: str) -> object:
|
|
| 146 |
return json.loads(match.group(1))
|
| 147 |
|
| 148 |
|
| 149 |
-
def
|
| 150 |
-
try:
|
| 151 |
-
data = _extract_json(result.text)
|
| 152 |
-
except json.JSONDecodeError as exc:
|
| 153 |
-
raise RequiredModelError(f"
|
| 154 |
-
|
| 155 |
-
|
| 156 |
-
|
| 157 |
-
|
| 158 |
-
if
|
| 159 |
-
raise RequiredModelError("
|
| 160 |
-
|
| 161 |
-
|
| 162 |
-
|
| 163 |
-
|
| 164 |
-
|
| 165 |
-
|
| 166 |
-
|
| 167 |
-
|
| 168 |
-
|
| 169 |
-
|
| 170 |
-
if
|
| 171 |
-
raise RequiredModelError("
|
| 172 |
-
|
| 173 |
-
|
| 174 |
-
|
| 175 |
-
|
| 176 |
-
|
| 177 |
-
|
| 178 |
-
|
| 179 |
-
|
| 180 |
-
return votes
|
| 181 |
-
|
| 182 |
-
|
| 183 |
-
def _majority_finding(votes: list[JurorVote]) -> str:
|
| 184 |
counts = Counter(vote.vote for vote in votes)
|
| 185 |
top = counts.most_common()
|
| 186 |
if not top:
|
|
@@ -227,15 +270,12 @@ def _verdict_from_votes(votes: list[JurorVote]) -> Verdict:
|
|
| 227 |
)
|
| 228 |
|
| 229 |
|
| 230 |
-
def
|
| 231 |
-
personas = "\n".join(f"- {name}: {persona}" for name, persona in JUROR_PERSONAS.items())
|
| 232 |
return (
|
| 233 |
-
"
|
| 234 |
-
|
| 235 |
-
"
|
| 236 |
-
"
|
| 237 |
-
"Vote through the named public-history worldview, not a generic juror role.\n"
|
| 238 |
-
f"{personas}"
|
| 239 |
)
|
| 240 |
|
| 241 |
|
|
@@ -249,10 +289,11 @@ def stream_trial(
|
|
| 249 |
model_runner: ModelRunner | None = None,
|
| 250 |
) -> Iterable[TrialEvent]:
|
| 251 |
packet, source_trace = resolve_case(request)
|
| 252 |
-
case_summary = _case_summary(packet)
|
| 253 |
-
evidence_summary = _evidence_summary(packet)
|
| 254 |
-
model_calls: list[ModelCall] = []
|
| 255 |
-
|
|
|
|
| 256 |
hypo_line = f"\n\nUser hypothetical admitted as a blue-ribbon sidebar: {hypo}" if hypo else ""
|
| 257 |
|
| 258 |
clerk = _required_role(
|
|
@@ -263,14 +304,15 @@ def stream_trial(
|
|
| 263 |
model=OPENBMB_MODEL,
|
| 264 |
case_summary=case_summary,
|
| 265 |
evidence_summary=evidence_summary,
|
| 266 |
-
task="Announce the case by name, identify the parties, and read the charge.",
|
| 267 |
provider=OPENBMB_PROVIDER,
|
| 268 |
max_tokens=110,
|
| 269 |
)
|
| 270 |
-
yield
|
| 271 |
-
|
| 272 |
-
|
| 273 |
-
|
|
|
|
| 274 |
TrialEvent(
|
| 275 |
phase="intake",
|
| 276 |
title="The Court Convenes",
|
|
@@ -289,17 +331,21 @@ def stream_trial(
|
|
| 289 |
model=GPT_OSS_MODEL,
|
| 290 |
case_summary=case_summary,
|
| 291 |
evidence_summary=evidence_summary,
|
|
| 292 |
task=(
|
| 293 |
f"As {JUDGE_NAME}, a Stoic courtroom judge guided by {JUDGE_PERSONA}, explain the proceeding "
|
| 294 |
-
"and the burden of proof in one or two disciplined sentences."
|
| 295 |
),
|
| 296 |
provider=OPENAI_PROVIDER,
|
| 297 |
max_tokens=110,
|
| 298 |
)
|
| 299 |
-
yield
|
| 300 |
-
|
| 301 |
-
|
| 302 |
-
|
|
|
|
| 303 |
TrialEvent(
|
| 304 |
phase="intake",
|
| 305 |
title="The Burden Is Set",
|
|
@@ -313,24 +359,27 @@ def stream_trial(
|
|
| 313 |
claimant_opening = _required_role(
|
| 314 |
model_runner,
|
| 315 |
model_calls,
|
| 316 |
-
agent="
|
| 317 |
role="claimant advocate",
|
| 318 |
model=GPT_OSS_MODEL,
|
| 319 |
-
case_summary=case_summary,
|
| 320 |
-
evidence_summary=evidence_summary,
|
| 321 |
-
|
| 322 |
provider=OPENAI_PROVIDER,
|
| 323 |
max_tokens=130,
|
| 324 |
)
|
| 325 |
-
yield
|
| 326 |
-
|
| 327 |
-
|
| 328 |
-
|
|
|
|
| 329 |
TrialEvent(
|
| 330 |
phase="claims",
|
| 331 |
title="Claimant Opening",
|
| 332 |
body=packet.claimant_claim,
|
| 333 |
-
turns=[_turn("
|
| 334 |
evidence=packet.evidence,
|
| 335 |
),
|
| 336 |
delay,
|
|
@@ -339,53 +388,45 @@ def stream_trial(
|
|
| 339 |
respondent_opening = _required_role(
|
| 340 |
model_runner,
|
| 341 |
model_calls,
|
| 342 |
-
agent="
|
| 343 |
role="respondent advocate",
|
| 344 |
model=GPT_OSS_MODEL,
|
| 345 |
-
case_summary=case_summary,
|
| 346 |
-
evidence_summary=evidence_summary,
|
| 347 |
-
|
| 348 |
provider=OPENAI_PROVIDER,
|
| 349 |
max_tokens=130,
|
| 350 |
)
|
| 351 |
-
yield
|
| 352 |
-
|
| 353 |
-
|
| 354 |
-
|
|
|
|
| 355 |
TrialEvent(
|
| 356 |
phase="opening",
|
| 357 |
title="Respondent Opening",
|
| 358 |
body=packet.respondent_claim,
|
| 359 |
-
turns=[_turn("
|
| 360 |
evidence=packet.evidence,
|
| 361 |
),
|
| 362 |
delay,
|
| 363 |
)
|
| 364 |
|
| 365 |
-
|
| 366 |
-
|
| 367 |
-
|
| 368 |
-
|
| 369 |
-
|
| 370 |
-
|
| 371 |
-
|
| 372 |
-
|
| 373 |
-
|
| 374 |
-
|
| 375 |
-
|
| 376 |
-
|
| 377 |
-
|
| 378 |
-
packet,
|
| 379 |
-
source_trace,
|
| 380 |
-
model_calls,
|
| 381 |
-
TrialEvent(
|
| 382 |
-
phase="evidence",
|
| 383 |
-
title="The Record Is Audited",
|
| 384 |
-
body="\n".join(f"{item.id}: {item.title} | reliability {item.reliability:.2f} | {item.note}" for item in packet.evidence),
|
| 385 |
-
turns=[_turn("Auditor Prism", "evidence auditor", auditor, NEMOTRON_MODEL, 0.86)],
|
| 386 |
-
evidence=packet.evidence,
|
| 387 |
-
),
|
| 388 |
-
delay,
|
| 389 |
)
|
| 390 |
|
| 391 |
judge_question = _required_role(
|
|
@@ -396,17 +437,21 @@ def stream_trial(
|
|
| 396 |
model=GPT_OSS_MODEL,
|
| 397 |
case_summary=case_summary,
|
| 398 |
evidence_summary=evidence_summary,
|
| 399 |
task=(
|
| 400 |
f"As {JUDGE_NAME}, ask one sharp hinge question that would change the outcome if answered. "
|
| 401 |
-
"Use Stoic restraint and public reason."
|
| 402 |
),
|
| 403 |
provider=OPENAI_PROVIDER,
|
| 404 |
max_tokens=100,
|
| 405 |
)
|
| 406 |
-
yield
|
| 407 |
-
|
| 408 |
-
|
| 409 |
-
|
|
|
|
| 410 |
TrialEvent(
|
| 411 |
phase="questions",
|
| 412 |
title="The Hinge Question",
|
|
@@ -420,24 +465,27 @@ def stream_trial(
|
|
| 420 |
claimant_answer = _required_role(
|
| 421 |
model_runner,
|
| 422 |
model_calls,
|
| 423 |
-
agent="
|
| 424 |
role="claimant advocate",
|
| 425 |
model=GPT_OSS_MODEL,
|
| 426 |
case_summary=case_summary,
|
| 427 |
evidence_summary=evidence_summary,
|
| 428 |
-
|
| 429 |
provider=OPENAI_PROVIDER,
|
| 430 |
max_tokens=130,
|
| 431 |
)
|
| 432 |
-
yield
|
| 433 |
-
|
| 434 |
-
|
| 435 |
-
|
|
|
|
| 436 |
TrialEvent(
|
| 437 |
phase="questions",
|
| 438 |
title="Claimant Answers the Bench",
|
| 439 |
body="The claimant answers the hinge question.",
|
| 440 |
-
turns=[_turn("
|
| 441 |
evidence=packet.evidence,
|
| 442 |
),
|
| 443 |
delay,
|
|
@@ -446,24 +494,27 @@ def stream_trial(
|
|
| 446 |
respondent_answer = _required_role(
|
| 447 |
model_runner,
|
| 448 |
model_calls,
|
| 449 |
-
agent="
|
| 450 |
role="respondent advocate",
|
| 451 |
model=GPT_OSS_MODEL,
|
| 452 |
case_summary=case_summary,
|
| 453 |
evidence_summary=evidence_summary,
|
| 454 |
-
|
| 455 |
provider=OPENAI_PROVIDER,
|
| 456 |
max_tokens=130,
|
| 457 |
)
|
| 458 |
-
yield
|
| 459 |
-
|
| 460 |
-
|
| 461 |
-
|
|
|
|
| 462 |
TrialEvent(
|
| 463 |
phase="questions",
|
| 464 |
title="Respondent Answers the Bench",
|
| 465 |
body="The respondent answers the hinge question.",
|
| 466 |
-
turns=[_turn("
|
| 467 |
evidence=packet.evidence,
|
| 468 |
),
|
| 469 |
delay,
|
|
@@ -474,17 +525,20 @@ def stream_trial(
|
|
| 474 |
model_calls,
|
| 475 |
agent="Nemotron Jury",
|
| 476 |
role="juror panel",
|
| 477 |
-
model=NEMOTRON_MODEL,
|
| 478 |
-
case_summary=case_summary,
|
| 479 |
-
evidence_summary=evidence_summary,
|
| 480 |
-
|
| 481 |
provider=NEMOTRON_PROVIDER,
|
| 482 |
max_tokens=100,
|
| 483 |
)
|
| 484 |
-
yield
|
| 485 |
-
|
| 486 |
-
|
| 487 |
-
|
|
|
|
| 488 |
TrialEvent(
|
| 489 |
phase="deliberation",
|
| 490 |
title="The Jury Retires",
|
|
@@ -495,29 +549,35 @@ def stream_trial(
|
|
| 495 |
delay,
|
| 496 |
)
|
| 497 |
|
| 498 |
-
|
| 499 |
-
|
| 500 |
-
|
| 501 |
-
|
| 502 |
-
|
| 503 |
-
|
| 504 |
-
|
| 505 |
-
|
| 506 |
-
|
| 507 |
-
|
| 508 |
-
|
| 509 |
-
|
| 510 |
-
|
| 511 |
-
|
| 512 |
-
|
| 513 |
-
|
| 514 |
-
|
| 515 |
-
|
| 516 |
-
)
|
| 517 |
-
|
| 518 |
-
|
| 519 |
-
|
| 520 |
-
|
| 521 |
TrialEvent(
|
| 522 |
phase="deliberation",
|
| 523 |
title=f"Juror {vote.juror} Votes",
|
|
@@ -538,18 +598,22 @@ def stream_trial(
|
|
| 538 |
model=GPT_OSS_MODEL,
|
| 539 |
case_summary=case_summary,
|
| 540 |
evidence_summary=evidence_summary,
|
| 541 |
task=(
|
| 542 |
f"As {JUDGE_NAME}, announce the final legal finding after the jury vote with Stoic restraint. "
|
| 543 |
f"Finding: {verdict.finding}. "
|
| 544 |
-
f"Jury rationale: {verdict.rationale} Remedy: {verdict.remedy}.
|
| 545 |
),
|
| 546 |
provider=OPENAI_PROVIDER,
|
| 547 |
max_tokens=160,
|
| 548 |
)
|
| 549 |
-
yield
|
| 550 |
-
|
| 551 |
-
|
| 552 |
-
|
|
|
|
| 553 |
TrialEvent(
|
| 554 |
phase="verdict",
|
| 555 |
title="The Court Announces Judgment",
|
|
|
|
| 9 |
from pydantic import ValidationError
|
| 10 |
|
| 11 |
from .cases import get_case
|
| 12 |
+
from .llm import ModelCall, ModelCallError, ModelResult, call_small_model, clean_model_text
|
| 13 |
from .models import AgentTurn, CasePacket, JurorVote, TrialEvent, TrialRequest, Verdict
|
| 14 |
from .retrieval import build_live_case
|
| 15 |
|
|
|
|
| 20 |
OPENBMB_PROVIDER = "featherless-ai"
|
| 21 |
NEMOTRON_PROVIDER = "featherless-ai"
|
| 22 |
|
| 23 |
+
MODEL_BUDGET = [
|
| 24 |
+
("Presiding Advocate", GPT_OSS_MODEL, 20.0),
|
| 25 |
+
("Clerk of Style", OPENBMB_MODEL, 4.0),
|
| 26 |
+
("Jury Ring", NEMOTRON_MODEL, 8.0),
|
| 27 |
+
]
|
| 28 |
TOTAL_PARAMS_B = sum(item[2] for item in MODEL_BUDGET)
|
| 29 |
|
| 30 |
JUDGE_NAME = "Marcus Aurelius"
|
|
|
|
| 59 |
)
|
| 60 |
|
| 61 |
|
| 62 |
+
def _case_summary(packet: CasePacket) -> str:
|
| 63 |
+
context = packet.context or packet.setting
|
| 64 |
+
return (
|
| 65 |
+
f"{packet.title}. Charge: {packet.charge}\n"
|
| 66 |
+
f"Context: {context}\n"
|
| 67 |
+
f"Claimant: {packet.claimant_claim}\n"
|
| 68 |
+
f"Respondent: {packet.respondent_claim}"
|
| 69 |
+
)
|
| 70 |
|
| 71 |
|
| 72 |
def _evidence_summary(packet: CasePacket) -> str:
|
|
|
|
| 80 |
return [call.__dict__ for call in calls]
|
| 81 |
|
| 82 |
|
| 83 |
+
def resolve_case(request: TrialRequest) -> tuple[CasePacket, dict]:
|
| 84 |
+
if request.case_id == "custom":
|
| 85 |
+
if request.custom_case is None:
|
| 86 |
+
raise RuntimeError("Custom case requires trial details and evidence before the court can begin.")
|
| 87 |
+
return request.custom_case, {"mode": "custom"}
|
| 88 |
+
if request.case_id == "live":
|
| 89 |
packet = build_live_case(request.search_query, request.hypothetical)
|
| 90 |
if packet:
|
| 91 |
return packet, {"mode": "live"}
|
|
|
|
| 105 |
except Exception as exc:
|
| 106 |
raise RequiredModelError(f"{kwargs.get('agent', 'Model')} unavailable: {exc}") from exc
|
| 107 |
model_calls.append(result.call)
|
| 108 |
+
if not result.call.ok:
|
| 109 |
+
error = result.call.error or "model call did not complete"
|
| 110 |
+
raise RequiredModelError(f"{kwargs.get('agent', 'Model')} unavailable: {error}")
|
| 111 |
+
try:
|
| 112 |
+
result.text = clean_model_text(result.text)
|
| 113 |
+
except ModelCallError as exc:
|
| 114 |
+
raise RequiredModelError(f"{kwargs.get('agent', 'Model')} returned non-dialogue output: {exc}") from exc
|
| 115 |
+
if not result.text.strip():
|
| 116 |
+
raise RequiredModelError(f"{kwargs.get('agent', 'Model')} returned an empty response.")
|
| 117 |
+
return result
|
| 118 |
|
| 119 |
|
| 120 |
def _trace(packet: CasePacket, source_trace: dict, model_calls: list[ModelCall]) -> dict:
|
|
|
|
| 129 |
}
|
| 130 |
|
| 131 |
|
| 132 |
+
def _emit(
|
| 133 |
packet: CasePacket,
|
| 134 |
source_trace: dict,
|
| 135 |
model_calls: list[ModelCall],
|
|
|
|
| 139 |
event.trace = _trace(packet, source_trace, model_calls)
|
| 140 |
if delay > 0:
|
| 141 |
time.sleep(delay)
|
| 142 |
+
return event
|
| 143 |
+
|
| 144 |
+
|
| 145 |
+
def _record_and_emit(
|
| 146 |
+
events: list[TrialEvent],
|
| 147 |
+
packet: CasePacket,
|
| 148 |
+
source_trace: dict,
|
| 149 |
+
model_calls: list[ModelCall],
|
| 150 |
+
event: TrialEvent,
|
| 151 |
+
delay: float,
|
| 152 |
+
) -> TrialEvent:
|
| 153 |
+
emitted = _emit(packet, source_trace, model_calls, event, delay)
|
| 154 |
+
events.append(emitted)
|
| 155 |
+
return emitted
|
| 156 |
+
|
| 157 |
+
|
| 158 |
+
def _compact(value: str, limit: int = 420) -> str:
|
| 159 |
+
text = " ".join(value.split())
|
| 160 |
+
return text if len(text) <= limit else text[: limit - 3].rstrip() + "..."
|
| 161 |
+
|
| 162 |
+
|
| 163 |
+
def _trial_history(events: list[TrialEvent]) -> str:
|
| 164 |
+
if not events:
|
| 165 |
+
return "No trial statements have been made yet."
|
| 166 |
+
lines = []
|
| 167 |
+
for index, event in enumerate(events, start=1):
|
| 168 |
+
if event.turns:
|
| 169 |
+
turn = event.turns[0]
|
| 170 |
+
lines.append(
|
| 171 |
+
f"{index}. {event.phase} / {event.title} - {turn.agent} ({turn.role}): {_compact(turn.content)}"
|
| 172 |
+
)
|
| 173 |
+
elif event.body:
|
| 174 |
+
lines.append(f"{index}. {event.phase} / {event.title}: {_compact(event.body)}")
|
| 175 |
+
for vote in event.votes:
|
| 176 |
+
lines.append(
|
| 177 |
+
f" Vote - {vote.juror}: {vote.vote}; reason: {_compact(vote.reason, 220)}; evidence: {', '.join(vote.evidence_ids)}"
|
| 178 |
+
)
|
| 179 |
+
return "\n".join(lines)
|
| 180 |
+
|
| 181 |
+
|
| 182 |
+
def _extract_json(text: str) -> object:
|
| 183 |
stripped = text.strip()
|
| 184 |
if stripped.startswith("```"):
|
| 185 |
stripped = re.sub(r"^```(?:json)?\s*", "", stripped, flags=re.I)
|
|
|
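The `_compact` helper added in the hunk above is what keeps `_trial_history` from growing without bound: each turn is capped at 420 characters and each vote reason at 220. A standalone sketch of the same truncation logic, copied from the diff under the hypothetical public name `compact` so it can be run in isolation:

```python
def compact(value: str, limit: int = 420) -> str:
    # Collapse every run of whitespace (spaces, tabs, newlines) to one space.
    text = " ".join(value.split())
    # Reserve three characters for the ellipsis so the result never exceeds `limit`.
    return text if len(text) <= limit else text[: limit - 3].rstrip() + "..."


short = compact("The  court\n convenes.")
long = compact("word " * 200)
print(short)
print(len(long), long[-6:])
```

Because the slice is taken before the ellipsis is appended, the output is at most `limit` characters; it can be slightly shorter when the cut lands on a space that `rstrip()` removes.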
|
| 193 |
return json.loads(match.group(1))
|
| 194 |
|
| 195 |
|
| 196 |
+
def _parse_juror_vote(result: ModelResult, packet: CasePacket, juror: str) -> JurorVote:
|
| 197 |
+
try:
|
| 198 |
+
data = _extract_json(result.text)
|
| 199 |
+
except json.JSONDecodeError as exc:
|
| 200 |
+
raise RequiredModelError(f"{juror} returned invalid JSON: {exc.msg}") from exc
|
| 201 |
+
if isinstance(data, dict) and isinstance(data.get("votes"), list):
|
| 202 |
+
if len(data["votes"]) != 1:
|
| 203 |
+
raise RequiredModelError(f"{juror} must return exactly one vote.")
|
| 204 |
+
data = data["votes"][0]
|
| 205 |
+
if not isinstance(data, dict):
|
| 206 |
+
raise RequiredModelError(f"{juror} vote output must be a JSON object.")
|
| 207 |
+
|
| 208 |
+
try:
|
| 209 |
+
vote = JurorVote.model_validate(data)
|
| 210 |
+
except ValidationError as exc:
|
| 211 |
+
raise RequiredModelError(f"{juror} vote schema is invalid: {exc.errors()[0]['msg']}") from exc
|
| 212 |
+
|
| 213 |
+
known_evidence = {item.id for item in packet.evidence}
|
| 214 |
+
expected_persona = JUROR_PERSONAS[juror]
|
| 215 |
+
if vote.juror != juror:
|
| 216 |
+
raise RequiredModelError(f"{juror} vote must use juror '{juror}'.")
|
| 217 |
+
if vote.persona.strip().lower() != expected_persona:
|
| 218 |
+
raise RequiredModelError(f"{juror} persona must be '{expected_persona}'.")
|
| 219 |
+
if not vote.reason.strip():
|
| 220 |
+
raise RequiredModelError(f"{juror} must include a rationale.")
|
| 221 |
+
if not vote.evidence_ids or any(evidence_id not in known_evidence for evidence_id in vote.evidence_ids):
|
| 222 |
+
raise RequiredModelError(f"{juror} must cite known evidence IDs.")
|
| 223 |
+
return vote
|
| 224 |
+
|
| 225 |
+
|
| 226 |
+
def _majority_finding(votes: list[JurorVote]) -> str:
|
| 227 |
counts = Counter(vote.vote for vote in votes)
|
| 228 |
top = counts.most_common()
|
| 229 |
if not top:
|
|
|
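The checks in `_parse_juror_vote` above can be read as a fixed sequence: parse JSON, unwrap a single-item `votes` list, then verify juror name, persona, rationale, and evidence IDs. The sketch below restates that sequence with only the standard library. It is an approximation, not the project's code: the real function validates through the pydantic `JurorVote` model and raises `RequiredModelError`, while `check_vote`, `VoteError`, and the `PERSONAS` table here are hypothetical names, and the vote values are taken from the prompt text in `_juror_task`.

```python
import json

VALID_VOTES = {"liable", "not_liable", "uncertain"}
# Hypothetical persona table; the real JUROR_PERSONAS is not shown in this diff.
PERSONAS = {"Juror A": "stoic duty"}


class VoteError(ValueError):
    """Stand-in for RequiredModelError."""


def check_vote(raw: str, juror: str, known_evidence: set[str]) -> dict:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise VoteError(f"{juror} returned invalid JSON: {exc.msg}") from exc
    # Accept a {"votes": [...]} wrapper only when it holds exactly one vote.
    if isinstance(data, dict) and isinstance(data.get("votes"), list):
        if len(data["votes"]) != 1:
            raise VoteError(f"{juror} must return exactly one vote.")
        data = data["votes"][0]
    if not isinstance(data, dict):
        raise VoteError(f"{juror} vote output must be a JSON object.")
    if data.get("juror") != juror:
        raise VoteError(f"{juror} vote must use juror '{juror}'.")
    if str(data.get("persona", "")).strip().lower() != PERSONAS[juror]:
        raise VoteError(f"{juror} persona must be '{PERSONAS[juror]}'.")
    if data.get("vote") not in VALID_VOTES:
        raise VoteError(f"{juror} vote value is not allowed.")
    if not str(data.get("reason", "")).strip():
        raise VoteError(f"{juror} must include a rationale.")
    ids = data.get("evidence_ids")
    # Every cited ID must exist in the case packet; an empty list is rejected too.
    if not isinstance(ids, list) or not ids or any(i not in known_evidence for i in ids):
        raise VoteError(f"{juror} must cite known evidence IDs.")
    return data
```

Rejecting unknown evidence IDs is the step that ties a model's vote back to the record, which is why it runs last, after the structural checks have passed.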
|
| 270 |
)
|
| 271 |
|
| 272 |
|
| 273 |
+
def _juror_task(juror: str, persona: str) -> str:
|
|
|
|
| 274 |
return (
|
| 275 |
+
f"After watching the trial, vote as {juror}. Your worldview is: {persona}. "
|
| 276 |
+
"Return exactly one JSON object with keys juror, persona, vote, reason, and evidence_ids. "
|
| 277 |
+
"Valid vote values are liable, not_liable, uncertain. The persona value must exactly match your worldview. "
|
| 278 |
+
"The reason must be one concise sentence grounded in your beliefs and the record. Cite evidence IDs from the record."
|
| 279 |
)
|
| 280 |
|
| 281 |
|
|
|
|
| 289 |
model_runner: ModelRunner | None = None,
|
| 290 |
) -> Iterable[TrialEvent]:
|
| 291 |
packet, source_trace = resolve_case(request)
|
| 292 |
+
case_summary = _case_summary(packet)
|
| 293 |
+
evidence_summary = _evidence_summary(packet)
|
| 294 |
+
model_calls: list[ModelCall] = []
|
| 295 |
+
events: list[TrialEvent] = []
|
| 296 |
+
hypo = request.hypothetical.strip()
|
| 297 |
hypo_line = f"\n\nUser hypothetical admitted as a blue-ribbon sidebar: {hypo}" if hypo else ""
|
| 298 |
|
| 299 |
clerk = _required_role(
|
|
|
|
| 304 |
model=OPENBMB_MODEL,
|
| 305 |
case_summary=case_summary,
|
| 306 |
evidence_summary=evidence_summary,
|
| 307 |
+
task="Begin with 'I call'. Announce the case by name, identify the parties, and read the charge.",
|
| 308 |
provider=OPENBMB_PROVIDER,
|
| 309 |
max_tokens=110,
|
| 310 |
)
|
| 311 |
+
yield _record_and_emit(
|
| 312 |
+
events,
|
| 313 |
+
packet,
|
| 314 |
+
source_trace,
|
| 315 |
+
model_calls,
|
| 316 |
TrialEvent(
|
| 317 |
phase="intake",
|
| 318 |
title="The Court Convenes",
|
|
|
|
| 331 |
model=GPT_OSS_MODEL,
|
| 332 |
case_summary=case_summary,
|
| 333 |
evidence_summary=evidence_summary,
|
| 334 |
+
trial_history=_trial_history(events),
|
| 335 |
+
persona=JUDGE_PERSONA,
|
| 336 |
+
objective="Set a fair standard for hearing both sides.",
|
| 337 |
task=(
|
| 338 |
f"As {JUDGE_NAME}, a Stoic courtroom judge guided by {JUDGE_PERSONA}, explain the proceeding "
|
| 339 |
+
"and the burden of proof in one or two disciplined sentences using I or we."
|
| 340 |
),
|
| 341 |
provider=OPENAI_PROVIDER,
|
| 342 |
max_tokens=110,
|
| 343 |
)
|
| 344 |
+
yield _record_and_emit(
|
| 345 |
+
events,
|
| 346 |
+
packet,
|
| 347 |
+
source_trace,
|
| 348 |
+
model_calls,
|
| 349 |
TrialEvent(
|
| 350 |
phase="intake",
|
| 351 |
title="The Burden Is Set",
|
|
|
|
| 359 |
claimant_opening = _required_role(
|
| 360 |
model_runner,
|
| 361 |
model_calls,
|
| 362 |
+
agent="Mike OSS",
|
| 363 |
role="claimant advocate",
|
| 364 |
model=GPT_OSS_MODEL,
|
| 365 |
+
case_summary=case_summary,
|
| 366 |
+
evidence_summary=evidence_summary,
|
| 367 |
+
trial_history=_trial_history(events),
|
| 368 |
+
objective="Win the case for the claimant using the strongest fair reading of the record.",
|
| 369 |
+
task="Make the claimant's opening statement alone, speaking as I for the claimant. Cite the strongest claimant-side exhibit.",
|
| 370 |
provider=OPENAI_PROVIDER,
|
| 371 |
max_tokens=130,
|
| 372 |
)
|
| 373 |
+
yield _record_and_emit(
|
| 374 |
+
events,
|
| 375 |
+
packet,
|
| 376 |
+
source_trace,
|
| 377 |
+
model_calls,
|
| 378 |
TrialEvent(
|
| 379 |
phase="claims",
|
| 380 |
title="Claimant Opening",
|
| 381 |
body=packet.claimant_claim,
|
| 382 |
+
turns=[_turn("Mike OSS", "claimant advocate", claimant_opening, GPT_OSS_MODEL, 0.88)],
|
| 383 |
evidence=packet.evidence,
|
| 384 |
),
|
| 385 |
delay,
|
|
|
|
| 388 |
respondent_opening = _required_role(
|
| 389 |
model_runner,
|
| 390 |
model_calls,
|
| 391 |
+
agent="Harvey Vector",
|
| 392 |
role="respondent advocate",
|
| 393 |
model=GPT_OSS_MODEL,
|
| 394 |
+
case_summary=case_summary,
|
| 395 |
+
evidence_summary=evidence_summary,
|
| 396 |
+
trial_history=_trial_history(events),
|
| 397 |
+
objective="Win the case for the respondent using doubt, context, and the strongest fair reading of the record.",
|
| 398 |
+
task="Make the respondent's opening statement alone, speaking as I for the respondent. Emphasize uncertainty and cite a helpful exhibit.",
|
| 399 |
provider=OPENAI_PROVIDER,
|
| 400 |
max_tokens=130,
|
| 401 |
)
|
| 402 |
+
yield _record_and_emit(
|
| 403 |
+
events,
|
| 404 |
+
packet,
|
| 405 |
+
source_trace,
|
| 406 |
+
model_calls,
|
| 407 |
TrialEvent(
|
| 408 |
phase="opening",
|
| 409 |
title="Respondent Opening",
|
| 410 |
body=packet.respondent_claim,
|
| 411 |
+
turns=[_turn("Harvey Vector", "respondent advocate", respondent_opening, GPT_OSS_MODEL, 0.88)],
|
| 412 |
evidence=packet.evidence,
|
| 413 |
),
|
| 414 |
delay,
|
| 415 |
)
|
| 416 |
|
| 417 |
+
yield _record_and_emit(
|
| 418 |
+
events,
|
| 419 |
+
packet,
|
| 420 |
+
source_trace,
|
| 421 |
+
model_calls,
|
| 422 |
+
TrialEvent(
|
| 423 |
+
phase="evidence",
|
| 424 |
+
title="The Evidence Record",
|
| 425 |
+
body="\n".join(f"{item.id}: {item.title} | reliability {item.reliability:.2f} | {item.note}" for item in packet.evidence),
|
| 426 |
+
turns=[],
|
| 427 |
+
evidence=packet.evidence,
|
| 428 |
+
),
|
| 429 |
+
delay,
|
| 430 |
)
|
| 431 |
|
| 432 |
judge_question = _required_role(
|
|
|
|
| 437 |
model=GPT_OSS_MODEL,
|
| 438 |
case_summary=case_summary,
|
| 439 |
evidence_summary=evidence_summary,
|
| 440 |
+
trial_history=_trial_history(events),
|
| 441 |
+
persona=JUDGE_PERSONA,
|
| 442 |
+
objective="Ask the question most likely to reveal which side has met its burden.",
|
| 443 |
task=(
|
| 444 |
f"As {JUDGE_NAME}, ask one sharp hinge question that would change the outcome if answered. "
|
| 445 |
+
"Use Stoic restraint and public reason, speaking from the bench as I or we."
|
| 446 |
),
|
| 447 |
provider=OPENAI_PROVIDER,
|
| 448 |
max_tokens=100,
|
| 449 |
)
|
| 450 |
+
yield _record_and_emit(
|
| 451 |
+
events,
|
| 452 |
+
packet,
|
| 453 |
+
source_trace,
|
| 454 |
+
model_calls,
|
| 455 |
TrialEvent(
|
| 456 |
phase="questions",
|
| 457 |
title="The Hinge Question",
|
|
|
|
| 465 |
claimant_answer = _required_role(
|
| 466 |
model_runner,
|
| 467 |
model_calls,
|
| 468 |
+
agent="Mike OSS",
|
| 469 |
role="claimant advocate",
|
| 470 |
model=GPT_OSS_MODEL,
|
| 471 |
case_summary=case_summary,
|
| 472 |
evidence_summary=evidence_summary,
|
| 473 |
+
trial_history=_trial_history(events),
|
| 474 |
+
objective="Answer the judge in the way most favorable to the claimant.",
|
| 475 |
+
task=f"Answer {JUDGE_NAME}'s hinge question as I for the claimant: {judge_question.text}",
|
| 476 |
provider=OPENAI_PROVIDER,
|
| 477 |
max_tokens=130,
|
| 478 |
)
|
| 479 |
+
yield _record_and_emit(
|
| 480 |
+
events,
|
| 481 |
+
packet,
|
| 482 |
+
source_trace,
|
| 483 |
+
model_calls,
|
| 484 |
TrialEvent(
|
| 485 |
phase="questions",
|
| 486 |
title="Claimant Answers the Bench",
|
| 487 |
body="The claimant answers the hinge question.",
|
| 488 |
+
turns=[_turn("Mike OSS", "claimant advocate", claimant_answer, GPT_OSS_MODEL, 0.88)],
|
| 489 |
evidence=packet.evidence,
|
| 490 |
),
|
| 491 |
delay,
|
|
|
|
| 494 |
respondent_answer = _required_role(
|
| 495 |
model_runner,
|
| 496 |
model_calls,
|
| 497 |
+
agent="Harvey Vector",
|
| 498 |
role="respondent advocate",
|
| 499 |
model=GPT_OSS_MODEL,
|
| 500 |
case_summary=case_summary,
|
| 501 |
evidence_summary=evidence_summary,
|
| 502 |
+
trial_history=_trial_history(events),
|
| 503 |
+
objective="Answer the judge in the way most favorable to the respondent.",
|
| 504 |
+
task=f"Answer {JUDGE_NAME}'s hinge question as I for the respondent: {judge_question.text}",
|
| 505 |
provider=OPENAI_PROVIDER,
|
| 506 |
max_tokens=130,
|
| 507 |
)
|
| 508 |
+
yield _record_and_emit(
|
| 509 |
+
events,
|
| 510 |
+
packet,
|
| 511 |
+
source_trace,
|
| 512 |
+
model_calls,
|
| 513 |
TrialEvent(
|
| 514 |
phase="questions",
|
| 515 |
title="Respondent Answers the Bench",
|
| 516 |
body="The respondent answers the hinge question.",
|
| 517 |
+
turns=[_turn("Harvey Vector", "respondent advocate", respondent_answer, GPT_OSS_MODEL, 0.88)],
|
| 518 |
evidence=packet.evidence,
|
| 519 |
),
|
| 520 |
delay,
|
|
|
|
| 525 |
model_calls,
|
| 526 |
agent="Nemotron Jury",
|
| 527 |
role="juror panel",
|
| 528 |
+
model=NEMOTRON_MODEL,
|
| 529 |
+
case_summary=case_summary,
|
| 530 |
+
evidence_summary=evidence_summary,
|
| 531 |
+
trial_history=_trial_history(events),
|
| 532 |
+
objective="Move the court from arguments into individual jury votes.",
|
| 533 |
+
task="Announce as we, the six named jurors, that we retire to vote. Do not reveal the votes yet.",
|
| 534 |
provider=NEMOTRON_PROVIDER,
|
| 535 |
max_tokens=100,
|
| 536 |
)
|
| 537 |
+
yield _record_and_emit(
|
| 538 |
+
events,
|
| 539 |
+
packet,
|
| 540 |
+
source_trace,
|
| 541 |
+
model_calls,
|
| 542 |
TrialEvent(
|
| 543 |
phase="deliberation",
|
| 544 |
title="The Jury Retires",
|
|
|
|
| 549 |
delay,
|
| 550 |
)
|
| 551 |
|
| 552 |
+
votes: list[JurorVote] = []
|
| 553 |
+
for juror, persona in JUROR_PERSONAS.items():
|
| 554 |
+
juror_vote_result = _required_role(
|
| 555 |
+
model_runner,
|
| 556 |
+
model_calls,
|
| 557 |
+
agent=juror,
|
| 558 |
+
role="juror",
|
| 559 |
+
model=NEMOTRON_MODEL,
|
| 560 |
+
case_summary=case_summary,
|
| 561 |
+
evidence_summary=evidence_summary,
|
| 562 |
+
trial_history=_trial_history(events),
|
| 563 |
+
persona=persona,
|
| 564 |
+
objective="Reach the verdict this historical worldview would consider right after watching the trial.",
|
| 565 |
+
task=_juror_task(juror, persona),
|
| 566 |
+
provider=NEMOTRON_PROVIDER,
|
| 567 |
+
max_tokens=220,
|
| 568 |
+
)
|
| 569 |
+
vote = _parse_juror_vote(juror_vote_result, packet, juror)
|
| 570 |
+
votes.append(vote)
|
| 571 |
+
juror_result = ModelResult(
|
| 572 |
+
text=f"I vote {vote.vote.replace('_', ' ').title()}. {vote.reason}",
|
| 573 |
+
call=juror_vote_result.call,
|
| 574 |
+
input_text=juror_vote_result.input_text,
|
| 575 |
+
)
|
| 576 |
+
yield _record_and_emit(
|
| 577 |
+
events,
|
| 578 |
+
packet,
|
| 579 |
+
source_trace,
|
| 580 |
+
model_calls,
|
| 581 |
TrialEvent(
|
| 582 |
phase="deliberation",
|
| 583 |
title=f"Juror {vote.juror} Votes",
|
|
|
|
| 598 |
model=GPT_OSS_MODEL,
|
| 599 |
case_summary=case_summary,
|
| 600 |
evidence_summary=evidence_summary,
|
| 601 |
+
trial_history=_trial_history(events),
|
| 602 |
+
persona=JUDGE_PERSONA,
|
| 603 |
+
objective="Announce the jury result fairly, summarize both sides, and do not override the jury.",
|
| 604 |
task=(
|
| 605 |
f"As {JUDGE_NAME}, announce the final legal finding after the jury vote with Stoic restraint. "
|
| 606 |
f"Finding: {verdict.finding}. "
|
| 607 |
+
f"Jury rationale: {verdict.rationale} Remedy: {verdict.remedy}. Speak as I from the bench and include uncertainty without disclaiming the role."
|
| 608 |
),
|
| 609 |
provider=OPENAI_PROVIDER,
|
| 610 |
max_tokens=160,
|
| 611 |
)
|
| 612 |
+
yield _record_and_emit(
|
| 613 |
+
events,
|
| 614 |
+
packet,
|
| 615 |
+
source_trace,
|
| 616 |
+
model_calls,
|
| 617 |
TrialEvent(
|
| 618 |
phase="verdict",
|
| 619 |
title="The Court Announces Judgment",
|
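Review note: every role call in the engine hunks above now receives `trial_history=_trial_history(events)`, but the helper itself falls outside the visible diff. The following is a minimal sketch of what such a helper could look like, assuming each event carries a title and a list of turns. `Event`, `Turn`, and `trial_history` are hypothetical stand-ins for illustration, not the PR's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    agent: str
    content: str


@dataclass
class Event:
    title: str
    turns: list = field(default_factory=list)


def trial_history(events):
    """Flatten the events emitted so far into a plain-text transcript."""
    lines = []
    for event in events:
        if not event.turns:
            # Turn-less events (for example an evidence record) still
            # contribute their title, so later speakers can see them.
            lines.append(f"[{event.title}]")
        for turn in event.turns:
            lines.append(f"[{event.title}] {turn.agent}: {turn.content}")
    return "\n".join(lines)


events = [
    Event("Claimant Opening", [Turn("Mike OSS", "I open for the claimant.")]),
    Event("The Evidence Record"),
]
print(trial_history(events))
```

Keeping titles for turn-less events matters here because the tests in this PR expect "The Evidence Record" to appear in the history even though that event has `turns == []`.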
sovereign_bench/llm.py
CHANGED
|
@@ -69,6 +69,21 @@ def _response_text(response: object) -> str:
|
|
| 69 |
return ""
|
| 70 |
|
| 71 |
|
| 72 |
def clean_model_text(text: str) -> str:
|
| 73 |
cleaned = re.sub(r"(?is)<think>.*?</think>", "", text).strip()
|
| 74 |
if re.search(r"(?i)<think>", cleaned):
|
|
@@ -76,6 +91,26 @@ def clean_model_text(text: str) -> str:
|
|
| 76 |
cleaned = re.sub(r"(?is)<analysis>.*?</analysis>", "", cleaned).strip()
|
| 77 |
cleaned = re.sub(r"(?is)<reasoning>.*?</reasoning>", "", cleaned).strip()
|
| 78 |
cleaned = cleaned.replace("</think>", "").strip()
|
| 79 |
if not cleaned:
|
| 80 |
raise ModelCallError("model returned no visible output")
|
| 81 |
return cleaned
|
|
@@ -108,7 +143,9 @@ def call_hf_chat_model(
|
|
| 108 |
"role": "user",
|
| 109 |
"content": (
|
| 110 |
"Your previous response did not include visible courtroom dialogue. "
|
| 111 |
-
"Return only the final spoken dialogue now
|
| 112 |
),
|
| 113 |
}
|
| 114 |
]
|
|
@@ -166,6 +203,9 @@ def call_small_model(
|
|
| 166 |
case_summary: str,
|
| 167 |
task: str,
|
| 168 |
evidence_summary: str,
|
| 169 |
provider: str = "auto",
|
| 170 |
max_tokens: int = 120,
|
| 171 |
) -> ModelResult:
|
|
@@ -175,6 +215,9 @@ def call_small_model(
|
|
| 175 |
case_summary=case_summary,
|
| 176 |
task=task,
|
| 177 |
evidence_summary=evidence_summary,
|
| 178 |
)
|
| 179 |
result = call_hf_chat_model(
|
| 180 |
model=model,
|
|
@@ -193,17 +236,61 @@ def build_role_messages(
|
|
| 193 |
case_summary: str,
|
| 194 |
task: str,
|
| 195 |
evidence_summary: str,
|
| 196 |
) -> list[dict[str, str]]:
|
| 197 |
system = (
|
| 198 |
"You are one AI character in Sovereign Bench, a miniature virtual courtroom. "
|
| 199 |
-
"
|
|
|
|
| 200 |
"Do not claim certainty beyond the record. Do not add markdown. "
|
| 201 |
-
"
|
| 202 |
"Do not use thinking mode."
|
| 203 |
)
|
| 204 |
user = (
|
| 205 |
f"Agent: {agent}\nRole: {role}\nCase:\n{case_summary}\n\n"
|
| 206 |
-
f"Evidence:\n{evidence_summary}\n
|
| 207 |
-
"
|
|
|
|
| 208 |
)
|
| 209 |
return [{"role": "system", "content": system}, {"role": "user", "content": user}]
|
|
|
|
| 69 |
return ""
|
| 70 |
|
| 71 |
|
| 72 |
+
INSTRUCTION_ECHO_RE = re.compile(
|
| 73 |
+
r"(?is)\b("
|
| 74 |
+
r"as requested|"
|
| 75 |
+
r"first[- ]person|"
|
| 76 |
+
r"pronoun|"
|
| 77 |
+
r"1\s*-\s*3 sentences|"
|
| 78 |
+
r"theatrical but clear|"
|
| 79 |
+
r"i will speak as|"
|
| 80 |
+
r"i will now (?:announce|answer|respond|deliver|speak)|"
|
| 81 |
+
r"as the assigned agent|"
|
| 82 |
+
r"the task"
|
| 83 |
+
r")\b"
|
| 84 |
+
)
|
| 85 |
+
|
| 86 |
+
|
| 87 |
def clean_model_text(text: str) -> str:
|
| 88 |
cleaned = re.sub(r"(?is)<think>.*?</think>", "", text).strip()
|
| 89 |
if re.search(r"(?i)<think>", cleaned):
|
|
|
|
| 91 |
cleaned = re.sub(r"(?is)<analysis>.*?</analysis>", "", cleaned).strip()
|
| 92 |
cleaned = re.sub(r"(?is)<reasoning>.*?</reasoning>", "", cleaned).strip()
|
| 93 |
cleaned = cleaned.replace("</think>", "").strip()
|
| 94 |
+
channel_match = re.search(r"(?ims)^\s*(?:final|assistant_final)\b\s*:?\s*(.+)\Z", cleaned)
|
| 95 |
+
if channel_match:
|
| 96 |
+
cleaned = channel_match.group(1).strip()
|
| 97 |
+
else:
|
| 98 |
+
final_after_analysis = re.search(
|
| 99 |
+
r"(?ims)^\s*(?:analysis|reasoning|assistant_analysis)\s*:?.*?^\s*(?:final|assistant_final)\b\s*:?\s*(.+)\Z",
|
| 100 |
+
cleaned,
|
| 101 |
+
)
|
| 102 |
+
if final_after_analysis:
|
| 103 |
+
cleaned = final_after_analysis.group(1).strip()
|
| 104 |
+
elif re.search(r"(?im)^\s*(?:analysis|reasoning|assistant_analysis)\s*:?", cleaned):
|
| 105 |
+
raise ModelCallError("model returned hidden analysis instead of courtroom dialogue")
|
| 106 |
+
if re.search(r"(?i)\b(?:analysis|reasoning)\s*:", cleaned[:80]):
|
| 107 |
+
raise ModelCallError("model returned hidden analysis instead of courtroom dialogue")
|
| 108 |
+
if INSTRUCTION_ECHO_RE.search(cleaned[:420]):
|
| 109 |
+
pieces = [piece.strip() for piece in re.split(r"\n\s*\n", cleaned) if piece.strip()]
|
| 110 |
+
dialogue_pieces = [piece for piece in pieces if not INSTRUCTION_ECHO_RE.search(piece)]
|
| 111 |
+
if not dialogue_pieces:
|
| 112 |
+
raise ModelCallError("model echoed instructions instead of courtroom dialogue")
|
| 113 |
+
cleaned = "\n\n".join(dialogue_pieces).strip()
|
| 114 |
if not cleaned:
|
| 115 |
raise ModelCallError("model returned no visible output")
|
| 116 |
return cleaned
|
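Review note: a channel-marker pattern anchored at the start of a line can also match ordinary words that merely begin with the marker, such as "Finally". The following is a minimal sketch of final-channel extraction that guards against this with a word boundary. `FINAL_RE` and `extract_final` are hypothetical names used for illustration and are not part of the PR.

```python
import re

# A line that starts with "final" or "assistant_final", optionally followed
# by a colon, introduces the visible answer. The \b keeps words such as
# "Finally" from being treated as a channel marker.
FINAL_RE = re.compile(r"(?ims)^\s*(?:final|assistant_final)\b\s*:?\s*(.+)\Z")


def extract_final(text: str) -> str:
    """Return the text after a final-channel marker, or the input unchanged."""
    match = FINAL_RE.search(text)
    return match.group(1).strip() if match else text.strip()


leaked = "analysis\nI should reason first.\n\nfinal\nI stand for the respondent."
print(extract_final(leaked))
print(extract_final("Finally, I rest my case."))
```

Without the word boundary, the second call would match "Final" as a marker and return only the remainder of the sentence.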
|
|
|
| 143 |
"role": "user",
|
| 144 |
"content": (
|
| 145 |
"Your previous response did not include visible courtroom dialogue. "
|
| 146 |
+
"Return only the final first-person spoken dialogue now, as the assigned agent. "
|
| 147 |
+
"Do not mention prompts, tasks, requirements, pronouns, sentence counts, or that you are following instructions. "
|
| 148 |
+
"Do not include <think>, analysis, reasoning, markdown, narration, or notes. /no_think"
|
| 149 |
),
|
| 150 |
}
|
| 151 |
]
|
|
|
|
| 203 |
case_summary: str,
|
| 204 |
task: str,
|
| 205 |
evidence_summary: str,
|
| 206 |
+
trial_history: str = "",
|
| 207 |
+
persona: str = "",
|
| 208 |
+
objective: str = "",
|
| 209 |
provider: str = "auto",
|
| 210 |
max_tokens: int = 120,
|
| 211 |
) -> ModelResult:
|
|
|
|
| 215 |
case_summary=case_summary,
|
| 216 |
task=task,
|
| 217 |
evidence_summary=evidence_summary,
|
| 218 |
+
trial_history=trial_history,
|
| 219 |
+
persona=persona,
|
| 220 |
+
objective=objective,
|
| 221 |
)
|
| 222 |
result = call_hf_chat_model(
|
| 223 |
model=model,
|
|
|
|
| 236 |
case_summary: str,
|
| 237 |
task: str,
|
| 238 |
evidence_summary: str,
|
| 239 |
+
trial_history: str = "",
|
| 240 |
+
persona: str = "",
|
| 241 |
+
objective: str = "",
|
| 242 |
) -> list[dict[str, str]]:
|
| 243 |
+
vote_role = role == "juror"
|
| 244 |
+
dialogue_role = not vote_role
|
| 245 |
system = (
|
| 246 |
"You are one AI character in Sovereign Bench, a miniature virtual courtroom. "
|
| 247 |
+
"Stay fully in character as the assigned Agent and Role. "
|
| 248 |
+
"Use the case facts and evidence provided below; cite evidence IDs when relevant. "
|
| 249 |
"Do not claim certainty beyond the record. Do not add markdown. "
|
| 250 |
+
"Never reveal hidden reasoning, analysis, or <think> text. "
|
| 251 |
"Do not use thinking mode."
|
| 252 |
)
|
| 253 |
+
if role in {"claimant advocate", "respondent advocate"}:
|
| 254 |
+
system += (
|
| 255 |
+
" You are a lawyer trying to win for your side. Use the evidence, the other side's claims, "
|
| 256 |
+
"and the trial record to make the strongest fair argument available."
|
| 257 |
+
)
|
| 258 |
+
elif role in {"judge", "verdict writer"}:
|
| 259 |
+
system += (
|
| 260 |
+
" You are a fair judge. Consider both sides, the evidence, and the trial record. "
|
| 261 |
+
"At verdict, announce and contextualize the jury result rather than replacing it with your own preferred outcome."
|
| 262 |
+
)
|
| 263 |
+
elif role == "juror":
|
| 264 |
+
system += (
|
| 265 |
+
" You are an individual juror. Decide through your named worldview and the trial transcript, "
|
| 266 |
+
"not a generic juror role. Output only valid JSON for your vote."
|
| 267 |
+
)
|
| 268 |
+
elif role == "juror panel":
|
| 269 |
+
system += " You speak for the jury panel procedurally; do not reveal votes before deliberation."
|
| 270 |
+
elif role == "clerk":
|
| 271 |
+
system += " You are a procedural courtroom role; present the record clearly without deciding the verdict."
|
| 272 |
+
|
| 273 |
+
if dialogue_role:
|
| 274 |
+
system += (
|
| 275 |
+
" Output only the words this character says aloud in court. "
|
| 276 |
+
"Use I, me, my, we, or our naturally when the role calls for it. "
|
| 277 |
+
"Do not narrate about yourself in the third person. Do not summarize what the agent would say."
|
| 278 |
+
)
|
| 279 |
+
answer_instruction = (
|
| 280 |
+
f"Speak as {agent}. Give only the in-scene court line, 1-3 concise sentences."
|
| 281 |
+
)
|
| 282 |
+
else:
|
| 283 |
+
answer_instruction = (
|
| 284 |
+
"Return only the requested JSON object. "
|
| 285 |
+
"Do not add dialogue, markdown, or commentary."
|
| 286 |
+
)
|
| 287 |
+
persona_block = f"\nPersona / worldview:\n{persona}\n" if persona else ""
|
| 288 |
+
objective_block = f"\nObjective:\n{objective}\n" if objective else ""
|
| 289 |
+
history_block = f"\nTrial history so far:\n{trial_history}\n" if trial_history else ""
|
| 290 |
user = (
|
| 291 |
f"Agent: {agent}\nRole: {role}\nCase:\n{case_summary}\n\n"
|
| 292 |
+
f"Evidence:\n{evidence_summary}\n"
|
| 293 |
+
f"{persona_block}{objective_block}{history_block}\nTask: {task}\n"
|
| 294 |
+
f"{answer_instruction}\n/no_think"
|
| 295 |
)
|
| 296 |
return [{"role": "system", "content": system}, {"role": "user", "content": user}]
|
sovereign_bench/models.py
CHANGED
|
@@ -35,6 +35,7 @@ class CasePacket(BaseModel):
|
|
| 35 |
respondent: str
|
| 36 |
charge: str
|
| 37 |
setting: str
|
|
|
|
| 38 |
claimant_claim: str
|
| 39 |
respondent_claim: str
|
| 40 |
source_note: str
|
|
@@ -45,6 +46,7 @@ class TrialRequest(BaseModel):
|
|
| 45 |
case_id: str = "socrates"
|
| 46 |
search_query: str = ""
|
| 47 |
hypothetical: str = ""
|
|
|
|
| 48 |
speed: Literal["swift", "measured", "ceremonial"] = "swift"
|
| 49 |
mind_layer: bool = True
|
| 50 |
|
|
|
|
| 35 |
respondent: str
|
| 36 |
charge: str
|
| 37 |
setting: str
|
| 38 |
+
context: str = ""
|
| 39 |
claimant_claim: str
|
| 40 |
respondent_claim: str
|
| 41 |
source_note: str
|
|
|
|
| 46 |
case_id: str = "socrates"
|
| 47 |
search_query: str = ""
|
| 48 |
hypothetical: str = ""
|
| 49 |
+
custom_case: CasePacket | None = None
|
| 50 |
speed: Literal["swift", "measured", "ceremonial"] = "swift"
|
| 51 |
mind_layer: bool = True
|
| 52 |
|
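Review note: `TrialRequest` gains an optional `custom_case`, and `CasePacket` gains a `context` string. How the engine chooses between a custom packet and a cached case is in a hunk not shown here. The following is a sketch under the assumption that a supplied `custom_case` takes precedence over `case_id`. It uses dataclasses rather than the project's pydantic models so that it stays self-contained, and `resolve_case` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CasePacket:
    id: str
    title: str
    context: str = ""  # new optional background text for the case


@dataclass
class TrialRequest:
    case_id: str = "socrates"
    custom_case: Optional[CasePacket] = None  # new optional user-supplied packet


def resolve_case(request: TrialRequest, cases: dict) -> CasePacket:
    """Assumed rule: prefer the custom packet, else look up the cached case."""
    if request.custom_case is not None:
        return request.custom_case
    try:
        return cases[request.case_id]
    except KeyError:
        raise ValueError(f"unknown case_id: {request.case_id!r}") from None


cached = {"socrates": CasePacket(id="socrates", title="The Polis v. Socrates")}
custom = CasePacket(id="custom", title="Custom Trial", context="A bicycle disappeared.")
print(resolve_case(TrialRequest(), cached).title)
print(resolve_case(TrialRequest(case_id="custom", custom_case=custom), cached).title)
```

Defaulting both new fields keeps existing callers and cached cases valid without changes, which matches the defaults shown in the diff.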
tests/test_cases.py
CHANGED
|
@@ -2,7 +2,15 @@ from sovereign_bench.cases import CASES
|
|
| 2 |
|
| 3 |
|
| 4 |
def test_cached_cases_have_evidence():
|
| 5 |
-
assert {"socrates", "barnaby"} <= set(CASES)
|
| 6 |
for case in CASES.values():
|
| 7 |
assert len(case.evidence) >= 4
|
| 8 |
assert all(item.id and item.excerpt for item in case.evidence)
|
|
|
|
| 2 |
|
| 3 |
|
| 4 |
def test_cached_cases_have_evidence():
|
| 5 |
+
assert {"socrates", "greg", "barnaby"} <= set(CASES)
|
| 6 |
for case in CASES.values():
|
| 7 |
assert len(case.evidence) >= 4
|
| 8 |
assert all(item.id and item.excerpt for item in case.evidence)
|
| 9 |
+
|
| 10 |
+
|
| 11 |
+
def test_demo_cases_have_book_context_and_three_items_per_side():
|
| 12 |
+
for case_id in ["socrates", "greg"]:
|
| 13 |
+
case = CASES[case_id]
|
| 14 |
+
assert case.context
|
| 15 |
+
assert len([item for item in case.evidence if item.supports == "claimant"]) >= 3
|
| 16 |
+
assert len([item for item in case.evidence if item.supports == "respondent"]) >= 3
|
tests/test_engine.py
CHANGED
|
@@ -3,39 +3,36 @@ import re
|
|
| 3 |
|
| 4 |
import pytest
|
| 5 |
|
| 6 |
-
from sovereign_bench.engine import JUDGE_NAME, JUROR_PERSONAS, RequiredModelError, run_trial
|
| 7 |
-
from sovereign_bench.llm import ModelCall, ModelResult
|
| 8 |
-
from sovereign_bench.models import TrialRequest
|
| 9 |
|
| 10 |
|
| 11 |
-
def
|
| 12 |
-
evidence_ids = re.findall(r"^([A-Z]+-
|
| 13 |
-
|
| 14 |
return json.dumps(
|
| 15 |
{
|
| 16 |
-
"
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
"reason": f"{name} applies a {persona} lens to exhibit {evidence_ids[idx]}.",
|
| 22 |
-
"evidence_ids": [evidence_ids[idx]],
|
| 23 |
-
}
|
| 24 |
-
for idx, (name, persona) in enumerate(JUROR_PERSONAS.items())
|
| 25 |
-
]
|
| 26 |
}
|
| 27 |
)
|
| 28 |
|
| 29 |
|
| 30 |
def fake_model_runner(**kwargs):
|
| 31 |
text = (
|
| 32 |
-
|
| 33 |
-
if kwargs["role"] == "juror
|
| 34 |
else f"{kwargs['agent']} responds to: {kwargs['task']}"
|
| 35 |
)
|
| 36 |
prompt = (
|
| 37 |
f"SYSTEM:\nFake live model for tests.\n\nUSER:\n"
|
| 38 |
-
f"Agent: {kwargs['agent']}\nRole: {kwargs['role']}\
|
|
|
|
|
|
|
| 39 |
)
|
| 40 |
return ModelResult(
|
| 41 |
text=text,
|
|
@@ -54,12 +51,11 @@ def test_cached_cases_emit_sequential_speaker_order():
|
|
| 54 |
expected_speakers = [
|
| 55 |
"Clerk Meridian",
|
| 56 |
JUDGE_NAME,
|
| 57 |
-
"
|
| 58 |
-
"
|
| 59 |
-
"Auditor Prism",
|
| 60 |
JUDGE_NAME,
|
| 61 |
-
"
|
| 62 |
-
"
|
| 63 |
"Nemotron Jury",
|
| 64 |
*list(JUROR_PERSONAS),
|
| 65 |
JUDGE_NAME,
|
|
@@ -67,7 +63,10 @@ def test_cached_cases_emit_sequential_speaker_order():
|
|
| 67 |
for case_id in ["socrates", "barnaby"]:
|
| 68 |
events = run_trial(TrialRequest(case_id=case_id), model_runner=fake_model_runner)
|
| 69 |
|
| 70 |
-
assert [event.turns[0].agent for event in events] == expected_speakers
|
| 71 |
assert [event.phase for event in events].count("deliberation") == 7
|
| 72 |
assert events[0].turns[0].input
|
| 73 |
assert "SYSTEM:" in events[0].turns[0].input
|
|
@@ -81,12 +80,12 @@ def test_no_event_contains_both_lawyers_speaking_together():
|
|
| 81 |
|
| 82 |
for event in events:
|
| 83 |
agents = {turn.agent for turn in event.turns}
|
| 84 |
-
assert not {"
|
| 85 |
|
| 86 |
|
| 87 |
def test_juror_vote_events_have_fixed_personas_and_evidence():
|
| 88 |
events = run_trial(TrialRequest(case_id="socrates"), model_runner=fake_model_runner)
|
| 89 |
-
juror_events = [event for event in events if event.turns[0].agent in JUROR_PERSONAS]
|
| 90 |
|
| 91 |
assert len(juror_events) == 6
|
| 92 |
for event in juror_events:
|
|
@@ -94,6 +93,7 @@ def test_juror_vote_events_have_fixed_personas_and_evidence():
|
|
| 94 |
assert vote.juror == event.turns[0].agent
|
| 95 |
assert vote.persona == JUROR_PERSONAS[vote.juror]
|
| 96 |
assert vote.vote in {"liable", "not_liable", "uncertain"}
|
|
|
|
| 97 |
assert vote.reason
|
| 98 |
assert vote.evidence_ids
|
| 99 |
|
|
@@ -102,6 +102,95 @@ def test_juror_vote_events_have_fixed_personas_and_evidence():
|
|
| 102 |
assert [vote.juror for vote in final.votes] == list(JUROR_PERSONAS)
|
| 103 |
|
| 104 |
|
|
| 105 |
def test_jury_contract_uses_public_history_personas():
|
| 106 |
assert JUDGE_NAME == "Marcus Aurelius"
|
| 107 |
assert JUROR_PERSONAS == {
|
|
@@ -114,6 +203,94 @@ def test_jury_contract_uses_public_history_personas():
|
|
| 114 |
}
|
| 115 |
|
| 116 |
|
|
| 117 |
def test_required_model_failure_stops_trial_without_canned_dialogue():
|
| 118 |
def failing_runner(**kwargs):
|
| 119 |
return ModelResult(
|
|
@@ -136,7 +313,7 @@ def test_required_model_failure_stops_trial_without_canned_dialogue():
|
|
| 136 |
def test_invalid_jury_output_stops_trial_without_fallback_votes():
|
| 137 |
def invalid_jury_runner(**kwargs):
|
| 138 |
result = fake_model_runner(**kwargs)
|
| 139 |
-
if kwargs["role"] == "juror
|
| 140 |
result.text = "the jury refuses structured output"
|
| 141 |
return result
|
| 142 |
|
|
|
|
| 3 |
|
| 4 |
import pytest
|
| 5 |
|
| 6 |
+
from sovereign_bench.engine import JUDGE_NAME, JUROR_PERSONAS, RequiredModelError, run_trial, stream_trial
|
| 7 |
+
from sovereign_bench.llm import ModelCall, ModelResult, build_role_messages, clean_model_text
|
| 8 |
+
from sovereign_bench.models import CasePacket, EvidenceItem, TrialRequest
|
| 9 |
|
| 10 |
|
| 11 |
+
def _juror_json(kwargs, vote: str = "liable") -> str:
|
| 12 |
+
evidence_ids = re.findall(r"^([A-Z]+-[A-Z]\d+):", kwargs["evidence_summary"], flags=re.M)
|
| 13 |
+
evidence_id = (evidence_ids or ["SOC-E1"])[0]
|
| 14 |
return json.dumps(
|
| 15 |
{
|
| 16 |
+
"juror": kwargs["agent"],
|
| 17 |
+
"persona": kwargs["persona"],
|
| 18 |
+
"vote": vote,
|
| 19 |
+
"reason": f"{kwargs['agent']} applies {kwargs['persona']} to exhibit {evidence_id}.",
|
| 20 |
+
"evidence_ids": [evidence_id],
|
| 21 |
}
|
| 22 |
)
|
| 23 |
|
| 24 |
|
| 25 |
def fake_model_runner(**kwargs):
|
| 26 |
text = (
|
| 27 |
+
_juror_json(kwargs, vote="liable" if list(JUROR_PERSONAS).index(kwargs["agent"]) < 4 else "not_liable")
|
| 28 |
+
if kwargs["role"] == "juror"
|
| 29 |
else f"{kwargs['agent']} responds to: {kwargs['task']}"
|
| 30 |
)
|
| 31 |
prompt = (
|
| 32 |
f"SYSTEM:\nFake live model for tests.\n\nUSER:\n"
|
| 33 |
+
f"Agent: {kwargs['agent']}\nRole: {kwargs['role']}\n"
|
| 34 |
+
f"Persona: {kwargs.get('persona', '')}\nObjective: {kwargs.get('objective', '')}\n"
|
| 35 |
+
f"History: {kwargs.get('trial_history', '')}\nTask: {kwargs['task']}\n\nASSISTANT:\n"
|
| 36 |
)
|
| 37 |
return ModelResult(
|
| 38 |
text=text,
|
|
|
|
| 51 |
expected_speakers = [
|
| 52 |
"Clerk Meridian",
|
| 53 |
JUDGE_NAME,
|
| 54 |
+
"Mike OSS",
|
| 55 |
+
"Harvey Vector",
|
|
|
|
| 56 |
JUDGE_NAME,
|
| 57 |
+
"Mike OSS",
|
| 58 |
+
"Harvey Vector",
|
| 59 |
"Nemotron Jury",
|
| 60 |
*list(JUROR_PERSONAS),
|
| 61 |
JUDGE_NAME,
|
|
|
|
| 63 |
for case_id in ["socrates", "barnaby"]:
|
| 64 |
events = run_trial(TrialRequest(case_id=case_id), model_runner=fake_model_runner)
|
| 65 |
|
| 66 |
+
assert [event.turns[0].agent for event in events if event.turns] == expected_speakers
|
| 67 |
+
evidence_event = next(event for event in events if event.phase == "evidence")
|
| 68 |
+
assert evidence_event.title == "The Evidence Record"
|
| 69 |
+
assert evidence_event.turns == []
|
| 70 |
assert [event.phase for event in events].count("deliberation") == 7
|
| 71 |
assert events[0].turns[0].input
|
| 72 |
assert "SYSTEM:" in events[0].turns[0].input
|
|
|
|
| 80 |
|
| 81 |
for event in events:
|
| 82 |
agents = {turn.agent for turn in event.turns}
|
| 83 |
+
assert not {"Mike OSS", "Harvey Vector"}.issubset(agents)
|
| 84 |
|
| 85 |
|
| 86 |
def test_juror_vote_events_have_fixed_personas_and_evidence():
|
| 87 |
events = run_trial(TrialRequest(case_id="socrates"), model_runner=fake_model_runner)
|
| 88 |
+
juror_events = [event for event in events if event.turns and event.turns[0].agent in JUROR_PERSONAS]
|
| 89 |
|
| 90 |
assert len(juror_events) == 6
|
| 91 |
for event in juror_events:
|
|
|
|
| 93 |
assert vote.juror == event.turns[0].agent
|
| 94 |
assert vote.persona == JUROR_PERSONAS[vote.juror]
|
| 95 |
assert vote.vote in {"liable", "not_liable", "uncertain"}
|
| 96 |
+
assert event.turns[0].content.startswith("I vote ")
|
| 97 |
assert vote.reason
|
| 98 |
assert vote.evidence_ids
|
| 99 |
|
|
|
|
| 102 |
assert [vote.juror for vote in final.votes] == list(JUROR_PERSONAS)
|
| 103 |
|
| 104 |
|
| 105 |
+
def test_jurors_are_called_independently_with_personas_and_trial_history():
|
| 106 |
+
calls = []
|
| 107 |
+
|
| 108 |
+
def recording_runner(**kwargs):
|
| 109 |
+
calls.append(kwargs.copy())
|
| 110 |
+
return fake_model_runner(**kwargs)
|
| 111 |
+
|
| 112 |
+
run_trial(TrialRequest(case_id="socrates"), model_runner=recording_runner)
|
| 113 |
+
|
| 114 |
+
juror_calls = [call for call in calls if call["role"] == "juror"]
|
| 115 |
+
assert [call["agent"] for call in juror_calls] == list(JUROR_PERSONAS)
|
| 116 |
+
assert len(juror_calls) == 6
|
| 117 |
+
for call in juror_calls:
|
| 118 |
+
assert call["persona"] == JUROR_PERSONAS[call["agent"]]
|
| 119 |
+
assert "Claimant Opening" in call["trial_history"]
|
| 120 |
+
assert "Respondent Opening" in call["trial_history"]
|
| 121 |
+
assert "The Evidence Record" in call["trial_history"]
|
| 122 |
+
assert "historical worldview" in call["objective"]
|
| 123 |
+
|
| 124 |
+
|
| 125 |
+
def test_lawyers_and_judge_receive_trial_history_and_objectives():
|
| 126 |
+
calls = []
|
| 127 |
+
|
| 128 |
+
def recording_runner(**kwargs):
|
| 129 |
+
calls.append(kwargs.copy())
|
| 130 |
+
return fake_model_runner(**kwargs)
|
| 131 |
+
|
| 132 |
+
run_trial(TrialRequest(case_id="socrates"), model_runner=recording_runner)
|
| 133 |
+
|
| 134 |
+
claimant_answer = next(call for call in calls if call["agent"] == "Mike OSS" and "hinge question" in call["task"])
|
| 135 |
+
respondent_answer = next(call for call in calls if call["agent"] == "Harvey Vector" and "hinge question" in call["task"])
|
| 136 |
+
verdict_call = next(call for call in calls if call["role"] == "verdict writer")
|
| 137 |
+
|
| 138 |
+
assert "The Hinge Question" in claimant_answer["trial_history"]
|
| 139 |
+
assert "The Hinge Question" in respondent_answer["trial_history"]
|
| 140 |
+
assert "most favorable to the claimant" in claimant_answer["objective"]
|
| 141 |
+
assert "most favorable to the respondent" in respondent_answer["objective"]
|
| 142 |
+
assert all(name in verdict_call["trial_history"] for name in JUROR_PERSONAS)
|
| 143 |
+
assert "do not override the jury" in verdict_call["objective"]
|
| 144 |
+
|
| 145 |
+
|
| 146 |
+
def test_custom_case_context_and_evidence_reach_lawyer_prompts():
|
| 147 |
+
custom = CasePacket(
|
| 148 |
+
id="custom",
|
| 149 |
+
title="Custom Trial",
|
| 150 |
+
subtitle="Entered by user.",
|
| 151 |
+
claimant="Claimant",
|
| 152 |
+
respondent="Respondent",
|
| 153 |
+
charge="Whether the custom record favors the claimant.",
|
| 154 |
+
setting="A custom courtroom.",
|
| 155 |
+
context="A bicycle disappeared after a disputed garage visit.",
|
| 156 |
+
claimant_claim="The claimant says the visit explains the missing bicycle.",
|
| 157 |
+
respondent_claim="The respondent says the timing and evidence are ambiguous.",
|
| 158 |
+
source_note="Custom test packet.",
|
| 159 |
+
evidence=[
|
| 160 |
+
EvidenceItem(
|
| 161 |
+
id="CUS-F1",
|
| 162 |
+
title="Garage Text",
|
| 163 |
+
source="Custom",
|
| 164 |
+
excerpt="The respondent asked to enter the garage.",
|
| 165 |
+
supports="claimant",
|
| 166 |
+
reliability=0.65,
|
| 167 |
+
note="Supports access.",
|
| 168 |
+
),
|
| 169 |
+
EvidenceItem(
|
| 170 |
+
id="CUS-A1",
|
| 171 |
+
title="Neighbor Sighting",
|
| 172 |
+
source="Custom",
|
| 173 |
+
excerpt="A neighbor saw the bicycle later that day.",
|
| 174 |
+
supports="respondent",
|
| 175 |
+
reliability=0.65,
|
| 176 |
+
note="Supports alternative timing.",
|
| 177 |
+
),
|
| 178 |
+
],
|
| 179 |
+
)
|
| 180 |
+
calls = []
|
| 181 |
+
|
| 182 |
+
def recording_runner(**kwargs):
|
| 183 |
+
calls.append(kwargs.copy())
|
| 184 |
+
return fake_model_runner(**kwargs)
|
| 185 |
+
|
| 186 |
+
run_trial(TrialRequest(case_id="custom", custom_case=custom), model_runner=recording_runner)
|
| 187 |
+
|
| 188 |
+
claimant_opening = next(call for call in calls if call["agent"] == "Mike OSS" and call["role"] == "claimant advocate")
|
| 189 |
+
assert "A bicycle disappeared" in claimant_opening["case_summary"]
|
| 190 |
+
assert "CUS-F1" in claimant_opening["evidence_summary"]
|
| 191 |
+
assert "CUS-A1" in claimant_opening["evidence_summary"]
|
| 192 |
+
|
| 193 |
+
|
| 194 |
def test_jury_contract_uses_public_history_personas():
|
| 195 |
assert JUDGE_NAME == "Marcus Aurelius"
|
| 196 |
assert JUROR_PERSONAS == {
|
|
|
|
| 203 |
}
|
| 204 |
|
| 205 |
|
| 206 |
+
def test_role_prompt_requires_first_person_in_character_speech():
|
| 207 |
+
messages = build_role_messages(
|
| 208 |
+
agent="Harvey Vector",
|
| 209 |
+
role="respondent advocate",
|
| 210 |
+
case_summary="A short case summary.",
|
| 211 |
+
evidence_summary="SOC-E1: A record excerpt.",
|
| 212 |
+
task="Answer the bench for the respondent.",
|
| 213 |
+
)
|
| 214 |
+
|
| 215 |
+
system = messages[0]["content"]
|
| 216 |
+
user = messages[1]["content"]
|
| 217 |
+
|
| 218 |
+
assert "Stay fully in character as the assigned Agent and Role." in system
|
| 219 |
+
assert "Output only the words this character says aloud in court." in system
|
| 220 |
+
assert "Do not narrate about yourself in the third person." in system
|
| 221 |
+
assert "Use the case facts and evidence provided below" in system
|
| 222 |
+
assert "Speak as Harvey Vector." in user
|
| 223 |
+
assert "Give only the in-scene court line" in user
|
| 224 |
+
assert "SOC-E1" in user
|
| 225 |
+
|
| 226 |
+
|
| 227 |
+
def test_juror_vote_prompt_uses_persona_history_and_json_contract():
|
| 228 |
+
messages = build_role_messages(
|
| 229 |
+
agent="Karl Marx",
|
| 230 |
+
role="juror",
|
| 231 |
+
case_summary="A short case summary.",
|
| 232 |
+
evidence_summary="SOC-E1: A record excerpt.",
|
| 233 |
+
trial_history="Mike OSS argued from SOC-E1.",
|
| 234 |
+
persona=JUROR_PERSONAS["Karl Marx"],
|
| 235 |
+
objective="Vote as Karl Marx would after watching the trial.",
|
| 236 |
+
task="Return one juror vote as JSON.",
|
| 237 |
+
)
|
| 238 |
+
|
| 239 |
+
system = messages[0]["content"]
|
| 240 |
+
user = messages[1]["content"]
|
| 241 |
+
|
| 242 |
+
assert "Output only the words this character says aloud in court." not in messages[0]["content"]
|
| 243 |
+
assert "You are an individual juror." in system
|
| 244 |
+
assert JUROR_PERSONAS["Karl Marx"] in user
|
| 245 |
+
assert "Mike OSS argued from SOC-E1." in user
|
| 246 |
+
assert "Return only the requested JSON object." in user
|
| 247 |
+
|
| 248 |
+
|
| 249 |
+
def test_model_cleaner_extracts_final_speech_after_analysis_channel():
|
| 250 |
+
text = clean_model_text(
|
| 251 |
+
"analysis\nI should reason about the case first.\n\nfinal\nI stand for the respondent, and SOC-E1 leaves doubt."
|
| 252 |
+
)
|
| 253 |
+
|
| 254 |
+
assert text == "I stand for the respondent, and SOC-E1 leaves doubt."
|
| 255 |
+
assert "analysis" not in text.lower()
|
| 256 |
+
|
| 257 |
+
|
| 258 |
+
def test_model_cleaner_rejects_visible_analysis_without_final_speech():
|
| 259 |
+
def analysis_runner(**kwargs):
|
| 260 |
+
return ModelResult(
|
| 261 |
+
text="analysis: I should think through the case before answering.",
|
| 262 |
+
input_text="SYSTEM:\nanalysis leak",
|
| 263 |
+
call=ModelCall(
|
| 264 |
+
model=kwargs["model"],
|
| 265 |
+
provider=kwargs.get("provider", "test"),
|
| 266 |
+
ok=True,
|
| 267 |
+
latency_ms=1,
|
| 268 |
+
prompt_hash="test-prompt",
|
| 269 |
+
),
|
| 270 |
+
)
|
| 271 |
+
|
| 272 |
+
with pytest.raises(RequiredModelError):
|
| 273 |
+
next(stream_trial(TrialRequest(case_id="socrates"), model_runner=analysis_runner))
|
| 274 |
+
|
| 275 |
+
|
| 276 |
+
def test_model_cleaner_removes_instruction_echo_when_dialogue_remains():
|
| 277 |
+
text = clean_model_text(
|
| 278 |
+
"I will now announce the case as requested, while maintaining the theatrical but clear tone required. "
|
| 279 |
+
"I will speak as Clerk Meridian in first person, starting with a pronoun.\n\n"
|
| 280 |
+
"I call The Polis v. Socrates before this court."
|
| 281 |
+
)
|
| 282 |
+
|
| 283 |
+
assert text == "I call The Polis v. Socrates before this court."
|
| 284 |
+
|
| 285 |
+
|
| 286 |
+
def test_model_cleaner_rejects_instruction_echo_without_dialogue():
|
| 287 |
+
with pytest.raises(Exception, match="echoed instructions"):
|
| 288 |
+
clean_model_text(
|
| 289 |
+
"I will now announce the case as requested, while maintaining the theatrical but clear tone required. "
|
| 290 |
+
"I will speak as Clerk Meridian in first person, starting with a pronoun."
|
| 291 |
+
)
|
| 292 |
+
|
| 293 |
+
|
| 294 |
def test_required_model_failure_stops_trial_without_canned_dialogue():
|
| 295 |
def failing_runner(**kwargs):
|
| 296 |
return ModelResult(
|
|
|
|
| 313 |
def test_invalid_jury_output_stops_trial_without_fallback_votes():
|
| 314 |
def invalid_jury_runner(**kwargs):
|
| 315 |
result = fake_model_runner(**kwargs)
|
| 316 |
+
if kwargs["role"] == "juror":
|
| 317 |
result.text = "the jury refuses structured output"
|
| 318 |
return result
|
| 319 |
|
tests/test_ui_rendering.py
CHANGED
|
@@ -1,10 +1,11 @@
|
|
| 1 |
import inspect
|
|
|
|
| 2 |
from pathlib import Path
|
| 3 |
|
| 4 |
from PIL import Image
|
| 5 |
|
| 6 |
import app
|
| 7 |
-
from sovereign_bench.models import AgentTurn, EvidenceItem, JurorVote, TrialEvent
|
| 8 |
|
| 9 |
|
| 10 |
OLD_CARD_CLASSES = [
|
|
@@ -71,6 +72,32 @@ def _speaker_event(agent: str, phase: str = "questions") -> TrialEvent:
|
|
| 71 |
)
|
| 72 |
|
| 73 |
|
| 74 |
def test_lower_tab_renderers_emit_plain_text_classes():
|
| 75 |
event = _event_with_lower_tab_data()
|
| 76 |
html = "\n".join(
|
|
@@ -101,6 +128,12 @@ def test_download_controls_are_not_wired_into_app():
|
|
| 101 |
assert "Download agent trace" not in source
|
| 102 |
|
| 103 |
|
| 104 |
def test_courtroom_splits_six_jurors_between_side_benches():
|
| 105 |
html = app.render_court([_event_with_lower_tab_data()], started=True)
|
| 106 |
|
|
@@ -131,10 +164,15 @@ def test_courtroom_renders_historical_judge_and_juror_assets():
|
|
| 131 |
|
| 132 |
assert "Marcus Aurelius" in html
|
| 133 |
assert "assets/characters/marcus-aurelius.png" in html
|
| 134 |
for name, image in app.JUROR_IMAGES.items():
|
| 135 |
assert name in html
|
| 136 |
assert image in html
|
| 137 |
assert html.count("class='juror-portrait'") == 6
|
| 138 |
|
| 139 |
|
| 140 |
def test_courtroom_renders_foreground_fences_and_judge_table_above_characters():
|
|
@@ -146,6 +184,82 @@ def test_courtroom_renders_foreground_fences_and_judge_table_above_characters():
|
|
| 146 |
assert ".foreground-props {\n position: absolute;\n inset: 0;\n z-index: 13;" in app.CSS
|
| 147 |
assert ".puppet {\n --skin: #c99257;" in app.CSS
|
| 148 |
assert "z-index: 8;" in app.CSS
|
| 149 |
|
| 150 |
|
| 151 |
def test_foreground_prop_assets_have_real_transparency():
|
|
@@ -161,13 +275,67 @@ def test_foreground_prop_assets_have_real_transparency():
|
|
| 161 |
|
| 162 |
|
| 163 |
def test_latest_speaker_sets_stage_class_and_speech_bubble():
|
| 164 |
-
html = app.render_court([_speaker_event("
|
| 165 |
|
| 166 |
assert "speaker-auric" in html
|
| 167 |
-
assert "class='speech-bubble'" in html
|
| 168 |
-
assert "
|
| 169 |
assert "puppet auric active walking" in html
|
| 170 |
assert "puppet sable active" not in html
|
| 171 |
|
| 172 |
|
| 173 |
def test_individual_juror_can_be_active_speaker():
|
|
@@ -199,7 +367,19 @@ def test_individual_juror_can_be_active_speaker():
|
|
| 199 |
|
| 200 |
assert "speaker-karl-marx" in html
|
| 201 |
assert "<a class='juror active'" in html
|
|
|
|
| 202 |
assert "Liable. E1 carries the record." in html
|
| 203 |
|
| 204 |
|
| 205 |
def test_lawyer_movement_css_is_speaker_specific_not_phase_wide():
|
|
@@ -209,27 +389,106 @@ def test_lawyer_movement_css_is_speaker_specific_not_phase_wide():
|
|
| 209 |
assert ".phase-opening .puppet.sable" not in app.CSS
|
| 210 |
|
| 211 |
|
| 212 |
-
def
|
| 213 |
-
assert ".episode-book
|
| 214 |
-
assert "
|
| 215 |
assert ".puppet.auric {\n left: 24%;\n top: 87%;" in app.CSS
|
| 216 |
-
assert ".speaker-auric .puppet.auric {\n left: 43%;\n top:
|
| 217 |
-
assert ".puppet.
|
| 218 |
-
assert ".
|
| 219 |
-
assert ".puppet.
|
| 220 |
assert ".puppet.auric {\n left: 20%;\n top: 970px;" in app.CSS
|
| 221 |
-
assert ".puppet.
|
| 222 |
|
| 223 |
|
| 224 |
def test_run_ui_yields_five_outputs_without_download_status(monkeypatch):
|
| 225 |
event = _event_with_lower_tab_data()
|
| 226 |
monkeypatch.setattr(app, "get_events", lambda request: iter([event]))
|
|
|
|
| 227 |
|
| 228 |
-
outputs = list(app.run_ui("Trial of Socrates", "", "", "swift", True))
|
| 229 |
|
| 230 |
assert outputs
|
| 231 |
assert all(len(output) == 5 for output in outputs)
|
| 232 |
-
assert outputs[
|
|
|
|
| 233 |
assert outputs[-1][-1] == "Verdict sealed."
|
| 234 |
assert "download" not in outputs[-1][-1].lower()
|
| 235 |
|
|
@@ -241,12 +500,48 @@ def test_run_ui_stops_with_model_unavailable_error(monkeypatch):
|
|
| 241 |
|
| 242 |
monkeypatch.setattr(app, "get_events", broken_events)
|
| 243 |
|
| 244 |
-
outputs = list(app.run_ui("Trial of Socrates", "", "", "swift", True))
|
| 245 |
|
| 246 |
assert outputs[-1][-1] == "Model response required. Trial stopped: Marcus Aurelius unavailable: offline"
|
| 247 |
assert "Claimant score" not in outputs[-1][0]
|
| 248 |
|
| 249 |
|
| 250 |
def test_court_renders_sound_toggle():
|
| 251 |
html = app.render_court([])
|
| 252 |
|
|
|
|
| 1 |
import inspect
|
| 2 |
+
import json
|
| 3 |
from pathlib import Path
|
| 4 |
|
| 5 |
from PIL import Image
|
| 6 |
|
| 7 |
import app
|
| 8 |
+
from sovereign_bench.models import AgentTurn, EvidenceItem, JurorVote, TrialEvent, Verdict
|
| 9 |
|
| 10 |
|
| 11 |
OLD_CARD_CLASSES = [
|
|
|
|
| 72 |
)
|
| 73 |
|
| 74 |
|
| 75 |
+
def _verdict_event(finding: str = "liable") -> TrialEvent:
|
| 76 |
+
return TrialEvent(
|
| 77 |
+
phase="verdict",
|
| 78 |
+
title="The Court Announces Judgment",
|
| 79 |
+
body="Judgment is announced.",
|
| 80 |
+
verdict=Verdict(
|
| 81 |
+
finding=finding,
|
| 82 |
+
decree="The court enters the final judgment.",
|
| 83 |
+
rationale="The jury majority decides the record.",
|
| 84 |
+
evidence_ids=["E1"],
|
| 85 |
+
uncertainty="Some uncertainty remains.",
|
| 86 |
+
remedy="Record the judgment.",
|
| 87 |
+
),
|
| 88 |
+
turns=[
|
| 89 |
+
AgentTurn(
|
| 90 |
+
agent=app.JUDGE_NAME,
|
| 91 |
+
role="verdict writer",
|
| 92 |
+
content="The judgment of the court is guilty.",
|
| 93 |
+
model="test-model",
|
| 94 |
+
confidence=0.9,
|
| 95 |
+
input="SYSTEM:\nAnnounce verdict.",
|
| 96 |
+
)
|
| 97 |
+
],
|
| 98 |
+
)
|
| 99 |
+
|
| 100 |
+
|
| 101 |
def test_lower_tab_renderers_emit_plain_text_classes():
|
| 102 |
event = _event_with_lower_tab_data()
|
| 103 |
html = "\n".join(
|
|
|
|
| 128 |
assert "Download agent trace" not in source
|
| 129 |
|
| 130 |
|
| 131 |
+
def test_case_dropdown_only_exposes_demo_and_custom_cases():
|
| 132 |
+
assert list(app.CASE_OPTIONS) == ["Trial of Socrates", "Greg Heffley vs Mom", "Custom"]
|
| 133 |
+
assert "The People v. Barnaby Buttons" not in app.CASE_OPTIONS
|
| 134 |
+
assert "Live Search Tribunal" not in app.CASE_OPTIONS
|
| 135 |
+
|
| 136 |
+
|
| 137 |
def test_courtroom_splits_six_jurors_between_side_benches():
|
| 138 |
html = app.render_court([_event_with_lower_tab_data()], started=True)
|
| 139 |
|
|
|
|
| 164 |
|
| 165 |
assert "Marcus Aurelius" in html
|
| 166 |
assert "assets/characters/marcus-aurelius.png" in html
|
| 167 |
+
assert "<img class='puppet-portrait' src='/gradio_api/file=assets/characters/marcus-aurelius.png'" in html
|
| 168 |
+
assert ".puppet.judge::before,\n.puppet.judge::after {\n display: none;\n}" in app.CSS
|
| 169 |
+
assert ".puppet.judge .mouth {\n display: none;\n}" in app.CSS
|
| 170 |
for name, image in app.JUROR_IMAGES.items():
|
| 171 |
assert name in html
|
| 172 |
assert image in html
|
| 173 |
assert html.count("class='juror-portrait'") == 6
|
| 174 |
+
assert "class='juror-face'" not in html
|
| 175 |
+
assert "class='juror-body'" not in html
|
| 176 |
|
| 177 |
|
| 178 |
def test_courtroom_renders_foreground_fences_and_judge_table_above_characters():
|
|
|
|
| 184 |
assert ".foreground-props {\n position: absolute;\n inset: 0;\n z-index: 13;" in app.CSS
|
| 185 |
assert ".puppet {\n --skin: #c99257;" in app.CSS
|
| 186 |
assert "z-index: 8;" in app.CSS
|
| 187 |
+
assert ".puppet.clerk {\n left: 43%;\n top: 66%;\n z-index: 14;" in app.CSS
|
| 188 |
+
|
| 189 |
+
|
| 190 |
+
def test_trial_progress_defaults_to_pretrial_and_renders_all_stages():
|
| 191 |
+
html = app.render_court([])
|
| 192 |
+
|
| 193 |
+
assert "class='trial-progress'" in html
|
| 194 |
+
assert "data-phase='pretrial' aria-current='step'" in html
|
| 195 |
+
for _key, label in app.TRIAL_PROGRESS_STAGES:
|
| 196 |
+
assert label in html
|
| 197 |
+
|
| 198 |
+
|
| 199 |
+
def test_trial_progress_marks_questions_current():
|
| 200 |
+
html = app.render_court([_speaker_event("Mike OSS", phase="questions")], started=True)
|
| 201 |
+
|
| 202 |
+
assert "class='trial-progress-segment current' data-phase='questions' aria-current='step'" in html
|
| 203 |
+
assert "data-phase='evidence'" in html
|
| 204 |
+
|
| 205 |
+
|
| 206 |
+
def test_trial_progress_marks_deliberation_current():
|
| 207 |
+
html = app.render_court([_event_with_lower_tab_data()], started=True)
|
| 208 |
+
|
| 209 |
+
assert "class='trial-progress-segment current' data-phase='deliberation' aria-current='step'" in html
|
| 210 |
+
assert "class='trial-progress-segment complete' data-phase='questions'" in html
|
| 211 |
+
|
| 212 |
+
|
| 213 |
+
def test_trial_progress_marks_verdict_current_and_complete():
|
| 214 |
+
html = app.render_court([_speaker_event(app.JUDGE_NAME, phase="verdict")], started=True)
|
| 215 |
+
|
| 216 |
+
assert "class='trial-progress-segment current complete' data-phase='verdict' aria-current='step'" in html
|
| 217 |
+
assert "class='trial-progress-segment complete' data-phase='deliberation'" in html
|
| 218 |
+
|
| 219 |
+
|
| 220 |
+
def test_verdict_popup_renders_only_when_final_verdict_is_revealed():
|
| 221 |
+
event = _verdict_event("liable")
|
| 222 |
+
|
| 223 |
+
announcement = app.render_court([event], started=True)
|
| 224 |
+
sealed = app.render_court([event], started=True, show_verdict_popup=True)
|
| 225 |
+
|
| 226 |
+
assert "class='speech-bubble active-dialogue speaker-judge'" in announcement
|
| 227 |
+
assert "class='verdict-popup'" not in announcement
|
| 228 |
+
assert "class='speech-bubble active-dialogue speaker-judge'" in sealed
|
| 229 |
+
assert "class='verdict-popup'" in sealed
|
| 230 |
+
assert "data-finding='liable'" in sealed
|
| 231 |
+
assert "Verdict: Guilty" in sealed
|
| 232 |
+
|
| 233 |
+
|
| 234 |
+
def test_run_ui_reveals_verdict_popup_after_judge_speech(monkeypatch):
|
| 235 |
+
event = _verdict_event("not_liable")
|
| 236 |
+
monkeypatch.setattr(app, "get_events", lambda request: iter([event]))
|
| 237 |
+
monkeypatch.setattr(app, "_reading_duration", lambda text: 0)
|
| 238 |
+
|
| 239 |
+
outputs = list(app.run_ui("Trial of Socrates", "", "", "", "swift", True))
|
| 240 |
+
|
| 241 |
+
assert "class='speech-bubble active-dialogue speaker-judge'" in outputs[1][0]
|
| 242 |
+
assert "class='verdict-popup'" not in outputs[1][0]
|
| 243 |
+
assert outputs[-1][-1] == "Verdict sealed."
|
| 244 |
+
assert "class='verdict-popup'" in outputs[-1][0]
|
| 245 |
+
assert "Verdict: Not Guilty" in outputs[-1][0]
|
| 246 |
+
|
| 247 |
+
|
| 248 |
+
def test_trial_progress_ignores_unknown_phase_without_extra_segment():
|
| 249 |
+
html = app.render_court([_speaker_event("Clerk Meridian", phase="appeal")], started=True)
|
| 250 |
+
|
| 251 |
+
assert "class='trial-progress'" in html
|
| 252 |
+
assert html.count("class='trial-progress-segment") == len(app.TRIAL_PROGRESS_STAGES)
|
| 253 |
+
assert "aria-current='step'" not in html
|
| 254 |
+
assert "class='trial-progress-segment' data-phase='appeal'" not in html
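The progress-bar tests above imply a small state rule: earlier stages are `complete`, the matching stage is `current`, the last stage is both at once, and an unknown phase marks nothing. One way that rule could be expressed is sketched below. The stage list, the `<ol>`/`<li>` markup, and the name `render_progress` are assumptions; only the class names and attributes are taken from the assertions.

```python
from html import escape

# Hypothetical stage list; the real app.TRIAL_PROGRESS_STAGES may differ.
STAGES = [
    ("pretrial", "Pretrial"),
    ("evidence", "Evidence"),
    ("questions", "Questions"),
    ("deliberation", "Deliberation"),
    ("verdict", "Verdict"),
]


def render_progress(phase: str = "pretrial") -> str:
    keys = [key for key, _label in STAGES]
    # Unknown phases leave every segment unmarked instead of adding one.
    current = keys.index(phase) if phase in keys else None
    parts = []
    for index, (key, label) in enumerate(STAGES):
        classes = ["trial-progress-segment"]
        is_current = current is not None and index == current
        if is_current:
            classes.append("current")
        # Earlier stages are complete; the final stage completes on arrival.
        if current is not None and (index < current or (is_current and key == keys[-1])):
            classes.append("complete")
        aria = " aria-current='step'" if is_current else ""
        parts.append(
            f"<li class='{' '.join(classes)}' data-phase='{key}'{aria}>{escape(label)}</li>"
        )
    return "<ol class='trial-progress'>" + "".join(parts) + "</ol>"
```

Keeping the segment count fixed at `len(STAGES)` is what lets the unknown-phase test count segments without special cases.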
|
| 255 |
+
|
| 256 |
+
|
| 257 |
+
def test_trial_progress_css_is_fixed_and_translucent_theme_matched():
|
| 258 |
+
assert ".trial-progress {\n position: fixed;\n top: 0;" in app.CSS
|
| 259 |
+
assert "background: rgba(23, 13, 8, .58);" in app.CSS
|
| 260 |
+
assert "backdrop-filter: blur(8px);" in app.CSS
|
| 261 |
+
assert "background: #ffd675;" in app.CSS
|
| 262 |
+
assert ".trial-progress-abbrev {\n display: inline;" in app.CSS
|
| 263 |
|
| 264 |
|
| 265 |
def test_foreground_prop_assets_have_real_transparency():
|
|
|
|
| 275 |
|
| 276 |
|
| 277 |
def test_latest_speaker_sets_stage_class_and_speech_bubble():
|
| 278 |
+
html = app.render_court([_speaker_event("Mike OSS", phase="claims")], started=True)
|
| 279 |
|
| 280 |
assert "speaker-auric" in html
|
| 281 |
+
assert "class='speech-bubble active-dialogue speaker-auric'" in html
|
| 282 |
+
assert "data-speaker='Mike OSS'" in html
|
| 283 |
+
assert "<strong>Mike OSS</strong>" in html
|
| 284 |
+
assert "test speaker" in html
|
| 285 |
+
assert "Mike OSS has the visible floor." in html
|
| 286 |
assert "puppet auric active walking" in html
|
| 287 |
assert "puppet sable active" not in html
|
| 288 |
+
assert html.count("class='speech-bubble") == 1
|
| 289 |
+
assert html.find("class='foreground-props'") < html.find("class='speech-bubble active-dialogue")
|
| 290 |
+
assert ".speech-bubble.active-dialogue,\n.speech-bubble.active-dialogue * {\n color: #141413 !important;\n}" in app.CSS
|
| 291 |
+
assert "border: 2px solid #141413;" in app.CSS
|
| 292 |
+
assert "font-size: 12px;" in app.CSS
|
| 293 |
+
|
| 294 |
+
|
| 295 |
+
def test_speech_bubble_uses_full_turn_content_not_event_body():
|
| 296 |
+
long_text = " ".join(["The record speaks plainly"] * 18) + " with a final visible phrase."
|
| 297 |
+
event = TrialEvent(
|
| 298 |
+
phase="questions",
|
| 299 |
+
title="Counsel answers",
|
| 300 |
+
body="Narration only, not spoken dialogue.",
|
| 301 |
+
turns=[
|
| 302 |
+
AgentTurn(
|
| 303 |
+
agent="Harvey Vector",
|
| 304 |
+
role="respondent advocate",
|
| 305 |
+
content=long_text,
|
| 306 |
+
model="test-model",
|
| 307 |
+
confidence=0.9,
|
| 308 |
+
)
|
| 309 |
+
],
|
| 310 |
+
)
|
| 311 |
+
html = app.render_court([event], started=True)
|
| 312 |
+
bubble = html[html.index("<div class='speech-bubble") : html.index("<div class='gallery-benches")]
|
| 313 |
+
|
| 314 |
+
assert "with a final visible phrase." in bubble
|
| 315 |
+
assert "Narration only" not in bubble
|
| 316 |
+
assert "..." not in bubble
|
| 317 |
+
|
| 318 |
+
|
| 319 |
+
def test_pending_speaker_renders_single_preparing_bubble():
|
| 320 |
+
pending = app.SpeakerCue(
|
| 321 |
+
name="Harvey Vector",
|
| 322 |
+
role="respondent advocate",
|
| 323 |
+
text="Harvey Vector is preparing a response.",
|
| 324 |
+
pending=True,
|
| 325 |
+
)
|
| 326 |
+
html = app.render_court([], started=True, pending_speaker=pending)
|
| 327 |
+
|
| 328 |
+
assert "class='speech-bubble active-dialogue speaker-sable pending'" in html
|
| 329 |
+
assert "data-pending='true'" in html
|
| 330 |
+
assert "Harvey Vector is preparing a response." in html
|
| 331 |
+
assert "puppet sable active walking" in html
|
| 332 |
+
assert html.count("class='speech-bubble") == 1
|
| 333 |
+
|
| 334 |
+
|
| 335 |
+
def test_reading_duration_scales_with_words_and_caps():
|
| 336 |
+
assert app._reading_duration("short line") == app.MIN_READ_SECONDS
|
| 337 |
+
assert app._reading_duration("word " * 18) > app.MIN_READ_SECONDS
|
| 338 |
+
assert app._reading_duration("word " * 200) == app.MAX_READ_SECONDS
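The three assertions above describe a clamp: short lines get a floor, long lines hit a ceiling, and anything between scales with word count. A minimal sketch under assumed constants is shown here. The numeric values and `WORDS_PER_SECOND` are hypothetical; the PR's `app._reading_duration` may compute the duration differently.

```python
# Hypothetical constants; the real app.MIN_READ_SECONDS / MAX_READ_SECONDS are not shown in the diff.
MIN_READ_SECONDS = 2.0
MAX_READ_SECONDS = 12.0
WORDS_PER_SECOND = 3.0


def reading_duration(text: str) -> float:
    """Return how long a speech bubble should stay visible, in seconds."""
    words = len(text.split())
    estimate = words / WORDS_PER_SECOND
    # Clamp the estimate into the [MIN_READ_SECONDS, MAX_READ_SECONDS] window.
    return max(MIN_READ_SECONDS, min(MAX_READ_SECONDS, estimate))
```

This also explains why the UI tests monkeypatch the duration to `0`: a patched pacing function lets the generator be drained without waiting.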
|
| 339 |
|
| 340 |
|
| 341 |
def test_individual_juror_can_be_active_speaker():
|
|
|
|
| 367 |
|
| 368 |
assert "speaker-karl-marx" in html
|
| 369 |
assert "<a class='juror active'" in html
|
| 370 |
+
assert "class='speech-bubble active-dialogue speaker-karl-marx juror-dialogue'" in html
|
| 371 |
assert "Liable. E1 carries the record." in html
|
| 372 |
+
assert html.count("class='speech-bubble") == 1
|
| 373 |
+
|
| 374 |
+
|
| 375 |
+
def test_juror_speech_bubbles_anchor_above_side_benches():
|
| 376 |
+
assert ".speech-bubble.active-dialogue.juror-dialogue {\n top: 42%;" in app.CSS
|
| 377 |
+
assert ".speech-bubble.active-dialogue.speaker-karl-marx,\n.speech-bubble.active-dialogue.speaker-john-stuart-mill,\n.speech-bubble.active-dialogue.speaker-confucius {\n left: 1.5%;" in app.CSS
|
| 378 |
+
assert ".speech-bubble.active-dialogue.speaker-cleopatra-vii,\n.speech-bubble.active-dialogue.speaker-niccolo-machiavelli,\n.speech-bubble.active-dialogue.speaker-jensen-huang {\n right: 1.5%;" in app.CSS
|
| 379 |
+
assert "--bubble-tail-x: 19%;" in app.CSS
|
| 380 |
+
assert "--bubble-tail-x: 81%;" in app.CSS
|
| 381 |
+
assert ".speech-bubble.active-dialogue.juror-dialogue,\n .speech-bubble.active-dialogue.speaker-karl-marx" in app.CSS
|
| 382 |
+
assert "top: 500px;" in app.CSS
|
| 383 |
|
| 384 |
|
| 385 |
def test_lawyer_movement_css_is_speaker_specific_not_phase_wide():
|
|
|
|
| 389 |
assert ".phase-opening .puppet.sable" not in app.CSS
|
| 390 |
|
| 391 |
|
| 392 |
+
def test_closed_book_and_key_characters_align_with_judge_table():
|
| 393 |
+
assert ".episode-book {\n position: absolute;\n left: 50%;\n top: 122px;\n z-index: 14;" in app.CSS
|
| 394 |
+
assert "width: min(980px, calc(100% - 32px));" in app.CSS
|
| 395 |
+
assert ".episode-book.closed {\n top: 50%;\n width: min(163px, 20vw);" in app.CSS
|
| 396 |
+
assert ".foreground-fence {\n bottom: -6.5%;\n width: 47%;" in app.CSS
|
| 397 |
+
assert ".judge-table-foreground {\n left: 50%;\n top: 20%;\n z-index: 1;\n width: 39.1%;" in app.CSS
|
| 398 |
+
assert ".puppet.judge {\n left: 50%;\n top: calc(40% + 156px);" in app.CSS
|
| 399 |
assert ".puppet.auric {\n left: 24%;\n top: 87%;" in app.CSS
|
| 400 |
+
assert ".speaker-auric .puppet.auric {\n left: 43%;\n top: 87%;" in app.CSS
|
| 401 |
+
assert ".puppet.sable {\n left: 75%;\n top: 87%;" in app.CSS
|
| 402 |
+
assert ".speaker-sable .puppet.sable {\n left: 75%;\n top: 87%;" in app.CSS
|
| 403 |
+
assert ".puppet.clerk {\n left: 43%;\n top: 66%;" in app.CSS
|
| 404 |
+
assert ".puppet.auditor" not in app.CSS
|
| 405 |
+
assert ".episode-book.closed {\n top: 640px;\n width: 140px;" in app.CSS
|
| 406 |
+
assert ".episode-book {\n top: 218px;\n width: min(680px, calc(100% - 20px));" in app.CSS
|
| 407 |
+
assert ".foreground-fence {\n bottom: -66px;\n width: 64%;" in app.CSS
|
| 408 |
+
assert ".judge-table-foreground {\n top: 213px;\n width: 646px;" in app.CSS
|
| 409 |
assert ".puppet.auric {\n left: 20%;\n top: 970px;" in app.CSS
|
| 410 |
+
assert ".puppet.sable {\n left: 80%;\n top: 970px;" in app.CSS
|
| 411 |
+
assert ".speaker-sable .puppet.sable {\n left: 80%;\n top: 970px;" in app.CSS
|
| 412 |
+
assert ".puppet.judge {\n top: 576px;" not in app.CSS
|
| 413 |
+
assert ".puppet.sable {\n left: 80%;\n top: 640px;" not in app.CSS
|
| 414 |
+
assert ".speaker-sable .puppet.sable {\n left: 80%;\n top: 640px;" not in app.CSS
|
| 415 |
+
assert ".puppet.clerk {\n left: 35%;\n top: 880px;" in app.CSS
|
| 416 |
+
assert ".speech-bubble.active-dialogue.speaker-auditor" not in app.CSS
|
| 417 |
+
|
| 418 |
+
|
| 419 |
+
def test_open_docket_book_renders_text_above_book_art():
|
| 420 |
+
html = app.render_court([])
|
| 421 |
+
|
| 422 |
+
assert "class='episode-book'" in html
|
| 423 |
+
assert "class='book-open-content'" in html
|
| 424 |
+
assert "Trial details" in html
|
| 425 |
+
assert "Evidence" in html
|
| 426 |
+
|
| 427 |
+
|
| 428 |
+
def test_greg_case_preview_uses_cached_context_and_evidence_columns():
|
| 429 |
+
html = app.render_case_preview("Greg Heffley vs Mom")
|
| 430 |
+
|
| 431 |
+
assert "Greg Heffley v. Mom" in html
|
| 432 |
+
assert "diary" in html
|
| 433 |
+
assert "Evidence for Greg Heffley" in html
|
| 434 |
+
assert "Evidence for Susan Heffley" in html
|
| 435 |
+
|
| 436 |
+
|
| 437 |
+
def test_custom_case_preview_renders_fillable_book_fields():
|
| 438 |
+
html = app.render_case_preview("Custom")
|
| 439 |
+
|
| 440 |
+
assert "episode-book custom-book" in html
|
| 441 |
+
assert "book-context-field" in html
|
| 442 |
+
assert html.count("book-claimant-field") == 3
|
| 443 |
+
assert html.count("book-respondent-field") == 3
|
| 444 |
+
|
| 445 |
+
|
| 446 |
+
def test_custom_payload_builds_trial_request_packet(monkeypatch):
|
| 447 |
+
captured = {}
|
| 448 |
+
|
| 449 |
+
def fake_events(request):
|
| 450 |
+
captured["request"] = request
|
| 451 |
+
return iter([_event_with_lower_tab_data()])
|
| 452 |
+
|
| 453 |
+
monkeypatch.setattr(app, "get_events", fake_events)
|
| 454 |
+
monkeypatch.setattr(app, "_reading_duration", lambda text: 0)
|
| 455 |
+
payload = json.dumps(
|
| 456 |
+
{
|
| 457 |
+
"context": "A missing bicycle is traced to a disputed garage visit.",
|
| 458 |
+
"claimant_evidence": ["Garage text", "", "Scuffed tire mark"],
|
| 459 |
+
"respondent_evidence": ["Neighbor saw bike later", "", ""],
|
| 460 |
+
}
|
| 461 |
+
)
|
| 462 |
+
|
| 463 |
+
outputs = list(app.run_ui("Custom", "", "", payload, "swift", True))
|
| 464 |
+
|
| 465 |
+
assert outputs[-1][-1] == "Verdict sealed."
|
| 466 |
+
request = captured["request"]
|
| 467 |
+
assert request.case_id == "custom"
|
| 468 |
+
assert request.custom_case is not None
|
| 469 |
+
assert request.custom_case.context.startswith("A missing bicycle")
|
| 470 |
+
assert [item.supports for item in request.custom_case.evidence] == ["claimant", "claimant", "respondent"]
|
| 471 |
+
|
| 472 |
+
|
| 473 |
+
def test_custom_payload_requires_context_and_both_evidence_sides():
|
| 474 |
+
payload = json.dumps({"context": "", "claimant_evidence": ["Only one side"], "respondent_evidence": []})
|
| 475 |
+
|
| 476 |
+
outputs = list(app.run_ui("Custom", "", "", payload, "swift", True))
|
| 477 |
+
|
| 478 |
+
assert outputs[-1][-1] == "Custom requires a trial details paragraph."
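The custom-case tests above suggest the payload is validated before any trial request is built: blank evidence slots are dropped, evidence is tagged by side, and an empty context is rejected with the message asserted in the test. A sketch of that validation follows. The `Evidence` dataclass, the name `parse_custom_payload`, and the missing-evidence message are assumptions, not the PR's code.

```python
import json
from dataclasses import dataclass


@dataclass
class Evidence:
    # Hypothetical stand-in for the project's EvidenceItem model.
    text: str
    supports: str


def parse_custom_payload(payload: str) -> tuple[str, list[Evidence]]:
    data = json.loads(payload)
    context = str(data.get("context", "")).strip()
    if not context:
        # Message taken from the test assertion above.
        raise ValueError("Custom requires a trial details paragraph.")
    evidence: list[Evidence] = []
    for side in ("claimant", "respondent"):
        # Blank form fields arrive as empty strings and are skipped.
        items = [str(item).strip() for item in data.get(f"{side}_evidence", [])]
        items = [item for item in items if item]
        if not items:
            # Assumed wording; the diff does not show this message.
            raise ValueError(f"Custom requires {side} evidence.")
        evidence.extend(Evidence(text=item, supports=side) for item in items)
    return context, evidence
```

Checking the context first matches the test, where a payload with both an empty context and one-sided evidence reports only the missing paragraph.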
|
| 479 |
|
| 480 |
|
| 481 |
def test_run_ui_yields_five_outputs_without_download_status(monkeypatch):
|
| 482 |
event = _event_with_lower_tab_data()
|
| 483 |
monkeypatch.setattr(app, "get_events", lambda request: iter([event]))
|
| 484 |
+
monkeypatch.setattr(app, "_reading_duration", lambda text: 0)
|
| 485 |
|
| 486 |
+
outputs = list(app.run_ui("Trial of Socrates", "", "", "", "swift", True))
|
| 487 |
|
| 488 |
assert outputs
|
| 489 |
assert all(len(output) == 5 for output in outputs)
|
| 490 |
+
assert outputs[0][-1] == "Clerk Meridian is preparing their response."
|
| 491 |
+
assert outputs[1][-1] == "Step 1: Nemotron Jury - Jury weighs the record"
|
| 492 |
assert outputs[-1][-1] == "Verdict sealed."
|
| 493 |
assert "download" not in outputs[-1][-1].lower()
|
| 494 |
|
|
|
|
| 500 |
|
| 501 |
monkeypatch.setattr(app, "get_events", broken_events)
|
| 502 |
|
| 503 |
+
outputs = list(app.run_ui("Trial of Socrates", "", "", "", "swift", True))
|
| 504 |
|
| 505 |
assert outputs[-1][-1] == "Model response required. Trial stopped: Marcus Aurelius unavailable: offline"
|
| 506 |
assert "Claimant score" not in outputs[-1][0]
|
| 507 |
|
| 508 |
|
| 509 |
+
def test_remote_events_uses_default_modal_endpoint_without_local_token(monkeypatch):
|
| 510 |
+
captured = {}
|
| 511 |
+
|
| 512 |
+
class FakeResponse:
|
| 513 |
+
def __enter__(self):
|
| 514 |
+
return self
|
| 515 |
+
|
| 516 |
+
def __exit__(self, exc_type, exc, traceback):
|
| 517 |
+
return False
|
| 518 |
+
|
| 519 |
+
def raise_for_status(self):
|
| 520 |
+
return None
|
| 521 |
+
|
| 522 |
+
def iter_lines(self):
|
| 523 |
+
event = _speaker_event("Clerk Meridian", phase="intake")
|
| 524 |
+
yield json.dumps(event.model_dump())
|
| 525 |
+
|
| 526 |
+
def fake_stream(method, endpoint, json, timeout):
|
| 527 |
+
captured["method"] = method
|
| 528 |
+
captured["endpoint"] = endpoint
|
| 529 |
+
captured["payload"] = json
|
| 530 |
+
captured["timeout"] = timeout
|
| 531 |
+
return FakeResponse()
|
| 532 |
+
|
| 533 |
+
monkeypatch.delenv("MODAL_TRIAL_URL", raising=False)
|
| 534 |
+
monkeypatch.delenv("HF_TOKEN", raising=False)
|
| 535 |
+
monkeypatch.setattr(app.httpx, "stream", fake_stream)
|
| 536 |
+
|
| 537 |
+
event = next(app.get_events(app.TrialRequest(case_id="socrates"), delay=0.0))
|
| 538 |
+
|
| 539 |
+
assert captured["method"] == "POST"
|
| 540 |
+
assert captured["endpoint"] == app.DEFAULT_MODAL_TRIAL_URL
|
| 541 |
+
assert captured["timeout"] == 900.0
|
| 542 |
+
assert event.turns[0].agent == "Clerk Meridian"
|
| 543 |
+
|
| 544 |
+
|
| 545 |
def test_court_renders_sound_toggle():
|
| 546 |
html = app.render_court([])
|
| 547 |
|