GitHub Actions committed on
Commit ·
da0bed0
Parent(s): caad2c6
Sync from GitHub
- hf-space/.gitignore +1 -0
- hf-space/Dockerfile +0 -1
- hf-space/README.md +57 -78
- hf-space/hf-space/app.py +890 -750
- hf-space/hf-space/hf-space/app.py +24 -2
- hf-space/hf-space/hf-space/hf-space/README.md +9 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/.env.example +7 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/.github/workflows/sync-to-hf.yml +35 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/.gitignore +7 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/Dockerfile +11 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/README.md +84 -10
- hf-space/hf-space/hf-space/hf-space/hf-space/app.py +853 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/hf-space/.gitattributes +35 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/hf-space/README.md +10 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/requirements.txt +5 -0
- hf-space/hf-space/hf-space/hf-space/hf-space/scripts/seed.py +28 -0
- hf-space/requirements.txt +2 -2
hf-space/.gitignore
CHANGED
|
@@ -2,6 +2,7 @@ __pycache__/
|
|
| 2 |
*.pyc
|
| 3 |
.streamlit/
|
| 4 |
data/
|
|
|
|
| 5 |
exports/
|
| 6 |
.env
|
| 7 |
.DS_Store
|
|
|
|
| 2 |
*.pyc
|
| 3 |
.streamlit/
|
| 4 |
data/
|
| 5 |
+
drafts/
|
| 6 |
exports/
|
| 7 |
.env
|
| 8 |
.DS_Store
|
hf-space/Dockerfile
CHANGED
|
@@ -1,7 +1,6 @@
|
|
| 1 |
FROM python:3.11-slim
|
| 2 |
|
| 3 |
WORKDIR /app
|
| 4 |
-
|
| 5 |
COPY . /app
|
| 6 |
|
| 7 |
RUN pip install --no-cache-dir -r requirements.txt
|
|
|
|
| 1 |
FROM python:3.11-slim
|
| 2 |
|
| 3 |
WORKDIR /app
|
|
|
|
| 4 |
COPY . /app
|
| 5 |
|
| 6 |
RUN pip install --no-cache-dir -r requirements.txt
|
hf-space/README.md
CHANGED
|
@@ -1,93 +1,72 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
|
| 15 |
-
|
| 16 |
-
-
|
| 17 |
-
-
|
| 18 |
-
-
|
| 19 |
-
-
|
| 20 |
-
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
``
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
That design avoids write conflicts between annotators because each submission is a new file, not an overwrite of a shared database row. Repository files on the Hub are versioned, and the Hub supports uploading files to dataset repositories.
|
| 41 |
-
|
| 42 |
-
## Local run
|
| 43 |
|
| 44 |
```bash
|
| 45 |
pip install -r requirements.txt
|
| 46 |
streamlit run app.py
|
| 47 |
```
|
| 48 |
|
| 49 |
-
##
|
| 50 |
-
|
| 51 |
-
### 1. Create two dataset repositories
|
| 52 |
-
|
| 53 |
-
Create:
|
| 54 |
-
- one dataset repo for the **source / seed data**
|
| 55 |
-
- one dataset repo for the **annotations**
|
| 56 |
-
|
| 57 |
-
Hugging Face dataset repositories are created from the Hub UI, and dataset files plus revision history are stored in the repository.
|
| 58 |
-
|
| 59 |
-
### 2. Create a Space
|
| 60 |
-
|
| 61 |
-
Create a **Streamlit** Space and connect it to your GitHub repository. Spaces host apps directly on the Hub and support Streamlit as a built-in SDK.
|
| 62 |
-
|
| 63 |
-
### 3. Attach a Storage Bucket
|
| 64 |
-
|
| 65 |
-
Attach a Storage Bucket to the Space and mount it at `/data`.
|
| 66 |
|
| 67 |
-
|
| 68 |
|
| 69 |
-
### 4. Add secrets
|
| 70 |
-
|
| 71 |
-
In the Space settings, add:
|
| 72 |
-
- `HF_TOKEN` — a Hugging Face token with **write** permission
|
| 73 |
- `SOURCE_DATASET_REPO`
|
| 74 |
-
- `
|
|
|
|
| 75 |
- `ANNOTATION_REPO_ID`
|
|
|
|
| 76 |
|
| 77 |
-
|
| 78 |
-
|
| 79 |
-
|
| 80 |
|
| 81 |
-
|
| 82 |
|
| 83 |
-
|
| 84 |
|
| 85 |
-
|
| 86 |
-
- each submission creates a new JSON file in the annotation repo
|
| 87 |
-
- the Review page shows items with 2+ annotations
|
| 88 |
-
- the Dashboard shows per-annotator and per-domain progress
|
| 89 |
-
- exports are generated from the merged source + annotation view
|
| 90 |
|
| 91 |
-
|
| 92 |
-
|
| 93 |
-
|
| 1 |
+
# LLM Annotation Platform
|
| 2 |
+
|
| 3 |
+
A simple Streamlit app for collaborative editing of a human-made distractor dataset.
|
| 4 |
+
|
| 5 |
+
## What it supports
|
| 6 |
+
|
| 7 |
+
- browse source data from a Hugging Face dataset repo or a local JSON/JSONL file
|
| 8 |
+
- load a row by index into an editor
|
| 9 |
+
- create a new blank entry
|
| 10 |
+
- edit:
|
| 11 |
+
- `domain`
|
| 12 |
+
- `scenario`
|
| 13 |
+
- `system_instruction`
|
| 14 |
+
- `conversation`
|
| 15 |
+
- `distractors`
|
| 16 |
+
- `distractors_multiturn`
|
| 17 |
+
- `conversation_with_distractors`
|
| 18 |
+
- mark the entry with a `split` value (`train` / `test`)
|
| 19 |
+
- save drafts in the HF Space bucket path (`/data/drafts`)
|
| 20 |
+
- submit each finished entry as a separate JSON file to a Hugging Face dataset repo
|
| 21 |
+
- optionally ask a local OpenAI-compatible LLM server such as LM Studio to draft one distractor at a time
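The optional LLM drafting step in the list above could look roughly like the sketch below. It is an illustration, not the code in `app.py`: the helper names `build_distractor_messages` and `draft_distractor`, the prompt wording, and the placeholder API key are all assumptions. Only the guarded `openai` import mirrors what the commit adds.

```python
from typing import Dict, List

try:
    from openai import OpenAI  # optional dependency, same guard style as app.py
except Exception:
    OpenAI = None


def build_distractor_messages(
    system_instruction: str, conversation: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Build a chat prompt asking for exactly one off-topic user turn (hypothetical wording)."""
    transcript = "\n".join(f"{t['role']}: {t['content']}" for t in conversation)
    prompt = (
        "System instruction given to the assistant:\n"
        f"{system_instruction}\n\n"
        "Conversation so far:\n"
        f"{transcript}\n\n"
        "Write one user message that tries to pull the assistant off topic."
    )
    return [
        {"role": "system", "content": "You draft distractor turns for a dataset."},
        {"role": "user", "content": prompt},
    ]


def draft_distractor(
    base_url: str,
    model: str,
    system_instruction: str,
    conversation: List[Dict[str, str]],
) -> str:
    """Ask an OpenAI-compatible server (for example LM Studio) for one draft."""
    if OpenAI is None:
        raise RuntimeError("The optional 'openai' package is not installed.")
    # Local servers usually ignore the key, but the client requires a value.
    client = OpenAI(base_url=base_url, api_key="not-needed")
    response = client.chat.completions.create(
        model=model,
        messages=build_distractor_messages(system_instruction, conversation),
    )
    return (response.choices[0].message.content or "").strip()
```

Keeping the prompt construction separate from the network call makes the prompt easy to inspect without a running server, and the draft still needs human review before it is saved.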
|
| 22 |
+
|
| 23 |
+
## Output shape
|
| 24 |
+
|
| 25 |
+
The app keeps the source structure and adds provenance fields:
|
| 26 |
+
|
| 27 |
+
- `split`
|
| 28 |
+
- `_review_status`
|
| 29 |
+
- `_needs_human_review`
|
| 30 |
+
- `_annotator`
|
| 31 |
+
- `_source_repo`
|
| 32 |
+
- `_source_split`
|
| 33 |
+
- `_source_index`
|
| 34 |
+
- `_created_at`
|
| 35 |
+
- `_updated_at`
|
| 36 |
+
|
| 37 |
+
That means the final file can still be merged into one dataset later.
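As a rough illustration of that later merge, the sketch below collects per-entry JSON files from a local folder and keeps the most recently updated version of each source row. The function name `merge_entries` and the flat folder layout are assumptions, as is the idea that `_updated_at` holds ISO 8601 UTC timestamps that sort correctly as strings.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Tuple


def merge_entries(entry_dir: Path) -> List[Dict[str, Any]]:
    """Merge per-entry JSON files, keeping the newest edit of each source row."""
    latest: Dict[Tuple[Any, Any, Any], Dict[str, Any]] = {}
    new_entries: List[Dict[str, Any]] = []
    for path in sorted(entry_dir.glob("*.json")):
        with path.open("r", encoding="utf-8") as f:
            entry = json.load(f)
        if entry.get("_source_index") is None:
            # Entries created from scratch have no source row to deduplicate on.
            new_entries.append(entry)
            continue
        key = (
            entry.get("_source_repo"),
            entry.get("_source_split"),
            entry.get("_source_index"),
        )
        previous = latest.get(key)
        # Assumes ISO 8601 UTC strings, which compare chronologically as text.
        if previous is None or entry.get("_updated_at", "") >= previous.get("_updated_at", ""):
            latest[key] = entry
    return list(latest.values()) + new_entries
```

The provenance fields do the work here: `_source_repo`, `_source_split`, and `_source_index` identify the row, and `_updated_at` decides which edit wins.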
|
| 38 |
+
|
| 39 |
+
## Run locally
|
|
|
|
|
|
|
|
|
|
| 40 |
|
| 41 |
```bash
|
| 42 |
pip install -r requirements.txt
|
| 43 |
streamlit run app.py
|
| 44 |
```
|
| 45 |
|
| 46 |
+
## Environment variables
|
| 47 |
|
| 48 |
+
Set these in your GitHub repo / HF Space:
|
| 49 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 50 |
- `SOURCE_DATASET_REPO`
|
| 51 |
+
- `SOURCE_DATASET_SPLITS`
|
| 52 |
+
Example: `train,test`
|
| 53 |
- `ANNOTATION_REPO_ID`
|
| 54 |
+
- `HF_TOKEN`
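A minimal sketch of reading these settings follows. The variable names come from the list above; the helper `parse_splits` is a hypothetical name used only for illustration, and the fallback value `train,test` is taken from the example.

```python
import os
from typing import List


def parse_splits(raw: str) -> List[str]:
    """Turn a comma-separated value such as "train,test" into a clean list."""
    return [part.strip() for part in raw.split(",") if part.strip()]


# Read the configured splits, falling back to the documented example value.
splits = parse_splits(os.environ.get("SOURCE_DATASET_SPLITS", "train,test"))
# An empty token means uploads cannot be authenticated.
hf_token = os.environ.get("HF_TOKEN", "")
```

Stripping whitespace and dropping empty parts keeps values like `train, test,` from producing blank split names.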
|
| 55 |
|
| 56 |
+
Optional local LLM settings:
|
| 57 |
+
- `LLM_BASE_URL` is entered in the sidebar inside the app
|
| 58 |
+
- `LLM_MODEL` is entered in the sidebar inside the app
|
| 59 |
|
| 60 |
+
## HF Space setup
|
| 61 |
|
| 62 |
+
Use a Docker Space, mount persistent storage at `/data`, and set the environment variables above. The app stores drafts and submission logs in the bucket path.
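The draft storage described above might be handled along these lines. This is a sketch under assumptions: the helper names `drafts_dir` and `save_draft`, the one-file-per-annotator layout, and the fallback to a relative `data/` folder when `/data` is not writable are illustrative choices, not taken from `app.py`.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict


def drafts_dir(root: str = "/data") -> Path:
    """Return the drafts folder, falling back to ./data when the mount is missing."""
    base = Path(root)
    if not (base.is_dir() and os.access(base, os.W_OK)):
        base = Path("data")  # assumed local fallback for runs outside the Space
    path = base / "drafts"
    path.mkdir(parents=True, exist_ok=True)
    return path


def save_draft(annotator: str, payload: Dict[str, Any], root: str = "/data") -> Path:
    """Write one draft file per annotator, overwriting the previous draft."""
    safe = "".join(
        ch if ch.isalnum() or ch in "-_." else "_" for ch in annotator.strip()
    ) or "annotator"
    path = drafts_dir(root) / f"{safe}.json"
    with path.open("w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False, indent=2)
    return path
```

The fallback lets the same code run locally with `streamlit run app.py`, where no `/data` mount exists.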
|
| 63 |
|
| 64 |
+
## GitHub structure
|
|
|
|
|
|
|
|
|
|
|
|
|
| 65 |
|
| 66 |
+
```text
|
| 67 |
+
app.py
|
| 68 |
+
requirements.txt
|
| 69 |
+
README.md
|
| 70 |
+
Dockerfile
|
| 71 |
+
.streamlit/config.toml
|
| 72 |
+
```
|
hf-space/hf-space/app.py
CHANGED
|
@@ -3,6 +3,7 @@ from __future__ import annotations
|
|
| 3 |
import json
|
| 4 |
import os
|
| 5 |
import uuid
|
|
|
|
| 6 |
from datetime import datetime, timezone
|
| 7 |
from pathlib import Path
|
| 8 |
from typing import Any, Dict, List, Optional, Tuple
|
|
@@ -12,861 +13,1000 @@ import streamlit as st
|
|
| 12 |
from datasets import load_dataset
|
| 13 |
from huggingface_hub import HfApi, hf_hub_download
|
| 14 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 15 |
APP_TITLE = "🧭 LLM Annotation Platform"
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
)
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
|
| 24 |
-
)
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
"
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
|
| 44 |
-
"
|
| 45 |
-
|
| 46 |
-
|
| 47 |
-
|
| 48 |
-
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
"transition_style": [
|
| 53 |
-
"abrupt",
|
| 54 |
-
"smooth bridge",
|
| 55 |
-
"follow-up clarification",
|
| 56 |
-
"rephrasing",
|
| 57 |
-
"escalation",
|
| 58 |
-
"roleplay",
|
| 59 |
-
"ambiguity exploitation",
|
| 60 |
-
"other",
|
| 61 |
-
],
|
| 62 |
-
"policy_target": [
|
| 63 |
-
"medical advice",
|
| 64 |
-
"financial advice",
|
| 65 |
-
"legal advice",
|
| 66 |
-
"competitor discussion",
|
| 67 |
-
"politics",
|
| 68 |
-
"unsafe content",
|
| 69 |
-
"personal data",
|
| 70 |
-
"company-specific policy",
|
| 71 |
-
"tone / style policy",
|
| 72 |
-
"other",
|
| 73 |
-
],
|
| 74 |
}
|
| 75 |
|
|
|
|
|
|
|
|
|
|
| 76 |
|
| 77 |
def now_iso() -> str:
|
| 78 |
-
return datetime.now(timezone.utc).isoformat()
|
| 79 |
|
| 80 |
|
| 81 |
-
def
|
| 82 |
-
|
|
|
|
|
|
|
|
|
|
| 83 |
|
| 84 |
|
| 85 |
-
def
|
| 86 |
-
|
|
|
|
|
|
|
|
|
|
| 87 |
|
| 88 |
|
| 89 |
-
def
|
| 90 |
-
|
| 91 |
-
safe_item = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in item_id.strip()) or "item"
|
| 92 |
-
stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
|
| 93 |
-
return f"annotations/{safe_annotator}/{stamp}_{safe_item}_{uuid.uuid4().hex[:8]}.json"
|
| 94 |
|
| 95 |
|
| 96 |
-
def
|
| 97 |
-
|
| 98 |
-
|
|
|
|
|
|
|
|
|
|
| 99 |
|
| 100 |
|
| 101 |
-
def
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
|
|
|
|
| 105 |
|
| 106 |
|
| 107 |
-
def
|
| 108 |
-
if
|
| 109 |
-
return
|
| 110 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 111 |
|
| 112 |
|
| 113 |
-
def
|
| 114 |
-
|
| 115 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 116 |
|
| 117 |
|
| 118 |
-
def
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
|
| 124 |
-
except Exception:
|
| 125 |
-
return []
|
| 126 |
-
if not isinstance(turns, list):
|
| 127 |
-
return []
|
| 128 |
out = []
|
| 129 |
-
for
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
else:
|
| 135 |
-
out.append({"role": "unknown", "content": str(turn)})
|
| 136 |
return out
|
| 137 |
|
| 138 |
|
| 139 |
-
def
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
|
| 144 |
-
|
| 145 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 146 |
|
| 147 |
|
| 148 |
-
def
|
| 149 |
-
|
| 150 |
-
|
| 151 |
-
|
| 152 |
-
|
| 153 |
-
|
| 154 |
-
|
| 155 |
-
|
| 156 |
-
|
| 157 |
-
|
| 158 |
-
|
| 159 |
-
|
| 160 |
-
|
| 161 |
-
|
| 162 |
-
|
| 163 |
-
|
| 164 |
-
|
| 165 |
-
|
| 166 |
-
|
| 167 |
-
|
| 168 |
-
|
| 169 |
}
|
|
|
|
| 170 |
|
| 171 |
-
items = []
|
| 172 |
-
for distractor_index, d in enumerate(distractors):
|
| 173 |
-
bot_turn = ""
|
| 174 |
-
distractor_text = ""
|
| 175 |
-
if isinstance(d, dict):
|
| 176 |
-
bot_turn = str(
|
| 177 |
-
d.get("bot turn")
|
| 178 |
-
or d.get("bot_turn")
|
| 179 |
-
or d.get("assistant_turn")
|
| 180 |
-
or d.get("assistant")
|
| 181 |
-
or ""
|
| 182 |
-
)
|
| 183 |
-
distractor_text = str(
|
| 184 |
-
d.get("distractor")
|
| 185 |
-
or d.get("distractor user turn")
|
| 186 |
-
or d.get("user_turn")
|
| 187 |
-
or d.get("user")
|
| 188 |
-
or d.get("text")
|
| 189 |
-
or ""
|
| 190 |
-
)
|
| 191 |
-
else:
|
| 192 |
-
distractor_text = str(d)
|
| 193 |
|
| 194 |
-
|
| 195 |
-
|
| 196 |
-
|
| 197 |
-
|
| 198 |
-
"distractor_index": distractor_index,
|
| 199 |
-
"bot_turn": bot_turn,
|
| 200 |
-
"distractor_text": distractor_text,
|
| 201 |
-
}
|
| 202 |
-
)
|
| 203 |
-
return sample, items
|
| 204 |
|
| 205 |
|
| 206 |
-
|
| 207 |
-
|
| 208 |
-
|
| 209 |
-
|
| 210 |
-
|
| 211 |
-
|
| 212 |
-
|
| 213 |
-
return
|
| 214 |
|
| 215 |
|
| 216 |
-
|
| 217 |
with path.open("r", encoding="utf-8") as f:
|
| 218 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 219 |
|
| 220 |
|
| 221 |
-
def
|
| 222 |
-
|
| 223 |
-
|
| 224 |
-
|
| 225 |
-
|
| 226 |
-
|
|
|
|
|
|
|
| 227 |
|
| 228 |
-
cache_dir = cache_annotations_dir()
|
| 229 |
-
file_list = api().list_repo_files(annotation_repo_id, repo_type="dataset")
|
| 230 |
-
ann_files = [f for f in file_list if f.startswith("annotations/") and f.endswith(".json")]
|
| 231 |
|
| 232 |
-
|
| 233 |
-
|
| 234 |
-
|
| 235 |
-
|
| 236 |
-
|
| 237 |
-
|
| 238 |
-
|
| 239 |
-
|
| 240 |
-
|
| 241 |
-
|
| 242 |
-
|
| 243 |
-
|
| 244 |
-
|
| 245 |
-
|
| 246 |
-
|
| 247 |
-
|
| 248 |
-
|
| 249 |
-
|
| 250 |
-
|
| 251 |
-
|
| 252 |
-
|
| 253 |
-
|
| 254 |
-
|
| 255 |
-
|
| 256 |
-
|
| 257 |
-
|
| 258 |
-
|
| 259 |
-
|
| 260 |
-
|
| 261 |
-
|
| 262 |
-
|
| 263 |
-
|
| 264 |
-
|
| 265 |
-
|
| 266 |
-
|
| 267 |
-
|
| 268 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 269 |
|
| 270 |
-
return pd.DataFrame(rows) if rows else pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
|
| 271 |
|
|
|
|
|
|
|
|
|
|
| 272 |
|
| 273 |
-
def
|
| 274 |
-
|
| 275 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 276 |
with path.open("w", encoding="utf-8") as f:
|
| 277 |
json.dump(payload, f, ensure_ascii=False, indent=2)
|
| 278 |
return path
|
| 279 |
|
| 280 |
|
| 281 |
-
def
|
| 282 |
-
path =
|
| 283 |
if not path.exists():
|
| 284 |
return {}
|
| 285 |
try:
|
| 286 |
-
|
|
|
|
| 287 |
except Exception:
|
| 288 |
return {}
|
| 289 |
|
| 290 |
|
| 291 |
-
def
|
| 292 |
-
|
| 293 |
-
|
| 294 |
-
|
| 295 |
-
"policy_target": st.session_state.get(f"{prefix}policy_target", []),
|
| 296 |
-
"difficulty": int(st.session_state.get(f"{prefix}difficulty", 3)),
|
| 297 |
-
"realism": int(st.session_state.get(f"{prefix}realism", 3)),
|
| 298 |
-
"assistant_behavior": st.session_state.get(f"{prefix}assistant_behavior", LABEL_OPTIONS["assistant_behavior"][0]),
|
| 299 |
-
"multi_turn_escalation": bool(st.session_state.get(f"{prefix}multi_turn_escalation", False)),
|
| 300 |
-
"rule_followed": bool(st.session_state.get(f"{prefix}rule_followed", True)),
|
| 301 |
-
"needs_review": bool(st.session_state.get(f"{prefix}needs_review", False)),
|
| 302 |
-
"confidence": int(st.session_state.get(f"{prefix}confidence", 3)),
|
| 303 |
-
}
|
| 304 |
-
|
| 305 |
|
| 306 |
-
def preview_text(text: str, limit: int = 280) -> str:
|
| 307 |
-
txt = (text or "").strip().replace("\n", " ")
|
| 308 |
-
if len(txt) <= limit:
|
| 309 |
-
return txt
|
| 310 |
-
return txt[:limit - 1] + "…"
|
| 311 |
|
|
|
|
|
|
|
|
|
|
| 312 |
|
| 313 |
-
def
|
| 314 |
if not turns:
|
| 315 |
-
|
| 316 |
-
|
| 317 |
-
|
| 318 |
-
|
| 319 |
-
|
| 320 |
-
|
| 321 |
-
|
| 322 |
-
|
| 323 |
-
|
| 324 |
-
|
| 325 |
-
|
| 326 |
-
|
| 327 |
-
|
| 328 |
-
|
| 329 |
-
|
| 330 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 331 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 332 |
|
| 333 |
-
def annotation_exists_for_item(df_anns: pd.DataFrame, item_id: str, annotator: str) -> bool:
|
| 334 |
-
if df_anns.empty:
|
| 335 |
-
return False
|
| 336 |
-
sub = df_anns[(df_anns["item_id"] == item_id) & (df_anns["annotator"] == annotator)]
|
| 337 |
-
return not sub.empty
|
| 338 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 339 |
|
| 340 |
-
def compute_agreement(df_anns: pd.DataFrame, label_key: str = "assistant_behavior") -> Dict[str, Any]:
|
| 341 |
-
if df_anns.empty:
|
| 342 |
-
return {"paired_items": 0, "raw_agreement": None, "cohen_kappa": None}
|
| 343 |
|
| 344 |
-
|
| 345 |
-
|
| 346 |
-
|
| 347 |
-
|
| 348 |
-
|
| 349 |
-
|
| 350 |
-
|
| 351 |
-
|
| 352 |
-
|
| 353 |
-
|
| 354 |
-
|
| 355 |
-
|
| 356 |
-
|
| 357 |
-
|
| 358 |
return {
|
| 359 |
-
"
|
| 360 |
-
"
|
| 361 |
-
"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 362 |
}
|
| 363 |
|
| 364 |
|
| 365 |
-
def
|
| 366 |
-
|
| 367 |
-
file_rel_path = annotation_file_name(payload["item_id"], payload["annotator"])
|
| 368 |
-
local_path = LOCAL_DRAFT_DIR / file_rel_path.replace("/", "__")
|
| 369 |
-
local_path.parent.mkdir(parents=True, exist_ok=True)
|
| 370 |
-
with local_path.open("w", encoding="utf-8") as f:
|
| 371 |
-
json.dump(payload, f, ensure_ascii=False, indent=2)
|
| 372 |
|
| 373 |
-
api().upload_file(
|
| 374 |
-
path_or_fileobj=str(local_path),
|
| 375 |
-
path_in_repo=file_rel_path,
|
| 376 |
-
repo_id=annotation_repo_id,
|
| 377 |
-
repo_type="dataset",
|
| 378 |
-
token=token(),
|
| 379 |
-
commit_message=f"Add annotation for {payload['item_id']} by {payload['annotator']}",
|
| 380 |
-
)
|
| 381 |
-
return file_rel_path
|
| 382 |
|
| 383 |
|
| 384 |
-
|
| 385 |
-
|
| 386 |
|
| 387 |
|
| 388 |
-
def set_current_item_id(item_id: Optional[str]) -> None:
|
| 389 |
-
st.session_state["current_item_id"] = item_id
|
| 390 |
try:
|
| 391 |
-
|
| 392 |
-
|
| 393 |
-
|
| 394 |
|
| 395 |
|
| 396 |
-
|
| 397 |
-
|
| 398 |
-
|
| 399 |
-
"""
|
| 400 |
-
<style>
|
| 401 |
-
.block-container {padding-top: 1rem; padding-bottom: 2rem;}
|
| 402 |
-
.smallmono {font-size: 0.84rem; font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", monospace;}
|
| 403 |
-
.cardbox {
|
| 404 |
-
border: 1px solid rgba(120,120,120,0.22);
|
| 405 |
-
border-radius: 18px;
|
| 406 |
-
padding: 1rem 1rem 0.75rem 1rem;
|
| 407 |
-
background: rgba(255,255,255,0.03);
|
| 408 |
-
}
|
| 409 |
-
.turn {
|
| 410 |
-
border-left: 4px solid rgba(120,120,120,0.45);
|
| 411 |
-
padding: 0.6rem 0.85rem;
|
| 412 |
-
margin: 0.55rem 0;
|
| 413 |
-
border-radius: 0.6rem;
|
| 414 |
-
background: rgba(128,128,128,0.06);
|
| 415 |
-
}
|
| 416 |
-
.turn.user {border-left-color: #8b5cf6;}
|
| 417 |
-
.turn.assistant, .turn.bot {border-left-color: #06b6d4;}
|
| 418 |
-
.turn.system {border-left-color: #f59e0b;}
|
| 419 |
-
.badge {
|
| 420 |
-
display:inline-block; padding:0.18rem 0.5rem; border-radius: 999px;
|
| 421 |
-
background: rgba(120,120,120,0.16); margin-right: 0.35rem; font-size: 0.78rem;
|
| 422 |
-
}
|
| 423 |
-
hr {margin: 0.7rem 0 0.9rem 0;}
|
| 424 |
-
</style>
|
| 425 |
-
""",
|
| 426 |
-
unsafe_allow_html=True,
|
| 427 |
-
)
|
| 428 |
|
| 429 |
-
|
| 430 |
-
|
| 431 |
|
| 432 |
-
|
| 433 |
-
|
| 434 |
-
|
| 435 |
-
|
| 436 |
if "source_records" not in st.session_state:
|
| 437 |
st.session_state["source_records"] = None
|
| 438 |
-
if "source_index" not in st.session_state:
|
| 439 |
-
st.session_state["source_index"] = None
|
| 440 |
-
if "annotations_df" not in st.session_state:
|
| 441 |
-
st.session_state["annotations_df"] = None
|
| 442 |
-
if "draft_loaded" not in st.session_state:
|
| 443 |
-
st.session_state["draft_loaded"] = False
|
| 444 |
-
|
| 445 |
-
with st.sidebar:
|
| 446 |
-
st.header("Workspace")
|
| 447 |
-
annotator = st.text_input("Annotator name", value=st.session_state["annotator"])
|
| 448 |
-
st.session_state["annotator"] = annotator.strip() or "annotator_1"
|
| 449 |
-
|
| 450 |
-
source_repo = st.text_input("Source dataset repo", value=DEFAULT_SOURCE_DATASET)
|
| 451 |
-
source_split = st.text_input("Source split", value=DEFAULT_SOURCE_SPLIT)
|
| 452 |
-
annotation_repo = st.text_input("Annotation dataset repo", value=DEFAULT_ANNOTATION_REPO)
|
| 453 |
-
|
| 454 |
-
st.divider()
|
| 455 |
-
st.caption("HF token is needed only for upload / repo creation.")
|
| 456 |
-
st.write("HF token present:", "yes" if token() else "no")
|
| 457 |
-
st.write("Cache:", str(DEFAULT_CACHE_DIR))
|
| 458 |
-
st.write("Drafts:", str(LOCAL_DRAFT_DIR))
|
| 459 |
-
|
| 460 |
-
if st.button("Reload Hub data", use_container_width=True):
|
| 461 |
-
st.session_state["source_records"] = None
|
| 462 |
-
st.session_state["source_index"] = None
|
| 463 |
-
st.session_state["annotations_df"] = None
|
| 464 |
-
st.rerun()
|
| 465 |
-
|
| 466 |
-
page = st.radio("Page", ["Annotate", "Review", "Dashboard", "Export"], index=0)
|
| 467 |
|
| 468 |
if st.session_state["source_records"] is None:
|
| 469 |
-
with st.spinner("Loading source
|
| 470 |
-
source_records = load_source_dataset(source_repo, source_split)
|
| 471 |
-
samples_df, items_df = seed_source_index(source_records)
|
| 472 |
-
st.session_state["source_records"] = source_records
|
| 473 |
-
st.session_state["source_index"] = {"samples_df": samples_df, "items_df": items_df}
|
| 474 |
-
|
| 475 |
-
if st.session_state["annotations_df"] is None:
|
| 476 |
-
with st.spinner("Loading annotations from the annotation dataset repo..."):
|
| 477 |
try:
|
| 478 |
-
|
| 479 |
except Exception as e:
|
| 480 |
-
|
| 481 |
-
st.
|
| 482 |
-
|
| 483 |
-
|
| 484 |
-
|
| 485 |
-
|
| 486 |
-
|
| 487 |
-
|
| 488 |
-
|
| 489 |
-
|
| 490 |
-
|
| 491 |
-
|
| 492 |
-
|
| 493 |
-
|
| 494 |
-
|
| 495 |
-
|
| 496 |
-
|
| 497 |
-
|
| 498 |
-
|
| 499 |
-
|
| 500 |
-
|
| 501 |
-
|
| 502 |
-
|
| 503 |
-
|
| 504 |
-
|
| 505 |
-
|
| 506 |
-
|
| 507 |
-
|
| 508 |
-
|
| 509 |
-
|
| 510 |
-
|
| 511 |
-
|
| 512 |
-
|
| 513 |
-
|
| 514 |
-
|
| 515 |
-
|
| 516 |
-
|
| 517 |
-
|
| 518 |
-
|
| 519 |
-
|
| 520 |
-
|
| 521 |
-
|
| 522 |
left, right = st.columns([1.05, 0.95], gap="large")
|
| 523 |
|
| 524 |
with left:
|
| 525 |
-
|
| 526 |
-
with
|
| 527 |
-
if st.button("
|
| 528 |
-
|
| 529 |
-
|
| 530 |
-
|
| 531 |
-
|
| 532 |
-
set_current_item_id(q.iloc[0]["item_id"])
|
| 533 |
-
st.rerun()
|
| 534 |
-
with top_b:
|
| 535 |
-
if st.button("Reload annotations from Hub", use_container_width=True):
|
| 536 |
-
st.session_state["annotations_df"] = load_all_hub_annotations(annotation_repo)
|
| 537 |
st.rerun()
|
| 538 |
-
with
|
| 539 |
-
if st.button("
|
| 540 |
-
|
| 541 |
st.rerun()
|
| 542 |
|
| 543 |
-
|
| 544 |
-
|
| 545 |
-
|
| 546 |
-
|
| 547 |
-
|
| 548 |
-
|
| 549 |
-
|
| 550 |
-
st.write("Dataset columns:", list(q.columns))
|
| 551 |
-
|
| 552 |
-
if not q.empty:
|
| 553 |
-
|
| 554 |
-
# Only use columns that actually exist
|
| 555 |
-
available_cols = [
|
| 556 |
-
c for c in [
|
| 557 |
-
"item_id",
|
| 558 |
-
"sample_id",
|
| 559 |
-
"domain",
|
| 560 |
-
"scenario",
|
| 561 |
-
"distractor_index"
|
| 562 |
-
]
|
| 563 |
-
if c in q.columns
|
| 564 |
-
]
|
| 565 |
-
|
| 566 |
-
display = q[available_cols].copy()
|
| 567 |
-
|
| 568 |
-
if "distractor_text" in q.columns:
|
| 569 |
-
display["preview"] = q["distractor_text"].map(preview_text)
|
| 570 |
-
|
| 571 |
-
st.dataframe(display, use_container_width=True, hide_index=True)
|
| 572 |
-
|
| 573 |
-
return
|
| 574 |
-
|
| 575 |
-
st.markdown(
|
| 576 |
-
f"""
|
| 577 |
-
<div class="cardbox">
|
| 578 |
-
<div><span class="badge">Domain</span> {item.get("domain", "")}</div>
|
| 579 |
-
<div style="margin-top:0.35rem;"><span class="badge">Scenario</span> {item.get("scenario", "")}</div>
|
| 580 |
-
<div style="margin-top:0.35rem;"><span class="badge">Sample</span> <span class="smallmono">{item.get("sample_id", "")}</span></div>
|
| 581 |
-
<div style="margin-top:0.35rem;"><span class="badge">Item</span> <span class="smallmono">{item.get("item_id", "")}</span></div>
|
| 582 |
-
</div>
|
| 583 |
-
""",
|
| 584 |
-
unsafe_allow_html=True,
|
| 585 |
)
|
| 586 |
-
|
| 587 |
-
|
| 588 |
-
|
| 589 |
-
|
| 590 |
-
|
| 591 |
-
|
| 592 |
-
|
| 593 |
-
|
| 594 |
-
|
| 595 |
-
|
| 596 |
-
|
| 597 |
-
|
| 598 |
-
|
| 599 |
-
|
| 600 |
-
|
| 601 |
-
if existing.empty:
|
| 602 |
-
st.caption("No annotations yet.")
|
| 603 |
-
else:
|
| 604 |
-
for _, row in existing.iterrows():
|
| 605 |
-
st.write(f"**{row['annotator']}** · {row['status']} · {row['created_at']}")
|
| 606 |
-
st.json(row["labels"])
|
| 607 |
-
if row.get("notes"):
|
| 608 |
-
st.caption(row["notes"])
|
| 609 |
-
st.divider()
|
| 610 |
-
|
| 611 |
-
with right:
|
| 612 |
-
st.markdown("### Annotation form")
|
| 613 |
-
current_draft = load_draft(st.session_state["annotator"])
|
| 614 |
-
draft_labels = current_draft.get("labels", {}) if current_draft else {}
|
| 615 |
-
|
| 616 |
-
with st.form("annotation_form", clear_on_submit=False):
|
| 617 |
-
st.selectbox(
|
| 618 |
-
"Distractor kind",
|
| 619 |
-
LABEL_OPTIONS["distractor_kind"],
|
| 620 |
-
index=LABEL_OPTIONS["distractor_kind"].index(draft_labels.get("distractor_kind", LABEL_OPTIONS["distractor_kind"][0]))
|
| 621 |
-
if draft_labels.get("distractor_kind") in LABEL_OPTIONS["distractor_kind"]
|
| 622 |
-
else 0,
|
| 623 |
-
key="distractor_kind",
|
| 624 |
-
)
|
| 625 |
-
st.selectbox(
|
| 626 |
-
"Transition style",
|
| 627 |
-
LABEL_OPTIONS["transition_style"],
|
| 628 |
-
index=LABEL_OPTIONS["transition_style"].index(draft_labels.get("transition_style", LABEL_OPTIONS["transition_style"][0]))
|
| 629 |
-
if draft_labels.get("transition_style") in LABEL_OPTIONS["transition_style"]
|
| 630 |
-
else 0,
|
| 631 |
-
key="transition_style",
|
| 632 |
-
)
|
| 633 |
-
st.multiselect(
|
| 634 |
-
"Policy target(s)",
|
| 635 |
-
LABEL_OPTIONS["policy_target"],
|
| 636 |
-
default=draft_labels.get("policy_target", []),
|
| 637 |
-
key="policy_target",
|
| 638 |
-
)
|
| 639 |
-
c1, c2 = st.columns(2)
|
| 640 |
-
with c1:
|
| 641 |
-
st.slider("Difficulty", 1, 5, value=int(draft_labels.get("difficulty", 3)), key="difficulty")
|
| 642 |
-
st.slider("Realism", 1, 5, value=int(draft_labels.get("realism", 3)), key="realism")
|
| 643 |
-
with c2:
|
| 644 |
-
st.selectbox(
|
| 645 |
-
"Assistant behavior",
|
| 646 |
-
LABEL_OPTIONS["assistant_behavior"],
|
| 647 |
-
index=LABEL_OPTIONS["assistant_behavior"].index(draft_labels.get("assistant_behavior", LABEL_OPTIONS["assistant_behavior"][0]))
|
| 648 |
-
if draft_labels.get("assistant_behavior") in LABEL_OPTIONS["assistant_behavior"]
|
| 649 |
-
else 0,
|
| 650 |
-
key="assistant_behavior",
|
| 651 |
)
|
| 652 |
-
st.
|
|
|
|
| 653 |
|
| 654 |
-
|
| 655 |
-
|
| 656 |
-
|
| 657 |
-
|
| 658 |
-
)
|
| 659 |
-
|
| 660 |
-
|
| 661 |
-
|
| 662 |
-
|
| 663 |
-
|
| 664 |
-
|
| 665 |
-
|
| 666 |
-
|
| 667 |
-
|
| 668 |
-
|
| 669 |
-
|
| 670 |
-
|
| 671 |
-
|
| 672 |
-
|
| 673 |
-
|
| 674 |
-
|
| 675 |
-
|
| 676 |
|
| 677 |
-
|
|
|
|
| 678 |
with c1:
|
| 679 |
-
if st.button("
|
| 680 |
-
|
| 681 |
-
|
| 682 |
-
"
|
| 683 |
-
"
|
| 684 |
-
"
|
| 685 |
-
|
| 686 |
-
|
| 687 |
-
|
| 688 |
with c2:
|
| 689 |
-
|
| 690 |
-
|
| 691 |
-
|
| 692 |
-
|
| 693 |
-
|
| 694 |
-
|
| 695 |
-
|
| 696 |
-
|
| 697 |
-
|
| 698 |
-
|
| 699 |
-
"
|
| 700 |
-
"
|
| 701 |
-
"
|
| 702 |
-
"
|
| 703 |
-
"
|
| 704 |
-
"
|
| 705 |
-
|
| 706 |
-
|
| 707 |
-
|
| 708 |
-
|
| 709 |
-
|
| 710 |
-
|
| 711 |
-
|
|
|
|
|
|
|
| 712 |
try:
|
| 713 |
-
|
| 714 |
-
|
| 715 |
-
[
|
| 716 |
-
|
| 717 |
-
|
| 718 |
-
|
| 719 |
-
|
| 720 |
-
|
| 721 |
-
|
| 722 |
-
|
| 723 |
-
|
| 724 |
-
"notes": payload["notes"],
|
| 725 |
-
"status": payload["status"],
|
| 726 |
-
"created_at": payload["created_at"],
|
| 727 |
-
"file_path": path_in_repo,
|
| 728 |
-
}
|
| 729 |
-
]
|
| 730 |
-
),
|
| 731 |
-
],
|
| 732 |
-
ignore_index=True,
|
| 733 |
-
)
|
| 734 |
-
save_draft(
|
| 735 |
-
st.session_state["annotator"],
|
| 736 |
-
{
|
| 737 |
-
"current_item_id": item["item_id"],
|
| 738 |
-
"labels": labels,
|
| 739 |
-
"notes": notes,
|
| 740 |
-
"saved_at": now_iso(),
|
| 741 |
-
},
|
| 742 |
-
)
|
| 743 |
-
st.success(f"Submitted to Hugging Face as {path_in_repo}")
|
| 744 |
-
q = queue_df()
|
| 745 |
-
if not q.empty:
|
| 746 |
-
set_current_item_id(q.iloc[0]["item_id"])
|
| 747 |
-
st.rerun()
|
| 748 |
except Exception as e:
|
| 749 |
-
st.error(f"
|
| 750 |
-
|
| 751 |
-
st.session_state["annotator"],
|
| 752 |
-
{
|
| 753 |
-
"current_item_id": item["item_id"],
|
| 754 |
-
"labels": labels,
|
| 755 |
-
"notes": notes,
|
| 756 |
-
"saved_at": now_iso(),
|
| 757 |
-
},
|
| 758 |
-
)
|
| 759 |
-
|
| 760 |
-
st.caption("Each submission is a separate file in the annotation dataset repo, so multiple annotators can work in parallel without write conflicts.")
|
| 761 |
|
| 762 |
-
|
| 763 |
-
|
| 764 |
-
|
| 765 |
-
|
| 766 |
-
|
| 767 |
-
|
| 768 |
-
|
| 769 |
-
|
| 770 |
-
|
| 771 |
-
|
| 772 |
-
|
| 773 |
-
|
| 774 |
-
|
| 775 |
-
|
| 776 |
-
|
| 777 |
-
|
| 778 |
-
|
| 779 |
-
|
| 780 |
-
st.
|
| 781 |
-
|
| 782 |
-
|
| 783 |
-
|
| 784 |
-
|
| 785 |
-
|
| 786 |
-
|
| 787 |
-
|
| 788 |
-
with cols[idx % len(cols)]:
|
| 789 |
-
st.write(f"**{ann['annotator']}**")
|
| 790 |
-
st.caption(f"{ann['status']} · {ann['created_at']}")
|
| 791 |
-
st.json(ann["labels"])
|
| 792 |
-
if ann.get("notes"):
|
| 793 |
-
st.caption(ann["notes"])
|
| 794 |
-
|
| 795 |
-
agreement = compute_agreement(sub, label_key="assistant_behavior")
|
| 796 |
-
c1, c2, c3 = st.columns(3)
|
| 797 |
-
c1.metric("Paired items", agreement["paired_items"])
|
| 798 |
-
c2.metric("Raw agreement", f"{agreement['raw_agreement']:.2%}" if agreement["raw_agreement"] is not None else "n/a")
|
| 799 |
-
c3.metric("Cohen's κ", f"{agreement['cohen_kappa']:.3f}" if agreement["cohen_kappa"] is not None else "n/a")
|
| 800 |
-
|
| 801 |
-
elif page == "Dashboard":
|
| 802 |
-
st.subheader("Dashboard")
|
| 803 |
-
c1, c2, c3, c4 = st.columns(4)
|
| 804 |
-
c1.metric("Source samples", len(samples_df))
|
| 805 |
-
c2.metric("Source items", len(items_df))
|
| 806 |
-
c3.metric("Annotation files", len(anns_df))
|
| 807 |
-
c4.metric("My queue", len(queue_df()))
|
| 808 |
-
|
| 809 |
-
st.markdown("### Progress by annotator")
|
| 810 |
-
if anns_df.empty:
|
| 811 |
-
st.info("No annotations yet.")
|
| 812 |
-
else:
|
| 813 |
-
by_ann = anns_df.groupby("annotator")["item_id"].nunique().reset_index(name="annotated_items").sort_values("annotated_items", ascending=False)
|
| 814 |
-
st.dataframe(by_ann, use_container_width=True, hide_index=True)
|
| 815 |
-
|
| 816 |
-
st.markdown("### Progress by domain")
|
| 817 |
-
joined = anns_df.merge(items_df[["item_id", "domain"]], on="item_id", how="left")
|
| 818 |
-
by_domain = joined.groupby("domain")["item_id"].nunique().reset_index(name="annotated_items").sort_values("annotated_items", ascending=False)
|
| 819 |
-
st.dataframe(by_domain, use_container_width=True, hide_index=True)
|
| 820 |
-
|
| 821 |
-
st.markdown("### Agreement snapshot")
|
| 822 |
-
metric = compute_agreement(anns_df, label_key="assistant_behavior")
|
| 823 |
-
st.write(metric)
|
| 824 |
-
|
| 825 |
-
st.markdown("### Recent annotation previews")
|
| 826 |
-
recent = anns_df.sort_values("created_at", ascending=False).head(20).copy()
|
| 827 |
-
if "labels" in recent.columns:
|
| 828 |
-
recent["assistant_behavior"] = recent["labels"].apply(lambda x: x.get("assistant_behavior") if isinstance(x, dict) else None)
|
| 829 |
-
recent["distractor_kind"] = recent["labels"].apply(lambda x: x.get("distractor_kind") if isinstance(x, dict) else None)
|
| 830 |
-
st.dataframe(
|
| 831 |
-
recent[["annotator", "item_id", "status", "created_at", "assistant_behavior", "distractor_kind", "notes"]],
|
| 832 |
-
use_container_width=True,
|
| 833 |
-
hide_index=True,
|
| 834 |
-
)
|
| 835 |
|
| 836 |
else:
|
| 837 |
-
st.subheader("Export")
|
| 838 |
-
st.write("Export
|
| 839 |
|
| 840 |
-
merged
|
| 841 |
-
|
| 842 |
-
|
| 843 |
-
|
| 844 |
-
|
| 845 |
-
|
| 846 |
-
export_df["labels"] = None
|
| 847 |
-
export_df["notes"] = None
|
| 848 |
-
export_df["status"] = None
|
| 849 |
-
export_df["created_at"] = None
|
| 850 |
-
|
| 851 |
-
c1, c2 = st.columns(2)
|
| 852 |
with c1:
|
| 853 |
-
|
| 854 |
-
|
| 855 |
-
with
|
| 856 |
-
for
|
| 857 |
-
f.write(json.dumps(r
|
| 858 |
-
st.success(f"Wrote {
|
| 859 |
-
st.download_button("Download JSONL",
|
| 860 |
with c2:
|
| 861 |
-
|
| 862 |
-
|
| 863 |
-
|
| 864 |
-
st.success(f"Wrote {
|
| 865 |
-
st.download_button("Download CSV",
|
| 866 |
|
| 867 |
-
st.markdown("### Repository
|
| 868 |
st.code(
|
| 869 |
-
f"
|
| 870 |
language="text",
|
| 871 |
)
|
| 872 |
|
| 3 |
import json
|
| 4 |
import os
|
| 5 |
import uuid
|
| 6 |
+
from dataclasses import dataclass
|
| 7 |
from datetime import datetime, timezone
|
| 8 |
from pathlib import Path
|
| 9 |
from typing import Any, Dict, List, Optional, Tuple
|
| 13 |
from datasets import load_dataset
|
| 14 |
from huggingface_hub import HfApi, hf_hub_download
|
| 15 |
|
| 16 |
+
try:
|
| 17 |
+
from openai import OpenAI
|
| 18 |
+
except Exception: # optional
|
| 19 |
+
OpenAI = None
|
| 20 |
+
|
| 21 |
APP_TITLE = "🧭 LLM Annotation Platform"
|
| 22 |
+
DEFAULT_SOURCE_REPO = os.environ.get("SOURCE_DATASET_REPO", "nvidia/CantTalkAboutThis-Topic-Control-Dataset")
|
| 23 |
+
DEFAULT_SOURCE_SPLITS = os.environ.get("SOURCE_DATASET_SPLITS", "train,test")
|
| 24 |
+
DEFAULT_ANNOTATION_REPO = os.environ.get("ANNOTATION_REPO_ID", "YOUR_USERNAME/llm-distractor-annotations")
|
| 25 |
+
DEFAULT_HF_TOKEN = os.environ.get("HF_TOKEN", os.environ.get("HUGGINGFACE_HUB_TOKEN", ""))
|
| 26 |
+
|
| 27 |
+
CACHE_DIR = Path(os.environ.get("CACHE_DIR", "/data/cache"))
|
| 28 |
+
DRAFT_DIR = Path(os.environ.get("DRAFT_DIR", "/data/drafts"))
|
| 29 |
+
EXPORT_DIR = Path(os.environ.get("EXPORT_DIR", "/data/exports"))
|
| 30 |
+
CACHE_DIR.mkdir(parents=True, exist_ok=True)
|
| 31 |
+
DRAFT_DIR.mkdir(parents=True, exist_ok=True)
|
| 32 |
+
EXPORT_DIR.mkdir(parents=True, exist_ok=True)
|
| 33 |
+
|
| 34 |
+
TACTICS = [
|
| 35 |
+
"rephrasing the request",
|
| 36 |
+
"claiming urgency",
|
| 37 |
+
"just a general explanation",
|
| 38 |
+
"flattery + redirect",
|
| 39 |
+
"hypothetical loophole",
|
| 40 |
+
"persisting after refusal",
|
| 41 |
+
"false framing",
|
| 42 |
+
"other",
|
| 43 |
+
]
|
| 44 |
+
|
| 45 |
+
TURN_ROLES = ["user", "assistant", "system", "tool"]
|
| 46 |
+
|
| 47 |
+
DEFAULT_OUTPUT_TEMPLATE = {
|
| 48 |
+
"domain": "",
|
| 49 |
+
"scenario": "",
|
| 50 |
+
"system_instruction": "",
|
| 51 |
+
"conversation": [],
|
| 52 |
+
"distractors": [],
|
| 53 |
+
"distractors_multiturn": [],
|
| 54 |
+
"conversation_with_distractors": [],
|
| 55 |
+
"split": "train",
|
| 56 |
+
"_review_status": "draft",
|
| 57 |
+
"_needs_human_review": True,
|
| 58 |
}
|
| 59 |
|
| 60 |
+
# ---------------------------------------------------------
|
| 61 |
+
# Small utilities
|
| 62 |
+
# ---------------------------------------------------------
|
| 63 |
|
| 64 |
def now_iso() -> str:
|
| 65 |
+
return datetime.now(timezone.utc).isoformat(timespec="seconds")
|
| 66 |
|
| 67 |
|
| 68 |
+
def slugify(text: str, default: str = "item") -> str:
|
| 69 |
+
text = (text or "").strip().lower()
|
| 70 |
+
text = re.sub(r"[^a-z0-9]+", "-", text)
|
| 71 |
+
text = text.strip("-")
|
| 72 |
+
return text or default
|
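A minimal, self-contained sketch of how `slugify` behaves on a few inputs. It assumes `re` is imported near the top of `app.py` (the import lines hidden in this diff are not visible), and the sample strings are illustrative only.

```python
import re


def slugify(text: str, default: str = "item") -> str:
    # Lowercase, collapse every run of non-alphanumerics into one hyphen,
    # and fall back to `default` when nothing usable is left.
    text = (text or "").strip().lower()
    text = re.sub(r"[^a-z0-9]+", "-", text)
    text = text.strip("-")
    return text or default


# Illustrative inputs (not taken from the dataset).
samples = ["Annotator 1", "  Flattery + redirect!  ", "", "日本語"]
slugs = [slugify(s) for s in samples]
print(slugs)
```

Because only ASCII letters and digits survive, names written entirely in another script collapse to the default, so two such annotators would share one draft file.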
| 73 |
|
| 74 |
|
| 75 |
+
def safe_json_loads(value: str, fallback: Any) -> Any:
|
| 76 |
+
try:
|
| 77 |
+
return json.loads(value)
|
| 78 |
+
except Exception:
|
| 79 |
+
return fallback
|
| 80 |
|
| 81 |
|
| 82 |
+
def pretty_json(value: Any) -> str:
|
| 83 |
+
return json.dumps(value, ensure_ascii=False, indent=2)
|
| 84 |
|
| 85 |
|
| 86 |
+
def row_to_dict(row: Any) -> Dict[str, Any]:
|
| 87 |
+
if isinstance(row, pd.Series):
|
| 88 |
+
return row.to_dict()
|
| 89 |
+
if isinstance(row, dict):
|
| 90 |
+
return dict(row)
|
| 91 |
+
return dict(row)
|
| 92 |
|
| 93 |
|
| 94 |
+
def series_get(record: Dict[str, Any], *keys: str, default: Any = "") -> Any:
|
| 95 |
+
for key in keys:
|
| 96 |
+
if key in record and record[key] not in (None, ""):
|
| 97 |
+
return record[key]
|
| 98 |
+
return default
|
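The lookup order in `series_get` can be sketched as follows: the first key holding a non-empty value wins, otherwise the default is returned. The sample record is hypothetical.

```python
from typing import Any, Dict


def series_get(record: Dict[str, Any], *keys: str, default: Any = "") -> Any:
    # Return the first value that is present and neither None nor "".
    for key in keys:
        if key in record and record[key] not in (None, ""):
            return record[key]
    return default


# Hypothetical record: "split" is present but empty, so the fallback key is used.
row = {"split": "", "_source_split": "test"}
split = series_get(row, "split", "_source_split", default="train")
missing = series_get(row, "domain", default="unknown")
print(split, missing)
```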
| 99 |
|
| 100 |
|
| 101 |
+
def ensure_list_of_dicts(value: Any) -> List[Dict[str, Any]]:
|
| 102 |
+
if value is None:
|
| 103 |
+
return []
|
| 104 |
+
if isinstance(value, str):
|
| 105 |
+
value = safe_json_loads(value, [])
|
| 106 |
+
if not isinstance(value, list):
|
| 107 |
+
return []
|
| 108 |
+
out = []
|
| 109 |
+
for item in value:
|
| 110 |
+
if isinstance(item, dict):
|
| 111 |
+
out.append(item)
|
| 112 |
+
else:
|
| 113 |
+
out.append({"value": str(item)})
|
| 114 |
+
return out
|
| 115 |
|
| 116 |
|
| 117 |
+
def ensure_turns(value: Any) -> List[Dict[str, str]]:
|
| 118 |
+
turns = ensure_list_of_dicts(value)
|
| 119 |
+
out = []
|
| 120 |
+
for t in turns:
|
| 121 |
+
out.append({
|
| 122 |
+
"role": str(t.get("role", "user")),
|
| 123 |
+
"content": str(t.get("content", t.get("text", ""))),
|
| 124 |
+
})
|
| 125 |
+
return out
|
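A compact sketch of the normalization chain `safe_json_loads` → `ensure_list_of_dicts` → `ensure_turns`, rewritten here so that it runs on its own. The input string is made up for illustration.

```python
import json
from typing import Any, Dict, List


def safe_json_loads(value: str, fallback: Any) -> Any:
    try:
        return json.loads(value)
    except Exception:
        return fallback


def ensure_list_of_dicts(value: Any) -> List[Dict[str, Any]]:
    if value is None:
        return []
    if isinstance(value, str):
        value = safe_json_loads(value, [])
    if not isinstance(value, list):
        return []
    # Non-dict items are wrapped under a "value" key.
    return [item if isinstance(item, dict) else {"value": str(item)} for item in value]


def ensure_turns(value: Any) -> List[Dict[str, str]]:
    return [
        {"role": str(t.get("role", "user")), "content": str(t.get("content", t.get("text", "")))}
        for t in ensure_list_of_dicts(value)
    ]


# Illustrative input: a JSON string mixing a "text"-keyed turn and a bare string.
raw = '[{"role": "assistant", "text": "How can I help?"}, "stray string"]'
turns = ensure_turns(raw)
print(turns)
```

One consequence worth noting: a bare string is stored under `value`, which `ensure_turns` does not read, so it becomes a `user` turn with empty content.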
| 126 |
|
| 127 |
|
| 128 |
+
def normalize_conversation(raw: Any) -> List[Dict[str, str]]:
|
| 129 |
+
return ensure_turns(raw)
|
| 130 |
+
|
| 131 |
+
|
| 132 |
+
def normalize_distractors(raw: Any) -> List[Dict[str, str]]:
|
| 133 |
+
items = ensure_list_of_dicts(raw)
|
| 134 |
out = []
|
| 135 |
+
for d in items:
|
| 136 |
+
out.append({
|
| 137 |
+
"bot_turn": str(d.get("bot_turn", d.get("bot turn", ""))),
|
| 138 |
+
"distractor": str(d.get("distractor", d.get("user_turn", d.get("content", "")))),
|
| 139 |
+
})
|
| 140 |
return out
|
| 141 |
|
| 142 |
|
| 143 |
+
def normalize_multiturn(raw: Any) -> List[Dict[str, Any]]:
|
| 144 |
+
items = ensure_list_of_dicts(raw)
|
| 145 |
+
out = []
|
| 146 |
+
for d in items:
|
| 147 |
+
turns = d.get("turns", [])
|
| 148 |
+
if isinstance(turns, str):
|
| 149 |
+
turns = safe_json_loads(turns, [])
|
| 150 |
+
out.append({
|
| 151 |
+
"off_topic_subject": str(d.get("off_topic_subject", "")),
|
| 152 |
+
"tactic_used": str(d.get("tactic_used", "")),
|
| 153 |
+
"bot_turn": str(d.get("bot_turn", d.get("bot turn", ""))),
|
| 154 |
+
"turns_json": pretty_json(ensure_turns(turns)) if turns else "[]",
|
| 155 |
+
})
|
| 156 |
+
return out
|
| 157 |
|
| 158 |
|
| 159 |
+
def build_conversation_with_distractors(conversation: List[Dict[str, str]], multiturn: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
|
| 160 |
+
"""
|
| 161 |
+
Simple automatic build:
|
| 162 |
+
- keep the base conversation as the first item
|
| 163 |
+
- add a variant conversation for each multiturn distractor by appending the user turns
|
| 164 |
+
after the matching bot_turn when possible.
|
| 165 |
+
"""
|
| 166 |
+
if not conversation:
|
| 167 |
+
return []
|
| 168 |
+
|
| 169 |
+
variants = [{"variant": "base", "conversation": conversation}]
|
| 170 |
+
for idx, d in enumerate(multiturn):
|
| 171 |
+
turns = safe_json_loads(d.get("turns_json", "[]"), [])
|
| 172 |
+
if not isinstance(turns, list):
|
| 173 |
+
turns = []
|
| 174 |
+
bot_turn = str(d.get("bot_turn", "")).strip()
|
| 175 |
+
|
| 176 |
+
conv = []
|
| 177 |
+
inserted = False
|
| 178 |
+
for turn in conversation:
|
| 179 |
+
conv.append(turn)
|
| 180 |
+
if not inserted and bot_turn and turn.get("role", "").lower() == "assistant" and turn.get("content", "").strip() == bot_turn:
|
| 181 |
+
conv.extend(ensure_turns(turns))
|
| 182 |
+
inserted = True
|
| 183 |
+
if not inserted:
|
| 184 |
+
# Fallback: append to end
|
| 185 |
+
conv.extend(ensure_turns(turns))
|
| 186 |
+
variants.append({
|
| 187 |
+
"variant": f"distractor_{idx+1}",
|
| 188 |
+
"conversation": conv,
|
| 189 |
+
})
|
| 190 |
+
return variants
|
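The variant-building logic above can be exercised with a small stand-alone sketch. `build_variants` is a condensed restatement written for this example (it skips the `ensure_turns` normalization), and the conversation is invented.

```python
import json
from typing import Any, Dict, List


def build_variants(conversation: List[Dict[str, str]], multiturn: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    if not conversation:
        return []
    variants = [{"variant": "base", "conversation": conversation}]
    for idx, d in enumerate(multiturn):
        turns = json.loads(d.get("turns_json", "[]"))
        anchor = str(d.get("bot_turn", "")).strip()
        conv, inserted = [], False
        for turn in conversation:
            conv.append(turn)
            is_anchor = (
                turn.get("role", "").lower() == "assistant"
                and turn.get("content", "").strip() == anchor
            )
            # Insert the distractor turns once, right after the first matching assistant turn.
            if not inserted and anchor and is_anchor:
                conv.extend(turns)
                inserted = True
        if not inserted:
            # No assistant turn matched: fall back to appending at the end.
            conv.extend(turns)
        variants.append({"variant": f"distractor_{idx + 1}", "conversation": conv})
    return variants


conversation = [
    {"role": "user", "content": "I need to reset my password."},
    {"role": "assistant", "content": "Sure, what is your account email?"},
    {"role": "user", "content": "It is sam@example.com."},
]
multiturn = [{
    "bot_turn": "Sure, what is your account email?",
    "turns_json": json.dumps([{"role": "user", "content": "Unrelated: who won the match last night?"}]),
}]
variants = build_variants(conversation, multiturn)
roles = [t["role"] for t in variants[1]["conversation"]]
print(roles)
```

The match is an exact string comparison after trimming whitespace, so a `bot_turn` that differs by even one character silently takes the append-to-end fallback.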
| 191 |
+
|
| 192 |
+
|
| 193 |
+
def record_from_inputs(
|
| 194 |
+
domain: str,
|
| 195 |
+
scenario: str,
|
| 196 |
+
system_instruction: str,
|
| 197 |
+
conversation: List[Dict[str, str]],
|
| 198 |
+
distractors: List[Dict[str, str]],
|
| 199 |
+
multiturn: List[Dict[str, Any]],
|
| 200 |
+
conversation_with_distractors: Any,
|
| 201 |
+
split: str,
|
| 202 |
+
review_status: str,
|
| 203 |
+
needs_review: bool,
|
| 204 |
+
source_split: str = "",
|
| 205 |
+
source_index: Optional[int] = None,
|
| 206 |
+
source_repo: str = "",
|
| 207 |
+
annotator: str = "",
|
| 208 |
+
) -> Dict[str, Any]:
|
| 209 |
+
record = {
|
| 210 |
+
"domain": domain.strip(),
|
| 211 |
+
"scenario": scenario.strip(),
|
| 212 |
+
"system_instruction": system_instruction.strip(),
|
| 213 |
+
"conversation": conversation,
|
| 214 |
+
"distractors": distractors,
|
| 215 |
+
"distractors_multiturn": multiturn,
|
| 216 |
+
"conversation_with_distractors": conversation_with_distractors,
|
| 217 |
+
"split": split,
|
| 218 |
+
"_review_status": review_status,
|
| 219 |
+
"_needs_human_review": needs_review,
|
| 220 |
+
"_annotator": annotator,
|
| 221 |
+
"_source_repo": source_repo,
|
| 222 |
+
"_source_split": source_split,
|
| 223 |
+
"_source_index": source_index,
|
| 224 |
+
"_created_at": now_iso(),
|
| 225 |
+
"_updated_at": now_iso(),
|
| 226 |
}
|
| 227 |
+
return record
|
| 228 |
|
| 229 |
|
| 230 |
+
def record_to_exportable(record: Dict[str, Any]) -> Dict[str, Any]:
|
| 231 |
+
out = dict(record)
|
| 232 |
+
# keep the same top-level structure as the source file, but preserve provenance
|
| 233 |
+
return out
|
| 234 |
|
| 235 |
|
| 236 |
+
# ---------------------------------------------------------
|
| 237 |
+
# Data loading
|
| 238 |
+
# ---------------------------------------------------------
|
| 239 |
+
|
| 240 |
+
@st.cache_data(show_spinner=False)
|
| 241 |
+
def load_hf_split(repo_id: str, split: str) -> List[Dict[str, Any]]:
|
| 242 |
+
ds = load_dataset(repo_id, split=split)
|
| 243 |
+
return [dict(r) for r in ds]
|
| 244 |
|
| 245 |
|
| 246 |
+
@st.cache_data(show_spinner=False)
|
| 247 |
+
def load_hf_all_splits(repo_id: str, splits_csv: str) -> List[Dict[str, Any]]:
|
| 248 |
+
all_rows: List[Dict[str, Any]] = []
|
| 249 |
+
for split in [s.strip() for s in splits_csv.split(",") if s.strip()]:
|
| 250 |
+
try:
|
| 251 |
+
rows = load_hf_split(repo_id, split)
|
| 252 |
+
for i, row in enumerate(rows):
|
| 253 |
+
row = dict(row)
|
| 254 |
+
row.setdefault("split", split)
|
| 255 |
+
row.setdefault("_source_split", split)
|
| 256 |
+
row.setdefault("_source_index", i)
|
| 257 |
+
row.setdefault("_source_repo", repo_id)
|
| 258 |
+
all_rows.append(row)
|
| 259 |
+
except Exception:
|
| 260 |
+
continue
|
| 261 |
+
return all_rows
|
| 262 |
+
|
| 263 |
+
|
| 264 |
+
def load_local_json(path: Path, split_default: str = "train") -> List[Dict[str, Any]]:
|
| 265 |
+
if path.suffix.lower() == ".jsonl":
|
| 266 |
+
rows = []
|
| 267 |
+
with path.open("r", encoding="utf-8") as f:
|
| 268 |
+
for line in f:
|
| 269 |
+
if line.strip():
|
| 270 |
+
rows.append(json.loads(line))
|
| 271 |
+
for i, row in enumerate(rows):
|
| 272 |
+
row.setdefault("split", split_default)
|
| 273 |
+
row.setdefault("_source_split", split_default)
|
| 274 |
+
row.setdefault("_source_index", i)
|
| 275 |
+
return rows
|
| 276 |
with path.open("r", encoding="utf-8") as f:
|
| 277 |
+
data = json.load(f)
|
| 278 |
+
if not isinstance(data, list):
|
| 279 |
+
raise ValueError("Local JSON must contain a list of records.")
|
| 280 |
+
for i, row in enumerate(data):
|
| 281 |
+
row.setdefault("split", split_default)
|
| 282 |
+
row.setdefault("_source_split", split_default)
|
| 283 |
+
row.setdefault("_source_index", i)
|
| 284 |
+
return data
|
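A small sketch of the JSONL branch of `load_local_json`, run against a temporary file. `load_jsonl` and the sample rows are names made up for this example.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def load_jsonl(path: Path, split_default: str = "train") -> List[Dict[str, Any]]:
    rows = []
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            if line.strip():  # blank lines are skipped
                rows.append(json.loads(line))
    for i, row in enumerate(rows):
        # setdefault keeps any value already present in the file.
        row.setdefault("split", split_default)
        row.setdefault("_source_split", split_default)
        row.setdefault("_source_index", i)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.jsonl"
    sample.write_text(
        '{"domain": "banking"}\n\n{"domain": "travel", "split": "test"}\n',
        encoding="utf-8",
    )
    rows = load_jsonl(sample)
print(rows)
```

Since `split` and `_source_split` are defaulted independently, a row that already carries `split` can end up with a different `_source_split`.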
| 285 |
|
| 286 |
|
| 287 |
+
def coerce_source_records(raw_records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
|
| 288 |
+
out = []
|
| 289 |
+
for i, r in enumerate(raw_records):
|
| 290 |
+
rec = dict(r)
|
| 291 |
+
rec.setdefault("split", rec.get("_source_split", "train"))
|
| 292 |
+
rec.setdefault("_source_index", i)
|
| 293 |
+
out.append(rec)
|
| 294 |
+
return out
|
| 295 |
|
| 296 |
|
| 297 |
+
# ---------------------------------------------------------
|
| 298 |
+
# HF persistence
|
| 299 |
+
# ---------------------------------------------------------
|
| 300 |
+
|
| 301 |
+
def hf_client() -> HfApi:
|
| 302 |
+
return HfApi(token=DEFAULT_HF_TOKEN or None)
|
| 303 |
+
|
| 304 |
+
|
| 305 |
+
def ensure_annotation_repo(repo_id: str) -> None:
|
| 306 |
+
if not repo_id or repo_id.startswith("YOUR_"):
|
| 307 |
+
return
|
| 308 |
+
hf_client().create_repo(repo_id=repo_id, repo_type="dataset", private=True, exist_ok=True)
|
| 309 |
+
|
| 310 |
+
|
| 311 |
+
def upload_record_to_hf(repo_id: str, record: Dict[str, Any], annotator: str) -> str:
|
| 312 |
+
ensure_annotation_repo(repo_id)
|
| 313 |
+
stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
|
| 314 |
+
safe_name = slugify(f"{annotator}-{record.get('domain','')}-{record.get('scenario','')}", "entry")
|
| 315 |
+
filename = f"entries/{slugify(annotator, 'annotator')}/{stamp}_{safe_name}_{uuid.uuid4().hex[:8]}.json"
|
| 316 |
+
|
| 317 |
+
tmp_dir = DRAFT_DIR / "_tmp_uploads"
|
| 318 |
+
tmp_dir.mkdir(parents=True, exist_ok=True)
|
| 319 |
+
tmp_file = tmp_dir / f"{uuid.uuid4().hex}.json"
|
| 320 |
+
with tmp_file.open("w", encoding="utf-8") as f:
|
| 321 |
+
json.dump(record_to_exportable(record), f, ensure_ascii=False, indent=2)
|
| 322 |
+
|
| 323 |
+
hf_client().upload_file(
|
| 324 |
+
path_or_fileobj=str(tmp_file),
|
| 325 |
+
path_in_repo=filename,
|
| 326 |
+
repo_id=repo_id,
|
| 327 |
+
repo_type="dataset",
|
| 328 |
+
commit_message=f"Add annotation entry by {annotator}",
|
| 329 |
+
)
|
| 330 |
+
return filename
|
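The repository path that `upload_record_to_hf` builds can be previewed without touching the Hub. This is a sketch only: `entry_path` is a hypothetical helper that mirrors the two f-strings above, and the sample values are invented.

```python
import re
import uuid
from datetime import datetime, timezone


def slugify(text: str, default: str = "item") -> str:
    text = re.sub(r"[^a-z0-9]+", "-", (text or "").strip().lower()).strip("-")
    return text or default


def entry_path(annotator: str, domain: str, scenario: str) -> str:
    # UTC timestamp plus a short random suffix keeps concurrent uploads from colliding.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    safe_name = slugify(f"{annotator}-{domain}-{scenario}", "entry")
    return f"entries/{slugify(annotator, 'annotator')}/{stamp}_{safe_name}_{uuid.uuid4().hex[:8]}.json"


path = entry_path("Annotator 1", "Banking", "Card dispute")
print(path)
```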
| 331 |
+
|
| 332 |
+
|
| 333 |
+
def list_uploaded_files(repo_id: str) -> List[str]:
|
| 334 |
+
if not repo_id or repo_id.startswith("YOUR_"):
|
| 335 |
+
return []
|
| 336 |
+
try:
|
| 337 |
+
return hf_client().list_repo_files(repo_id, repo_type="dataset")
|
| 338 |
+
except Exception:
|
| 339 |
+
return []
|
| 340 |
|
| 341 |
|
| 342 |
+
# ---------------------------------------------------------
|
| 343 |
+
# Local drafts / state
|
| 344 |
+
# ---------------------------------------------------------
|
| 345 |
|
| 346 |
+
def annotator_draft_path(annotator: str) -> Path:
|
| 347 |
+
safe = slugify(annotator, "annotator")
|
| 348 |
+
return DRAFT_DIR / f"{safe}.json"
|
| 349 |
+
|
| 350 |
+
|
| 351 |
+
def save_draft_local(annotator: str, payload: Dict[str, Any]) -> Path:
|
| 352 |
+
path = annotator_draft_path(annotator)
|
| 353 |
with path.open("w", encoding="utf-8") as f:
|
| 354 |
json.dump(payload, f, ensure_ascii=False, indent=2)
|
| 355 |
return path
|
| 356 |
|
| 357 |
|
| 358 |
+
def load_draft_local(annotator: str) -> Dict[str, Any]:
|
| 359 |
+
path = annotator_draft_path(annotator)
|
| 360 |
if not path.exists():
|
| 361 |
return {}
|
| 362 |
try:
|
| 363 |
+
with path.open("r", encoding="utf-8") as f:
|
| 364 |
+
return json.load(f)
|
| 365 |
except Exception:
|
| 366 |
return {}
|
| 367 |
|
| 368 |
|
| 369 |
+
def append_submission_index(entry: Dict[str, Any]) -> None:
|
| 370 |
+
idx = DRAFT_DIR / "submissions_index.jsonl"
|
| 371 |
+
with idx.open("a", encoding="utf-8") as f:
|
| 372 |
+
f.write(json.dumps(entry, ensure_ascii=False) + "\n")
|
| 373 |
|
| 374 |
|
| 375 |
+
# ---------------------------------------------------------
|
| 376 |
+
# Editing helpers
|
| 377 |
+
# ---------------------------------------------------------
|
| 378 |
|
| 379 |
+
def df_from_turns(turns: List[Dict[str, str]]) -> pd.DataFrame:
|
| 380 |
if not turns:
|
| 381 |
+
return pd.DataFrame([{"role": "user", "content": ""}])
|
| 382 |
+
return pd.DataFrame(turns)
|
| 383 |
+
|
| 384 |
+
|
| 385 |
+
def turns_from_df(df: pd.DataFrame) -> List[Dict[str, str]]:
|
| 386 |
+
if df is None or df.empty:
|
| 387 |
+
return []
|
| 388 |
+
out = []
|
| 389 |
+
for _, row in df.iterrows():
|
| 390 |
+
role = str(row.get("role", "")).strip()
|
| 391 |
+
content = str(row.get("content", "")).strip()
|
| 392 |
+
if role or content:
|
| 393 |
+
out.append({"role": role or "user", "content": content})
|
| 394 |
+
return out
|
| 395 |
+
|
| 396 |
+
|
| 397 |
+
def df_from_simple_distractors(items: List[Dict[str, str]]) -> pd.DataFrame:
|
| 398 |
+
if not items:
|
| 399 |
+
return pd.DataFrame([{"bot_turn": "", "distractor": ""}])
|
| 400 |
+
return pd.DataFrame(items)
|
| 401 |
+
|
| 402 |
|
| 403 |
+
def simple_distractors_from_df(df: pd.DataFrame) -> List[Dict[str, str]]:
|
| 404 |
+
if df is None or df.empty:
|
| 405 |
+
return []
|
| 406 |
+
out = []
|
| 407 |
+
for _, row in df.iterrows():
|
| 408 |
+
bot_turn = str(row.get("bot_turn", "")).strip()
|
| 409 |
+
distractor = str(row.get("distractor", "")).strip()
|
| 410 |
+
if bot_turn or distractor:
|
| 411 |
+
out.append({"bot_turn": bot_turn, "distractor": distractor})
|
| 412 |
+
return out
|
| 413 |
|
| 414 |
|
| 415 |
+
def df_from_multiturn(items: List[Dict[str, Any]]) -> pd.DataFrame:
|
| 416 |
+
if not items:
|
| 417 |
+
return pd.DataFrame([{"off_topic_subject": "", "tactic_used": TACTICS[0], "bot_turn": "", "turns_json": "[]"}])
|
| 418 |
+
return pd.DataFrame(items)
|
| 419 |
|
| 420 |
|
| 421 |
+
def multiturn_from_df(df: pd.DataFrame) -> List[Dict[str, Any]]:
|
| 422 |
+
if df is None or df.empty:
|
| 423 |
+
return []
|
| 424 |
+
out = []
|
| 425 |
+
for _, row in df.iterrows():
|
| 426 |
+
subject = str(row.get("off_topic_subject", "")).strip()
|
| 427 |
+
tactic = str(row.get("tactic_used", "")).strip()
|
| 428 |
+
bot_turn = str(row.get("bot_turn", "")).strip()
|
| 429 |
+
turns_json = str(row.get("turns_json", "[]")).strip()
|
| 430 |
+
turns = safe_json_loads(turns_json, [])
|
| 431 |
+
if isinstance(turns, list):
|
| 432 |
+
turns = ensure_turns(turns)
|
| 433 |
+
else:
|
| 434 |
+
turns = []
|
| 435 |
+
if subject or bot_turn or turns:
|
| 436 |
+
out.append({
|
| 437 |
+
"off_topic_subject": subject,
|
| 438 |
+
"tactic_used": tactic,
|
| 439 |
+
"bot_turn": bot_turn,
|
| 440 |
+
"turns_json": pretty_json(turns) if turns else "[]",
|
| 441 |
+
})
|
| 442 |
+
return out
|
| 443 |
+
|
| 444 |
+
|
| 445 |
+
def normalize_draft_from_record(record: Dict[str, Any], source_repo: str = "", source_split: str = "", source_index: Optional[int] = None) -> Dict[str, Any]:
|
| 446 |
+
conversation = normalize_conversation(record.get("conversation"))
|
| 447 |
+
distractors = normalize_distractors(record.get("distractors"))
|
| 448 |
+
multiturn = normalize_multiturn(record.get("distractors_multiturn"))
|
| 449 |
+
convwd = record.get("conversation_with_distractors", [])
|
| 450 |
+
if not isinstance(convwd, list):
|
| 451 |
+
convwd = []
|
| 452 |
+
if not convwd and multiturn:
|
| 453 |
+
convwd = build_conversation_with_distractors(conversation, multiturn)
|
| 454 |
return {
|
| 455 |
+
"domain": str(series_get(record, "domain", default="")),
|
| 456 |
+
"scenario": str(series_get(record, "scenario", default="")),
|
| 457 |
+
"system_instruction": str(series_get(record, "system_instruction", default="")),
|
| 458 |
+
"conversation": conversation,
|
| 459 |
+
"distractors": distractors,
|
| 460 |
+
"distractors_multiturn": multiturn,
|
| 461 |
+
"conversation_with_distractors": convwd,
|
| 462 |
+
"split": str(series_get(record, "split", "_source_split", default="train")),
|
| 463 |
+
"_review_status": str(series_get(record, "_review_status", default="draft")),
|
| 464 |
+
"_needs_human_review": bool(record.get("_needs_human_review", True)),
|
| 465 |
+
"_source_repo": source_repo or str(series_get(record, "_source_repo", default="")),
|
| 466 |
+
"_source_split": source_split or str(series_get(record, "_source_split", default="")),
|
| 467 |
+
"_source_index": source_index if source_index is not None else record.get("_source_index"),
|
| 468 |
+
"_annotator": str(series_get(record, "_annotator", default="")),
|
| 469 |
}
|
| 470 |
|
| 471 |
|
| 472 |
+
def make_blank_draft() -> Dict[str, Any]:
|
| 473 |
+
return json.loads(json.dumps(DEFAULT_OUTPUT_TEMPLATE))  # deep copy: drafts must not share the template's list objects
|
| 474 |
|
| 475 |
|
| 476 |
+
def generate_llm_distractor_draft(draft: Dict[str, Any], base_url: str, model: str, mode: str = "simple") -> Optional[Dict[str, Any]]:
|
| 477 |
+
if OpenAI is None:
|
| 478 |
+
st.error("The openai package is not installed.")
|
| 479 |
+
return None
|
| 480 |
+
|
| 481 |
+
client = OpenAI(base_url=base_url, api_key="lm-studio")
|
| 482 |
+
convo = draft.get("conversation", [])
|
| 483 |
+
sysinst = draft.get("system_instruction", "")
|
| 484 |
+
domain = draft.get("domain", "")
|
| 485 |
+
scenario = draft.get("scenario", "")
|
| 486 |
+
|
| 487 |
+
if mode == "simple":
|
| 488 |
+
prompt = f"""
|
| 489 |
+
You are helping create a human-made distractor dataset for a task-oriented assistant.
|
| 490 |
|
| 491 |
+
Domain: {domain}
|
| 492 |
+
Scenario: {scenario}
|
| 493 |
|
| 494 |
+
System instruction:
|
| 495 |
+
{sysinst}
|
| 496 |
+
|
| 497 |
+
Conversation:
|
| 498 |
+
{json.dumps(convo, ensure_ascii=False, indent=2)}
|
| 499 |
+
|
| 500 |
+
Write ONE realistic off-topic distractor pair:
|
| 501 |
+
- bot_turn: exact assistant turn from the conversation to anchor after
|
| 502 |
+
- distractor: the user's off-topic message
|
| 503 |
+
|
| 504 |
+
Return only valid JSON with keys bot_turn and distractor.
|
| 505 |
+
"""
|
| 506 |
+
else:
|
| 507 |
+
prompt = f"""
|
| 508 |
+
You are helping create a human-made multi-turn distractor dataset for a task-oriented assistant.
|
| 509 |
+
|
| 510 |
+
Domain: {domain}
|
| 511 |
+
Scenario: {scenario}
|
| 512 |
+
|
| 513 |
+
System instruction:
|
| 514 |
+
{sysinst}
|
| 515 |
+
|
| 516 |
+
Conversation:
|
| 517 |
+
{json.dumps(convo, ensure_ascii=False, indent=2)}
|
| 518 |
+
|
| 519 |
+
Write ONE multi-turn distractor item:
|
| 520 |
+
- off_topic_subject
|
| 521 |
+
- tactic_used
|
| 522 |
+
- bot_turn
|
| 523 |
+
- turns: a JSON list of 3-5 turns that starts with a user off-topic request and escalates politely after refusals.
|
| 524 |
+
|
| 525 |
+
Return only valid JSON with keys off_topic_subject, tactic_used, bot_turn, turns.
|
| 526 |
+
"""
|
| 527 |
|
| 528 |
try:
|
| 529 |
+
response = client.chat.completions.create(
|
| 530 |
+
model=model,
|
| 531 |
+
messages=[
|
| 532 |
+
{"role": "system", "content": "Return valid JSON only."},
|
| 533 |
+
{"role": "user", "content": prompt},
|
| 534 |
+
],
|
| 535 |
+
temperature=0.8,
|
| 536 |
+
max_tokens=1500,
|
| 537 |
+
)
|
| 538 |
+
raw = response.choices[0].message.content.strip()
|
| 539 |
+
if raw.startswith("```"):
|
| 540 |
+
raw = raw.strip("`")
|
| 541 |
+
raw = raw.removeprefix("json").strip()  # drop only a leading language tag, never a "json\n" inside the payload
|
| 542 |
+
return json.loads(raw)
|
| 543 |
+
except Exception as e:
|
| 544 |
+
st.error(f"Local LLM generation failed: {e}")
|
| 545 |
+
return None
|
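The reply handling above removes Markdown code fences before calling `json.loads`. A stricter variant could look like the sketch below. `parse_llm_json` and `FENCE_RE` are hypothetical names, and the sketch assumes the model wraps the whole reply in at most one fence.

```python
import json
import re
from typing import Any, Optional

# One optional fenced block spanning the whole reply, with an optional language tag.
FENCE_RE = re.compile(r"^```[a-zA-Z0-9_-]*\s*\n(.*?)\n?```\s*$", re.DOTALL)


def parse_llm_json(raw: str) -> Optional[Any]:
    text = raw.strip()
    match = FENCE_RE.match(text)
    if match:
        # Keep only the fence body; the language tag is discarded.
        text = match.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


fenced = '```json\n{"bot_turn": "Hello", "distractor": "Any good json tutorials?"}\n```'
plain = '{"bot_turn": "Hello", "distractor": "Nice weather today?"}'
parsed_fenced = parse_llm_json(fenced)
parsed_plain = parse_llm_json(plain)
parsed_bad = parse_llm_json("Sorry, I cannot do that.")
print(parsed_fenced, parsed_plain, parsed_bad)
```

Returning `None` on failure matches how `generate_llm_distractor_draft` already signals an unusable reply, so a caller would not need a separate error path.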
| 546 |
|
| 547 |
|
| 548 |
+
# ---------------------------------------------------------
|
| 549 |
+
# UI components
|
| 550 |
+
# ---------------------------------------------------------
|
| 551 |
|
| 552 |
+
def render_preview_df(records: List[Dict[str, Any]], split_filter: str, search_text: str = "") -> pd.DataFrame:
|
| 553 |
+
rows = []
|
| 554 |
+
search_text = search_text.lower().strip()
|
| 555 |
+
for i, r in enumerate(records):
|
| 556 |
+
if split_filter and split_filter != "All" and str(r.get("split", r.get("_source_split", ""))) != split_filter:
|
| 557 |
+
continue
|
| 558 |
+
domain = str(series_get(r, "domain", default=""))
|
| 559 |
+
scenario = str(series_get(r, "scenario", default=""))
|
| 560 |
+
if search_text:
|
| 561 |
+
joined = " ".join([domain, scenario, str(series_get(r, "system_instruction", default=""))]).lower()
|
| 562 |
+
if search_text not in joined:
|
| 563 |
+
continue
|
| 564 |
+
convo = normalize_conversation(r.get("conversation"))
|
| 565 |
+
preview = ""
|
| 566 |
+
if convo:
|
| 567 |
+
for t in reversed(convo):
|
| 568 |
+
if str(t.get("role", "")).lower() == "user":
|
| 569 |
+
preview = str(t.get("content", ""))
|
| 570 |
+
break
|
| 571 |
+
if not preview:
|
| 572 |
+
preview = str(convo[-1].get("content", ""))
|
| 573 |
+
rows.append({
|
| 574 |
+
"#": i,
|
| 575 |
+
"split": str(r.get("split", r.get("_source_split", ""))),
|
| 576 |
+
"domain": domain,
|
| 577 |
+
"scenario": scenario,
|
| 578 |
+
"conversation_preview": (preview[:120] + "…") if len(preview) > 120 else preview,
|
| 579 |
+
"distractor_count": len(r.get("distractors", [])) if isinstance(r.get("distractors"), list) else 0,
|
| 580 |
+
"multi_count": len(r.get("distractors_multiturn", [])) if isinstance(r.get("distractors_multiturn"), list) else 0,
|
| 581 |
+
})
|
| 582 |
+
return pd.DataFrame(rows)
|
| 583 |
+
|
| 584 |
+
|
| 585 |
+
def current_source_record(records: List[Dict[str, Any]], idx: int) -> Optional[Dict[str, Any]]:
|
| 586 |
+
if idx < 0 or idx >= len(records):
|
| 587 |
+
return None
|
| 588 |
+
return records[idx]
|
| 589 |
+
|
| 590 |
+
|
| 591 |
+
def clean_editor_df(df: pd.DataFrame) -> pd.DataFrame:
|
| 592 |
+
if df is None:
|
| 593 |
+
return pd.DataFrame()
|
| 594 |
+
df = df.copy()
|
| 595 |
+
for col in df.columns:
|
| 596 |
+
df[col] = df[col].fillna("")
|
| 597 |
+
return df
|
| 598 |
+
|
| 599 |
+
|
| 600 |
+
# ---------------------------------------------------------
|
| 601 |
+
# App
|
| 602 |
+
# ---------------------------------------------------------
|
| 603 |
|
| 604 |
+
def main() -> None:
|
| 605 |
+
st.set_page_config(page_title=APP_TITLE, page_icon="🧭", layout="wide")
|
| 606 |
+
st.title(APP_TITLE)
|
| 607 |
+
st.caption("Simple collaborative editor for human-made distractor datasets.")
|
| 608 |
+
|
| 609 |
+
# Session defaults
|
| 610 |
+
for key, default in [
|
| 611 |
+
("annotator", "annotator_1"),
|
| 612 |
+
("source_mode", "HF dataset"),
|
| 613 |
+
("source_repo", DEFAULT_SOURCE_REPO),
|
| 614 |
+
("source_splits", DEFAULT_SOURCE_SPLITS),
|
| 615 |
+
("annotation_repo", DEFAULT_ANNOTATION_REPO),
|
| 616 |
+
("source_file_name", ""),
|
| 617 |
+
("source_row_idx", 0),
|
| 618 |
+
("draft", make_blank_draft()),
|
| 619 |
+
("draft_source_idx", None),
|
| 620 |
+
("draft_source_split", "train"),
|
| 621 |
+
("draft_mode", "new"),
|
| 622 |
+
("last_saved_message", ""),
|
| 623 |
+
("llm_base_url", "http://localhost:1234/v1"),
|
| 624 |
+
("llm_model", "gemma-4-e2b-it"),
|
| 625 |
+
("llm_mode", "simple"),
|
| 626 |
+
]:
|
| 627 |
+
if key not in st.session_state:
|
| 628 |
+
st.session_state[key] = default
|
| 629 |
+
|
| 630 |
+
# Sidebar
|
| 631 |
+
st.sidebar.header("Workspace")
|
| 632 |
+
st.session_state["annotator"] = st.sidebar.text_input("Annotator name", value=st.session_state["annotator"])
|
| 633 |
+
st.session_state["source_mode"] = st.sidebar.radio("Source mode", ["HF dataset", "Upload local JSON/JSONL"], index=0 if st.session_state["source_mode"] == "HF dataset" else 1)
|
| 634 |
+
st.session_state["source_repo"] = st.sidebar.text_input("Source dataset repo", value=st.session_state["source_repo"])
|
| 635 |
+
st.session_state["source_splits"] = st.sidebar.text_input("Source splits (comma-separated)", value=st.session_state["source_splits"])
|
| 636 |
+
st.session_state["annotation_repo"] = st.sidebar.text_input("Annotation dataset repo", value=st.session_state["annotation_repo"])
|
| 637 |
+
st.sidebar.divider()
|
| 638 |
+
st.session_state["llm_base_url"] = st.sidebar.text_input("Local LLM base URL", value=st.session_state["llm_base_url"])
|
| 639 |
+
st.session_state["llm_model"] = st.sidebar.text_input("Local LLM model", value=st.session_state["llm_model"])
|
| 640 |
+
st.session_state["llm_mode"] = st.sidebar.selectbox("LLM generation mode", ["simple", "multiturn"], index=0 if st.session_state["llm_mode"] == "simple" else 1)
|
| 641 |
+
st.sidebar.caption("For LM Studio / OpenAI-compatible local servers, keep the base URL like http://localhost:1234/v1.")
|
| 642 |
+
st.sidebar.divider()
|
| 643 |
+
|
| 644 |
+
uploaded_file = None
|
| 645 |
+
if st.session_state["source_mode"] == "Upload local JSON/JSONL":
|
| 646 |
+
uploaded_file = st.sidebar.file_uploader("Upload source file", type=["json", "jsonl"])
|
| 647 |
+
if uploaded_file is not None:
|
| 648 |
+
st.session_state["source_file_name"] = uploaded_file.name
|
| 649 |
+
|
| 650 |
+
page = st.sidebar.radio("Page", ["Browse", "Edit / Create", "Drafts", "Export / Sync"], index=0)
|
| 651 |
+
st.sidebar.caption(f"HF token present: {'yes' if DEFAULT_HF_TOKEN else 'no'}")
|
| 652 |
+
st.sidebar.caption(f"Draft folder: {DRAFT_DIR}")
|
| 653 |
+
|
| 654 |
+
# Load source records
|
| 655 |
if "source_records" not in st.session_state:
|
| 656 |
st.session_state["source_records"] = None
|
| 657 |
|
| 658 |
if st.session_state["source_records"] is None:
|
| 659 |
+
with st.spinner("Loading source data..."):
|
| 660 |
try:
|
| 661 |
+
if st.session_state["source_mode"] == "HF dataset":
|
| 662 |
+
records = load_hf_all_splits(st.session_state["source_repo"], st.session_state["source_splits"])
|
| 663 |
+
else:
|
| 664 |
+
if uploaded_file is not None:
|
| 665 |
+
suffix = Path(uploaded_file.name).suffix.lower()
|
| 666 |
+
tmp_path = DRAFT_DIR / f"uploaded_source{suffix}"
|
| 667 |
+
tmp_path.write_bytes(uploaded_file.getbuffer())
|
| 668 |
+
records = load_local_json(tmp_path)
|
| 669 |
+
else:
|
| 670 |
+
records = []
|
| 671 |
+
st.session_state["source_records"] = coerce_source_records(records)
|
| 672 |
except Exception as e:
|
| 673 |
+
st.session_state["source_records"] = []
|
| 674 |
+
st.error(f"Could not load source data: {e}")
|
| 675 |
+
|
| 676 |
+
records: List[Dict[str, Any]] = st.session_state["source_records"] or []
|
| 677 |
+
|
| 678 |
+
if page == "Browse":
|
| 679 |
+
st.subheader("Browse source dataset")
|
| 680 |
+
split_choices = ["All"] + sorted({str(r.get("split", r.get("_source_split", ""))) for r in records if str(r.get("split", r.get("_source_split", "")))} )
|
| 681 |
+
col1, col2 = st.columns([1, 1])
|
| 682 |
+
with col1:
|
| 683 |
+
split_filter = st.selectbox("Filter split", split_choices, index=0)
|
| 684 |
+
with col2:
|
| 685 |
+
search_text = st.text_input("Search text (domain / scenario / instruction)", value="")
|
| 686 |
+
preview_df = render_preview_df(records, split_filter, search_text)
|
| 687 |
+
st.write(f"Rows loaded: {len(preview_df)}")
|
| 688 |
+
st.dataframe(preview_df, use_container_width=True, hide_index=True)
|
| 689 |
+
|
| 690 |
+
if preview_df.empty:
|
| 691 |
+
st.info("No rows match the current filter.")
|
| 692 |
+
else:
|
| 693 |
+
picked = st.number_input("Pick row position in the table above (0 = first row)", min_value=0, max_value=max(0, len(preview_df) - 1), value=0, step=1)
|
| 694 |
+
if st.button("Load selected row into editor"):
|
| 695 |
+
selected_global_idx = int(preview_df.iloc[int(picked)]["#"])
|
| 696 |
+
st.session_state["draft_mode"] = "clone"
|
| 697 |
+
st.session_state["draft_source_idx"] = selected_global_idx
|
| 698 |
+
st.session_state["draft_source_split"] = str(records[selected_global_idx].get("split", records[selected_global_idx].get("_source_split", "train")))
|
| 699 |
+
st.session_state["draft"] = normalize_draft_from_record(
|
| 700 |
+
records[selected_global_idx],
|
| 701 |
+
source_repo=st.session_state["source_repo"],
|
| 702 |
+
source_split=st.session_state["draft_source_split"],
|
| 703 |
+
source_index=selected_global_idx,
|
| 704 |
+
)
|
| 705 |
+
st.success(f"Loaded row {selected_global_idx} into the editor.")
|
| 706 |
+
st.rerun()
|
| 707 |
+
|
| 708 |
+
st.markdown("### Record inspector")
|
| 709 |
+
if preview_df.empty:
|
| 710 |
+
st.stop()
|
| 711 |
+
idx = int(preview_df.iloc[int(picked)]["#"])
|
| 712 |
+
rec = records[idx]
|
| 713 |
+
st.json({
|
| 714 |
+
"domain": rec.get("domain", ""),
|
| 715 |
+
"scenario": rec.get("scenario", ""),
|
| 716 |
+
"split": rec.get("split", rec.get("_source_split", "")),
|
| 717 |
+
"keys": list(rec.keys()),
|
| 718 |
+
})
|
| 719 |
+
st.markdown("**Conversation preview**")
|
| 720 |
+
st.code(pretty_json(rec.get("conversation", [])), language="json")
|
| 721 |
+
st.markdown("**Distractors preview**")
|
| 722 |
+
st.code(pretty_json(rec.get("distractors", [])), language="json")
|
| 723 |
+
|
| 724 |
+
elif page == "Edit / Create":
|
| 725 |
+
st.subheader("Create or edit an entry")
|
| 726 |
left, right = st.columns([1.05, 0.95], gap="large")
|
| 727 |
|
| 728 |
with left:
|
| 729 |
+
c1, c2, c3 = st.columns([1, 1, 1])
|
| 730 |
+
with c1:
|
| 731 |
+
if st.button("New blank entry"):
|
| 732 |
+
st.session_state["draft_mode"] = "new"
|
| 733 |
+
st.session_state["draft_source_idx"] = None
|
| 734 |
+
st.session_state["draft"] = make_blank_draft()
|
| 735 |
+
st.success("Blank entry created.")
|
| 736 |
st.rerun()
|
| 737 |
+
with c2:
|
| 738 |
+
if st.button("Reset draft from source row"):
|
| 739 |
+
idx = st.session_state.get("draft_source_idx")
|
| 740 |
+
if idx is not None and 0 <= idx < len(records):
|
| 741 |
+
st.session_state["draft"] = normalize_draft_from_record(
|
| 742 |
+
records[idx],
|
| 743 |
+
source_repo=st.session_state["source_repo"],
|
| 744 |
+
source_split=str(records[idx].get("split", records[idx].get("_source_split", "train"))),
|
| 745 |
+
source_index=idx,
|
| 746 |
+
)
|
| 747 |
+
st.success("Draft reset from source row.")
|
| 748 |
+
else:
|
| 749 |
+
st.warning("No source row selected.")
|
| 750 |
+
with c3:
|
| 751 |
+
if st.button("Auto-build conversation_with_distractors"):
|
| 752 |
+
d = st.session_state["draft"]
|
| 753 |
+
d["conversation_with_distractors"] = build_conversation_with_distractors(d.get("conversation", []), d.get("distractors_multiturn", []))
|
| 754 |
+
st.session_state["draft"] = d
|
| 755 |
+
st.success("Built conversation_with_distractors.")
|
| 756 |
st.rerun()
|
| 757 |
|
| 758 |
+
st.markdown("### Source row")
|
| 759 |
+
row_idx = st.number_input(
|
| 760 |
+
"Source row index",
|
| 761 |
+
min_value=0,
|
| 762 |
+
max_value=max(0, len(records) - 1),
|
| 763 |
+
value=int(st.session_state.get("draft_source_idx") or 0),
|
| 764 |
+
step=1,
|
| 765 |
)
|
| 766 |
+
source_split_guess = ""
|
| 767 |
+
if records:
|
| 768 |
+
source_split_guess = str(records[int(row_idx)].get("split", records[int(row_idx)].get("_source_split", "train")))
|
| 769 |
+
st.write("Detected source split:", source_split_guess or "n/a")
|
| 770 |
+
if st.button("Load this row"):
|
| 771 |
+
idx = int(row_idx)
|
| 772 |
+
if 0 <= idx < len(records):
|
| 773 |
+
st.session_state["draft_mode"] = "clone"
|
| 774 |
+
st.session_state["draft_source_idx"] = idx
|
| 775 |
+
st.session_state["draft_source_split"] = str(records[idx].get("split", records[idx].get("_source_split", "train")))
|
| 776 |
+
st.session_state["draft"] = normalize_draft_from_record(
|
| 777 |
+
records[idx],
|
| 778 |
+
source_repo=st.session_state["source_repo"],
|
| 779 |
+
source_split=st.session_state["draft_source_split"],
|
| 780 |
+
source_index=idx,
|
| 781 |
)
|
| 782 |
+
st.success(f"Loaded source row {idx}.")
|
| 783 |
+
st.rerun()
|
| 784 |
|
| 785 |
+
draft = st.session_state["draft"]
|
| 786 |
+
|
| 787 |
+
top1, top2, top3 = st.columns(3)
|
| 788 |
+
with top1:
|
| 789 |
+
draft["split"] = st.selectbox("Entry split", ["train", "test"], index=0 if str(draft.get("split", "train")) == "train" else 1)
|
| 790 |
+
with top2:
|
| 791 |
+
draft["_review_status"] = st.selectbox("Review status", ["draft", "approved", "failed"], index={"draft": 0, "approved": 1, "failed": 2}.get(str(draft.get("_review_status", "draft")), 0))
|
| 792 |
+
with top3:
|
| 793 |
+
draft["_needs_human_review"] = st.checkbox("Needs human review", value=bool(draft.get("_needs_human_review", True)))
|
| 794 |
+
|
| 795 |
+
draft["domain"] = st.text_input("Domain", value=str(draft.get("domain", "")))
|
| 796 |
+
draft["scenario"] = st.text_input("Scenario", value=str(draft.get("scenario", "")))
|
| 797 |
+
draft["system_instruction"] = st.text_area("System instruction", value=str(draft.get("system_instruction", "")), height=180)
|
| 798 |
+
|
| 799 |
+
st.markdown("#### Conversation")
|
| 800 |
+
conv_df = clean_editor_df(pd.DataFrame(draft.get("conversation", [{"role": "user", "content": ""}])))
|
| 801 |
+
conv_df = st.data_editor(
|
| 802 |
+
conv_df,
|
| 803 |
+
num_rows="dynamic",
|
| 804 |
+
use_container_width=True,
|
| 805 |
+
column_config={
|
| 806 |
+
"role": st.column_config.SelectboxColumn("role", options=TURN_ROLES, required=True),
|
| 807 |
+
"content": st.column_config.TextColumn("content", required=True),
|
| 808 |
+
},
|
| 809 |
+
hide_index=True,
|
| 810 |
+
key="conversation_editor",
|
| 811 |
+
)
|
| 812 |
+
draft["conversation"] = turns_from_df(conv_df)
|
| 813 |
+
if st.button("Clear conversation"):
|
| 814 |
+
draft["conversation"] = []
|
| 815 |
+
st.session_state["draft"] = draft
|
| 816 |
+
st.rerun()
|
| 817 |
+
|
| 818 |
+
st.markdown("#### Simple distractors")
|
| 819 |
+
simple_df = clean_editor_df(pd.DataFrame(draft.get("distractors", [{"bot_turn": "", "distractor": ""}])))
|
| 820 |
+
simple_df = st.data_editor(
|
| 821 |
+
simple_df,
|
| 822 |
+
num_rows="dynamic",
|
| 823 |
+
use_container_width=True,
|
| 824 |
+
column_config={
|
| 825 |
+
"bot_turn": st.column_config.TextColumn("bot_turn"),
|
| 826 |
+
"distractor": st.column_config.TextColumn("distractor"),
|
| 827 |
+
},
|
| 828 |
+
hide_index=True,
|
| 829 |
+
key="simple_distractors_editor",
|
| 830 |
+
)
|
| 831 |
+
draft["distractors"] = simple_distractors_from_df(simple_df)
|
| 832 |
+
if st.button("Clear simple distractors"):
|
| 833 |
+
draft["distractors"] = []
|
| 834 |
+
st.session_state["draft"] = draft
|
| 835 |
+
st.rerun()
|
| 836 |
+
|
| 837 |
+
st.markdown("#### Multi-turn distractors")
|
| 838 |
+
multi_df = clean_editor_df(pd.DataFrame(draft.get("distractors_multiturn", [{"off_topic_subject": "", "tactic_used": TACTICS[0], "bot_turn": "", "turns_json": "[]"}])))
|
| 839 |
+
multi_df = st.data_editor(
|
| 840 |
+
multi_df,
|
| 841 |
+
num_rows="dynamic",
|
| 842 |
+
use_container_width=True,
|
| 843 |
+
column_config={
|
| 844 |
+
"off_topic_subject": st.column_config.TextColumn("off_topic_subject"),
|
| 845 |
+
"tactic_used": st.column_config.SelectboxColumn("tactic_used", options=TACTICS, required=False),
|
| 846 |
+
"bot_turn": st.column_config.TextColumn("bot_turn"),
|
| 847 |
+
"turns_json": st.column_config.TextColumn("turns_json", help="JSON list of turns, e.g. [{\"role\":\"user\",\"content\":\"...\"}]"),
|
| 848 |
+
},
|
| 849 |
+
hide_index=True,
|
| 850 |
+
key="multi_distractors_editor",
|
| 851 |
+
)
|
| 852 |
+
draft["distractors_multiturn"] = multiturn_from_df(multi_df)
|
| 853 |
+
if st.button("Clear multi-turn distractors"):
|
| 854 |
+
draft["distractors_multiturn"] = []
|
| 855 |
+
st.session_state["draft"] = draft
|
| 856 |
+
st.rerun()
|
| 857 |
+
|
| 858 |
+
st.markdown("#### Conversation with distractors")
|
| 859 |
+
if st.button("Auto-generate conversation_with_distractors from current draft"):
|
| 860 |
+
draft["conversation_with_distractors"] = build_conversation_with_distractors(draft.get("conversation", []), draft.get("distractors_multiturn", []))
|
| 861 |
+
st.session_state["draft"] = draft
|
| 862 |
+
convwd_text = st.text_area(
|
| 863 |
+
"conversation_with_distractors (JSON)",
|
| 864 |
+
value=pretty_json(draft.get("conversation_with_distractors", [])),
|
| 865 |
+
height=200,
|
| 866 |
+
)
|
| 867 |
+
if st.button("Apply conversation_with_distractors JSON"):
|
| 868 |
+
draft["conversation_with_distractors"] = safe_json_loads(convwd_text, [])
|
| 869 |
+
st.session_state["draft"] = draft
|
| 870 |
|
| 871 |
+
st.markdown("#### Quick LLM assist")
|
| 872 |
+
c1, c2 = st.columns([1, 1])
|
| 873 |
with c1:
|
| 874 |
+
if st.button("Generate draft with local LLM"):
|
| 875 |
+
out = generate_llm_distractor_draft(
|
| 876 |
+
draft,
|
| 877 |
+
base_url=st.session_state["llm_base_url"],
|
| 878 |
+
model=st.session_state["llm_model"],
|
| 879 |
+
mode=st.session_state["llm_mode"],
|
| 880 |
+
)
|
| 881 |
+
if out:
|
| 882 |
+
if st.session_state["llm_mode"] == "simple":
|
| 883 |
+
draft.setdefault("distractors", [])
|
| 884 |
+
draft["distractors"].append({
|
| 885 |
+
"bot_turn": out.get("bot_turn", ""),
|
| 886 |
+
"distractor": out.get("distractor", ""),
|
| 887 |
+
})
|
| 888 |
+
else:
|
| 889 |
+
draft.setdefault("distractors_multiturn", [])
|
| 890 |
+
draft["distractors_multiturn"].append({
|
| 891 |
+
"off_topic_subject": out.get("off_topic_subject", ""),
|
| 892 |
+
"tactic_used": out.get("tactic_used", ""),
|
| 893 |
+
"bot_turn": out.get("bot_turn", ""),
|
| 894 |
+
"turns_json": pretty_json(ensure_turns(out.get("turns", []))),
|
| 895 |
+
})
|
| 896 |
+
st.session_state["draft"] = draft
|
| 897 |
+
st.success("LLM draft inserted into the editor.")
|
| 898 |
+
st.rerun()
|
| 899 |
with c2:
|
| 900 |
+
st.caption("This calls a local OpenAI-compatible server such as LM Studio.")
|
| 901 |
+
|
| 902 |
+
st.markdown("#### Save / submit")
|
| 903 |
+
if st.button("Save draft locally"):
|
| 904 |
+
draft["_annotator"] = st.session_state["annotator"]
|
| 905 |
+
draft["_updated_at"] = now_iso()
|
| 906 |
+
path = save_draft_local(st.session_state["annotator"], draft)
|
| 907 |
+
st.success(f"Draft saved: {path}")
|
| 908 |
+
if st.button("Submit current entry to HF repo"):
|
| 909 |
+
final_record = record_from_inputs(
|
| 910 |
+
domain=draft.get("domain", ""),
|
| 911 |
+
scenario=draft.get("scenario", ""),
|
| 912 |
+
system_instruction=draft.get("system_instruction", ""),
|
| 913 |
+
conversation=draft.get("conversation", []),
|
| 914 |
+
distractors=draft.get("distractors", []),
|
| 915 |
+
multiturn=draft.get("distractors_multiturn", []),
|
| 916 |
+
conversation_with_distractors=draft.get("conversation_with_distractors", []),
|
| 917 |
+
split=str(draft.get("split", "train")),
|
| 918 |
+
review_status=str(draft.get("_review_status", "draft")),
|
| 919 |
+
needs_review=bool(draft.get("_needs_human_review", True)),
|
| 920 |
+
source_split=st.session_state.get("draft_source_split", ""),
|
| 921 |
+
source_index=st.session_state.get("draft_source_idx"),
|
| 922 |
+
source_repo=st.session_state["source_repo"],
|
| 923 |
+
annotator=st.session_state["annotator"],
|
| 924 |
+
)
|
| 925 |
try:
|
| 926 |
+
filename = upload_record_to_hf(st.session_state["annotation_repo"], final_record, st.session_state["annotator"])
|
| 927 |
+
append_submission_index({
|
| 928 |
+
"annotator": st.session_state["annotator"],
|
| 929 |
+
"uploaded_file": filename,
|
| 930 |
+
"split": final_record.get("split", ""),
|
| 931 |
+
"domain": final_record.get("domain", ""),
|
| 932 |
+
"scenario": final_record.get("scenario", ""),
|
| 933 |
+
"created_at": now_iso(),
|
| 934 |
+
})
|
| 935 |
+
save_draft_local(st.session_state["annotator"], draft)
|
| 936 |
+
st.success(f"Submitted to HF as {filename}")
|
| 937 |
except Exception as e:
|
| 938 |
+
st.error(f"HF upload failed: {e}")
|
| 939 |
+
st.warning("The draft remains saved locally in the bucket.")
|
| 940 |
|
| 941 |
+
with right:
|
| 942 |
+
st.markdown("### Current draft preview")
|
| 943 |
+
st.json(st.session_state["draft"])
|
| 944 |
+
st.markdown("### Quick notes")
|
| 945 |
+
st.write("The output keeps the same top-level structure as the source file and adds provenance fields such as split, annotator, and source index.")
|
| 946 |
+
st.write("You can edit each cell directly in the tables, add rows dynamically, and clear whole sections with the buttons on the left.")
|
| 947 |
+
|
| 948 |
+
elif page == "Drafts":
|
| 949 |
+
st.subheader("Drafts and submissions")
|
| 950 |
+
draft = load_draft_local(st.session_state["annotator"])
|
| 951 |
+
c1, c2 = st.columns([1, 1])
|
| 952 |
+
with c1:
|
| 953 |
+
st.markdown("### Saved local draft")
|
| 954 |
+
if draft:
|
| 955 |
+
st.json(draft)
|
| 956 |
+
else:
|
| 957 |
+
st.info("No draft saved for this annotator yet.")
|
| 958 |
+
with c2:
|
| 959 |
+
st.markdown("### Submission index")
|
| 960 |
+
idx_file = DRAFT_DIR / "submissions_index.jsonl"
|
| 961 |
+
if idx_file.exists():
|
| 962 |
+
lines = idx_file.read_text(encoding="utf-8").splitlines()
|
| 963 |
+
rows = [json.loads(x) for x in lines if x.strip()]
|
| 964 |
+
st.dataframe(pd.DataFrame(rows), use_container_width=True, hide_index=True)
|
| 965 |
+
else:
|
| 966 |
+
st.info("No submissions recorded yet.")
|
| 967 |
|
| 968 |
else:
|
| 969 |
+
st.subheader("Export / Sync")
|
| 970 |
+
st.write("Export current source + drafts as a merged JSONL or CSV, or inspect HF uploads.")
|
| 971 |
+
current_draft = st.session_state.get("draft", make_blank_draft())
|
| 972 |
|
| 973 |
+
# Build a merged dataset view from source records plus local draft if populated
|
| 974 |
+
merged = [dict(r) for r in records]
|
| 975 |
+
if current_draft and current_draft.get("domain") and current_draft.get("scenario"):
|
| 976 |
+
merged.append(record_to_exportable(current_draft))
|
| 977 |
+
|
| 978 |
+
c1, c2, c3 = st.columns(3)
|
| 979 |
with c1:
|
| 980 |
+
if st.button("Write merged JSONL export"):
|
| 981 |
+
path = EXPORT_DIR / "merged_dataset.jsonl"
|
| 982 |
+
with path.open("w", encoding="utf-8") as f:
|
| 983 |
+
for r in merged:
|
| 984 |
+
f.write(json.dumps(r, ensure_ascii=False) + "\n")
|
| 985 |
+
st.success(f"Wrote {path}")
|
| 986 |
+
st.download_button("Download merged JSONL", data=path.read_text(encoding="utf-8"), file_name=path.name, mime="application/json")
|
| 987 |
with c2:
|
| 988 |
+
if st.button("Write merged CSV export"):
|
| 989 |
+
path = EXPORT_DIR / "merged_dataset.csv"
|
| 990 |
+
pd.json_normalize(merged).to_csv(path, index=False)
|
| 991 |
+
st.success(f"Wrote {path}")
|
| 992 |
+
st.download_button("Download merged CSV", data=path.read_text(encoding="utf-8"), file_name=path.name, mime="text/csv")
|
| 993 |
+
with c3:
|
| 994 |
+
if st.button("Refresh HF file list"):
|
| 995 |
+
st.rerun()
|
| 996 |
+
|
| 997 |
+
st.markdown("### Uploaded files in annotation repo")
|
| 998 |
+
files = list_uploaded_files(st.session_state["annotation_repo"])
|
| 999 |
+
if files:
|
| 1000 |
+
st.dataframe(pd.DataFrame({"file": files}), use_container_width=True, hide_index=True)
|
| 1001 |
+
else:
|
| 1002 |
+
st.info("No repository files listed yet, or repo is not configured.")
|
| 1003 |
|
| 1004 |
+
st.markdown("### Repository settings to remember")
|
| 1005 |
st.code(
|
| 1006 |
+
f"SOURCE_DATASET_REPO={st.session_state['source_repo']}\n"
|
| 1007 |
+
f"SOURCE_DATASET_SPLITS={st.session_state['source_splits']}\n"
|
| 1008 |
+
f"ANNOTATION_REPO_ID={st.session_state['annotation_repo']}\n"
|
| 1009 |
+
f"HF_TOKEN={'set' if DEFAULT_HF_TOKEN else 'missing'}",
|
| 1010 |
language="text",
|
| 1011 |
)
|
| 1012 |
|
hf-space/hf-space/hf-space/app.py
CHANGED
|
@@ -543,11 +543,33 @@ def main() -> None:
|
|
| 543 |
item = current_item_row()
|
| 544 |
if item is None:
|
| 545 |
st.info("Claim an item to start. The app keeps a per-annotator queue so multiple people can work in parallel.")
|
| 546 |
q = queue_df().head(10)
|
| 547 |
if not q.empty:
|
| 548 |
-
|
| 549 |
-
|
| 550 |
st.dataframe(display, use_container_width=True, hide_index=True)
|
| 551 |
return
|
| 552 |
|
| 553 |
st.markdown(
|
| 543 |
item = current_item_row()
|
| 544 |
if item is None:
|
| 545 |
st.info("Claim an item to start. The app keeps a per-annotator queue so multiple people can work in parallel.")
|
| 546 |
+
|
| 547 |
q = queue_df().head(10)
|
| 548 |
+
|
| 549 |
+
# DEBUG: inspect actual dataset schema
|
| 550 |
+
st.write("Dataset columns:", list(q.columns))
|
| 551 |
+
|
| 552 |
if not q.empty:
|
| 553 |
+
|
| 554 |
+
# Only use columns that actually exist
|
| 555 |
+
available_cols = [
|
| 556 |
+
c for c in [
|
| 557 |
+
"item_id",
|
| 558 |
+
"sample_id",
|
| 559 |
+
"domain",
|
| 560 |
+
"scenario",
|
| 561 |
+
"distractor_index"
|
| 562 |
+
]
|
| 563 |
+
if c in q.columns
|
| 564 |
+
]
|
| 565 |
+
|
| 566 |
+
display = q[available_cols].copy()
|
| 567 |
+
|
| 568 |
+
if "distractor_text" in q.columns:
|
| 569 |
+
display["preview"] = q["distractor_text"].map(preview_text)
|
| 570 |
+
|
| 571 |
st.dataframe(display, use_container_width=True, hide_index=True)
|
| 572 |
+
|
| 573 |
return
|
| 574 |
|
| 575 |
st.markdown(
|
hf-space/hf-space/hf-space/hf-space/README.md
CHANGED
|
@@ -1,3 +1,12 @@
|
| 1 |
# LLM Annotation Platform — Hugging Face native
|
| 2 |
|
| 3 |
This version removes the external database layer.
|
| 1 |
+
---
|
| 2 |
+
title: LLM Annotation Platform
|
| 3 |
+
emoji: 🧠
|
| 4 |
+
colorFrom: blue
|
| 5 |
+
colorTo: purple
|
| 6 |
+
sdk: docker
|
| 7 |
+
pinned: false
|
| 8 |
+
---
|
| 9 |
+
|
| 10 |
# LLM Annotation Platform — Hugging Face native
|
| 11 |
|
| 12 |
This version removes the external database layer.
|
hf-space/hf-space/hf-space/hf-space/hf-space/.env.example
ADDED
|
@@ -0,0 +1,7 @@
|
| 1 |
+
SOURCE_DATASET_REPO=nvidia/CantTalkAboutThis-Topic-Control-Dataset
|
| 2 |
+
SOURCE_DATASET_SPLIT=train
|
| 3 |
+
ANNOTATION_REPO_ID=YOUR_ORG/llm-distractor-annotations
|
| 4 |
+
HF_TOKEN=
|
| 5 |
+
CACHE_DIR=/data/hf_annotation_cache
|
| 6 |
+
DRAFT_DIR=/data/hf_annotation_drafts
|
| 7 |
+
EXPORT_DIR=/data/hf_annotation_exports
|
hf-space/hf-space/hf-space/hf-space/hf-space/.github/workflows/sync-to-hf.yml
ADDED
|
@@ -0,0 +1,35 @@
|
| 1 |
+
name: Sync to Hugging Face Space
|
| 2 |
+
|
| 3 |
+
on:
|
| 4 |
+
push:
|
| 5 |
+
branches:
|
| 6 |
+
- main
|
| 7 |
+
|
| 8 |
+
jobs:
|
| 9 |
+
sync-to-hub:
|
| 10 |
+
runs-on: ubuntu-latest
|
| 11 |
+
|
| 12 |
+
steps:
|
| 13 |
+
- name: Checkout repository
|
| 14 |
+
uses: actions/checkout@v4
|
| 15 |
+
with:
|
| 16 |
+
lfs: true
|
| 17 |
+
|
| 18 |
+
- name: Push to Hugging Face
|
| 19 |
+
env:
|
| 20 |
+
HF_TOKEN: ${{ secrets.HF_TOKEN }}
|
| 21 |
+
run: |
|
| 22 |
+
git config --global user.email "github-actions@github.com"
|
| 23 |
+
git config --global user.name "GitHub Actions"
|
| 24 |
+
|
| 25 |
+
git clone https://user:$HF_TOKEN@huggingface.co/spaces/keepingLLMontrack/llm-annotation-platform hf-space
|
| 26 |
+
|
| 27 |
+
rsync -av --exclude '.git' --exclude '/hf-space/' ./ hf-space/
|
| 28 |
+
|
| 29 |
+
cd hf-space
|
| 30 |
+
|
| 31 |
+
git add .
|
| 32 |
+
|
| 33 |
+
git commit -m "Sync from GitHub" || echo "No changes to commit"
|
| 34 |
+
|
| 35 |
+
git push
|
hf-space/hf-space/hf-space/hf-space/hf-space/.gitignore
ADDED
|
@@ -0,0 +1,7 @@
|
| 1 |
+
__pycache__/
|
| 2 |
+
*.pyc
|
| 3 |
+
.streamlit/
|
| 4 |
+
data/
|
| 5 |
+
exports/
|
| 6 |
+
.env
|
| 7 |
+
.DS_Store
|
hf-space/hf-space/hf-space/hf-space/hf-space/Dockerfile
ADDED
|
@@ -0,0 +1,11 @@
|
| 1 |
+
FROM python:3.11-slim
|
| 2 |
+
|
| 3 |
+
WORKDIR /app
|
| 4 |
+
|
| 5 |
+
COPY . /app
|
| 6 |
+
|
| 7 |
+
RUN pip install --no-cache-dir -r requirements.txt
|
| 8 |
+
|
| 9 |
+
EXPOSE 7860
|
| 10 |
+
|
| 11 |
+
CMD ["streamlit", "run", "app.py", "--server.port", "7860", "--server.address", "0.0.0.0"]
|
hf-space/hf-space/hf-space/hf-space/hf-space/README.md
CHANGED
|
@@ -1,10 +1,84 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
-
|
| 9 |
-
|
| 10 |
-
|
| 1 |
+
# LLM Annotation Platform — Hugging Face native
|
| 2 |
+
|
| 3 |
+
This version removes the external database layer.
|
| 4 |
+
|
| 5 |
+
## What it uses
|
| 6 |
+
|
| 7 |
+
- **Hugging Face Space** for the Streamlit app
|
| 8 |
+
- **Hugging Face dataset repo** for the canonical annotation store
|
| 9 |
+
- **Hugging Face Storage Bucket** only for persistent local cache / drafts in the Space
|
| 10 |
+
- **No Supabase**
|
| 11 |
+
- **No separate backend platform**
|
| 12 |
+
|
| 13 |
+
Hugging Face Spaces provide ephemeral disk by default, and Hugging Face recommends attaching Storage Buckets to persist data across restarts. Buckets are mounted into the Space container as local volumes.
|
| 14 |
+
|
| 15 |
+
## Repository structure
|
| 16 |
+
|
| 17 |
+
```text
|
| 18 |
+
app.py
|
| 19 |
+
scripts/seed.py
|
| 20 |
+
requirements.txt
|
| 21 |
+
README.md
|
| 22 |
+
```
|
| 23 |
+
|
| 24 |
+
## Behavior
|
| 25 |
+
|
| 26 |
+
Each annotation is written as its own JSON file into the dataset repository:
|
| 27 |
+
```text
|
| 28 |
+
annotations/<annotator>/<timestamp>_<item_id>_<uuid>.json
|
| 29 |
+
```
|
| 30 |
+
|
| 31 |
+
That design avoids write conflicts between annotators because each submission is a new file, not an overwrite of a shared database row. Repository files on the Hub are versioned, and the Hub supports uploading files to dataset repositories.
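
The one-file-per-submission layout can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `annotation_path` and `upload_annotation` are hypothetical helper names, and the timestamp format is an assumption. The only library call used is `HfApi.upload_file` from `huggingface_hub`.

```python
import json
import uuid
from datetime import datetime, timezone


def annotation_path(annotator: str, item_id: str) -> str:
    """Build annotations/<annotator>/<timestamp>_<item_id>_<uuid>.json (hypothetical helper)."""
    ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")  # assumed timestamp format
    return f"annotations/{annotator}/{ts}_{item_id}_{uuid.uuid4().hex}.json"


def upload_annotation(record: dict, annotator: str, item_id: str, repo_id: str, token: str) -> str:
    """Upload one annotation as a new file; nothing existing is overwritten."""
    # Imported lazily so the path helper can be used without the dependency installed.
    from huggingface_hub import HfApi

    path_in_repo = annotation_path(annotator, item_id)
    payload = json.dumps(record, ensure_ascii=False, indent=2).encode("utf-8")
    HfApi(token=token).upload_file(
        path_or_fileobj=payload,   # raw bytes are accepted as file content
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        repo_type="dataset",
        commit_message=f"Add annotation by {annotator}",
    )
    return path_in_repo
```

Because the UUID suffix makes every path unique, two annotators submitting the same item at the same second still produce two separate files.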
|
| 32 |
+
|
| 33 |
+
## Local run
|
| 34 |
+
|
| 35 |
+
```bash
|
| 36 |
+
pip install -r requirements.txt
|
| 37 |
+
streamlit run app.py
|
| 38 |
+
```
|
| 39 |
+
|
| 40 |
+
## How to set it up on Hugging Face
|
| 41 |
+
|
| 42 |
+
### 1. Create two dataset repositories
|
| 43 |
+
|
| 44 |
+
Create:
|
| 45 |
+
- one dataset repo for the **source / seed data**
|
| 46 |
+
- one dataset repo for the **annotations**
|
| 47 |
+
|
| 48 |
+
Hugging Face dataset repositories are created from the Hub UI, and dataset files plus revision history are stored in the repository.
|
| 49 |
+
|
| 50 |
+
### 2. Create a Space
|
| 51 |
+
|
| 52 |
+
Create a **Streamlit** Space and connect it to your GitHub repository. Spaces host apps directly on the Hub and support Streamlit as a built-in SDK.
|
| 53 |
+
|
| 54 |
+
### 3. Attach a Storage Bucket
|
| 55 |
+
|
| 56 |
+
Attach a Storage Bucket to the Space and mount it at `/data`.
|
| 57 |
+
|
| 58 |
+
This is the only stateful storage used by the app. It stores drafts and cache files and survives restarts. Hugging Face documents Storage Buckets as the recommended persistence mechanism for Spaces.
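
A rough sketch of how drafts might be kept on the mounted bucket, one JSON file per annotator. The helper names `save_draft` and `load_draft` are hypothetical, and the default path simply mirrors the `DRAFT_DIR` value from `.env.example`; the app's own helpers may differ.

```python
import json
import os
from pathlib import Path
from typing import Optional


def _draft_path(annotator: str, base_dir: Optional[str] = None) -> Path:
    """Resolve the draft file for an annotator under the persistent directory."""
    root = Path(base_dir or os.environ.get("DRAFT_DIR", "/data/hf_annotation_drafts"))
    root.mkdir(parents=True, exist_ok=True)
    # Keep file names safe: anything other than letters, digits, '-' and '_' becomes '_'.
    safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in annotator) or "anonymous"
    return root / f"{safe}.json"


def save_draft(annotator: str, draft: dict, base_dir: Optional[str] = None) -> Path:
    """Write the draft as UTF-8 JSON and return the file path."""
    path = _draft_path(annotator, base_dir)
    path.write_text(json.dumps(draft, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


def load_draft(annotator: str, base_dir: Optional[str] = None) -> Optional[dict]:
    """Return the saved draft, or None if this annotator has none yet."""
    path = _draft_path(annotator, base_dir)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Anything written outside the mounted directory would be lost when the Space restarts, which is why the path is resolved from a single setting.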
|
| 59 |
+
|
| 60 |
+
### 4. Add secrets
|
| 61 |
+
|
| 62 |
+
In the Space settings, add:
|
| 63 |
+
- `HF_TOKEN` — a Hugging Face token with **write** permission
|
| 64 |
+
- `SOURCE_DATASET_REPO`
|
| 65 |
+
- `SOURCE_DATASET_SPLIT`
|
| 66 |
+
- `ANNOTATION_REPO_ID`
|
| 67 |
+
|
| 68 |
+
Hugging Face recommends using Space secrets or environment variables instead of hard-coding sensitive values. A write token is required to create repositories or push content to the Hub.
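
As an illustration, the four settings listed above could be read from the environment in one place. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical names, and the empty-string defaults are illustrative only.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    source_repo: str
    source_split: str
    annotation_repo: str
    hf_token: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from environment variables (Space secrets appear as env vars)."""
    env = os.environ if env is None else env
    return Settings(
        source_repo=env.get("SOURCE_DATASET_REPO", ""),
        source_split=env.get("SOURCE_DATASET_SPLIT", "train"),
        annotation_repo=env.get("ANNOTATION_REPO_ID", ""),
        hf_token=env.get("HF_TOKEN") or None,  # treat an empty value as missing
    )


def token_status(settings: Settings) -> str:
    """Report whether a token is configured without ever printing the token itself."""
    return "set" if settings.hf_token else "missing"
```

Showing only `set` or `missing` in the UI keeps the secret out of logs and screenshots.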
|
| 69 |
+
|
| 70 |
+
### 5. Deploy
|
| 71 |
+
|
| 72 |
+
Commit the repo to GitHub. Once the Space is linked, it will build from the repository, and the app can upload annotation files to the dataset repo using the Hub API. Hugging Face’s Hub client supports `upload_file()` and `create_commit()` for repository writes.
|
| 73 |
+
|
| 74 |
+
## Suggested workflow for your group
|
| 75 |
+
|
| 76 |
+
- each person uses a stable annotator name
|
| 77 |
+
- each submission creates a new JSON file in the annotation repo
|
| 78 |
+
- the Review page shows items with 2+ annotations
|
| 79 |
+
- the Dashboard shows per-annotator and per-domain progress
|
| 80 |
+
- exports are generated from the merged source + annotation view
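
The export step in this workflow can be pictured as a plain JSONL dump of source records followed by annotation records. The sketch below uses only the standard library; `write_merged_jsonl` is a hypothetical name, and a real export may well deduplicate or join records rather than concatenate them.

```python
import json
from pathlib import Path
from typing import Iterable


def write_merged_jsonl(source_records: Iterable[dict], annotations: Iterable[dict], out_path: str) -> int:
    """Write source records, then annotations, one JSON object per line. Returns the row count."""
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with path.open("w", encoding="utf-8") as f:
        for rec in list(source_records) + list(annotations):
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
            count += 1
    return count
```

JSONL keeps nested fields such as conversations intact, whereas a CSV export has to flatten them first.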
|
| 81 |
+
|
| 82 |
+
## Why this is a good fit
|
| 83 |
+
|
| 84 |
+
The original source dataset can still be loaded with `datasets.load_dataset(...)`, and the Hugging Face ecosystem is designed for pushing and versioning datasets directly on the Hub. The `datasets` library also provides a `push_to_hub()` path for dataset publishing, while `huggingface_hub` provides lower-level file upload methods when you want more control over file layout.
|
hf-space/hf-space/hf-space/hf-space/hf-space/app.py
ADDED
|
@@ -0,0 +1,853 @@
|
| 1 |
+
from __future__ import annotations
|
| 2 |
+
|
| 3 |
+
import json
|
| 4 |
+
import os
|
| 5 |
+
import uuid
|
| 6 |
+
from datetime import datetime, timezone
|
| 7 |
+
from pathlib import Path
|
| 8 |
+
from typing import Any, Dict, List, Optional, Tuple
|
| 9 |
+
|
| 10 |
+
import pandas as pd
|
| 11 |
+
import streamlit as st
|
| 12 |
+
from datasets import load_dataset
|
| 13 |
+
from huggingface_hub import HfApi, hf_hub_download
|
| 14 |
+
|
| 15 |
+
APP_TITLE = "🧭 LLM Annotation Platform"
|
| 16 |
+
DEFAULT_SOURCE_DATASET = os.environ.get(
|
| 17 |
+
"SOURCE_DATASET_REPO",
|
| 18 |
+
"nvidia/CantTalkAboutThis-Topic-Control-Dataset",
|
| 19 |
+
)
|
| 20 |
+
DEFAULT_SOURCE_SPLIT = os.environ.get("SOURCE_DATASET_SPLIT", "train")
|
| 21 |
+
DEFAULT_ANNOTATION_REPO = os.environ.get(
|
| 22 |
+
"ANNOTATION_REPO_ID",
|
| 23 |
+
"YOUR_ORG/llm-distractor-annotations",
|
| 24 |
+
)
|
| 25 |
+
DEFAULT_CACHE_DIR = Path(os.environ.get("CACHE_DIR", "/data/hf_annotation_cache"))
|
| 26 |
+
DEFAULT_CACHE_DIR.mkdir(parents=True, exist_ok=True)
|
| 27 |
+
LOCAL_DRAFT_DIR = Path(os.environ.get("DRAFT_DIR", "/data/hf_annotation_drafts"))
|
| 28 |
+
LOCAL_DRAFT_DIR.mkdir(parents=True, exist_ok=True)
|
| 29 |
+
LOCAL_EXPORT_DIR = Path(os.environ.get("EXPORT_DIR", "/data/hf_annotation_exports"))
|
| 30 |
+
LOCAL_EXPORT_DIR.mkdir(parents=True, exist_ok=True)
|
| 31 |
+
|
| 32 |
+
LABEL_OPTIONS = {
|
| 33 |
+
"distractor_kind": [
|
| 34 |
+
"benign off-topic",
|
| 35 |
+
"smooth bridge",
|
| 36 |
+
"policy-evasive",
|
| 37 |
+
"roleplay / impersonation",
|
| 38 |
+
"pressure / persistence",
|
| 39 |
+
"urgency / time pressure",
|
| 40 |
+
"loophole seeking",
|
| 41 |
+
"multi-turn escalation",
|
| 42 |
+
"other",
|
| 43 |
+
],
|
| 44 |
+
"assistant_behavior": [
|
| 45 |
+
"perfect refusal + redirect",
|
| 46 |
+
"helpful redirection",
|
| 47 |
+
"partial engagement",
|
| 48 |
+
"full engagement / derailment",
|
| 49 |
+
"over-refusal",
|
| 50 |
+
"unclear",
|
| 51 |
+
],
|
| 52 |
+
"transition_style": [
|
| 53 |
+
"abrupt",
|
| 54 |
+
"smooth bridge",
|
| 55 |
+
"follow-up clarification",
|
| 56 |
+
"rephrasing",
|
| 57 |
+
"escalation",
|
| 58 |
+
"roleplay",
|
| 59 |
+
"ambiguity exploitation",
|
| 60 |
+
"other",
|
| 61 |
+
],
|
| 62 |
+
"policy_target": [
|
| 63 |
+
"medical advice",
|
| 64 |
+
"financial advice",
|
| 65 |
+
"legal advice",
|
| 66 |
+
"competitor discussion",
|
| 67 |
+
"politics",
|
| 68 |
+
"unsafe content",
|
| 69 |
+
"personal data",
|
| 70 |
+
"company-specific policy",
|
| 71 |
+
"tone / style policy",
|
| 72 |
+
"other",
|
| 73 |
+
],
|
| 74 |
+
}
|
| 75 |
+
|
| 76 |
+
|
| 77 |
+
def now_iso() -> str:
|
| 78 |
+
return datetime.now(timezone.utc).isoformat()
|
| 79 |
+
|
| 80 |
+
|
| 81 |
+
def token() -> Optional[str]:
|
| 82 |
+
return os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACE_HUB_TOKEN")
|
| 83 |
+
|
| 84 |
+
|
| 85 |
+
def api() -> HfApi:
|
| 86 |
+
return HfApi(token=token())
|
| 87 |
+
|
| 88 |
+
|
| 89 |
+
def annotation_file_name(item_id: str, annotator: str) -> str:
|
| 90 |
+
safe_annotator = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in annotator.strip().lower()) or "annotator"
|
| 91 |
+
safe_item = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in item_id.strip()) or "item"
|
| 92 |
+
stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
|
| 93 |
+
return f"annotations/{safe_annotator}/{stamp}_{safe_item}_{uuid.uuid4().hex[:8]}.json"
|
| 94 |
+
|
| 95 |
+
|
| 96 |
+
def draft_path(annotator: str) -> Path:
|
| 97 |
+
safe_annotator = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in annotator.strip().lower()) or "annotator"
|
| 98 |
+
return LOCAL_DRAFT_DIR / f"{safe_annotator}.json"
|
| 99 |
+
|
| 100 |
+
|
| 101 |
+
def cache_annotations_dir() -> Path:
|
| 102 |
+
path = DEFAULT_CACHE_DIR / "annotations_snapshot"
|
| 103 |
+
path.mkdir(parents=True, exist_ok=True)
|
| 104 |
+
return path
|
| 105 |
+
|
| 106 |
+
|
| 107 |
+
def ensure_repo_exists(repo_id: str) -> None:
|
| 108 |
+
if repo_id.startswith("YOUR_ORG/") or not repo_id.strip():
|
| 109 |
+
return
|
| 110 |
+
api().create_repo(repo_id=repo_id, repo_type="dataset", private=True, exist_ok=True)
|
| 111 |
+
|
| 112 |
+
|
| 113 |
+
def load_source_dataset(repo_id: str, split: str) -> List[Dict[str, Any]]:
|
| 114 |
+
ds = load_dataset(repo_id, split=split)
|
| 115 |
+
return [dict(row) for row in ds]
|
| 116 |
+
|
| 117 |
+
|
| 118 |
+
def normalize_turns(turns: Any) -> List[Dict[str, Any]]:
|
| 119 |
+
if turns is None:
|
| 120 |
+
return []
|
| 121 |
+
if isinstance(turns, str):
|
| 122 |
+
try:
|
| 123 |
+
turns = json.loads(turns)
|
| 124 |
+
except Exception:
|
| 125 |
+
return []
|
| 126 |
+
if not isinstance(turns, list):
|
| 127 |
+
return []
|
| 128 |
+
out = []
|
| 129 |
+
for turn in turns:
|
| 130 |
+
if isinstance(turn, dict):
|
| 131 |
+
role = turn.get("role") or turn.get("speaker") or turn.get("type") or "unknown"
|
| 132 |
+
content = turn.get("content") or turn.get("text") or turn.get("utterance") or ""
|
| 133 |
+
out.append({"role": str(role), "content": str(content)})
|
| 134 |
+
else:
|
| 135 |
+
out.append({"role": "unknown", "content": str(turn)})
|
| 136 |
+
return out
|
| 137 |
+
|
| 138 |
+
|
| 139 |
+
def safe_sample_id(record: Dict[str, Any], fallback_index: int) -> str:
|
| 140 |
+
for key in ("sample_id", "id", "_id", "row_id"):
|
| 141 |
+
if record.get(key) not in (None, ""):
|
| 142 |
+
return str(record[key])
|
| 143 |
+
domain = str(record.get("domain", "sample")).replace(" ", "_")
|
| 144 |
+
scenario = str(record.get("scenario", "")).replace(" ", "_")
|
| 145 |
+
return f"{domain}-{scenario}-{fallback_index}"
|
| 146 |
+
|
| 147 |
+
|
| 148 |
+
def expand_record(record: Dict[str, Any], idx: int) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
|
| 149 |
+
sample_id = safe_sample_id(record, idx)
|
| 150 |
+
conversation = normalize_turns(record.get("conversation"))
|
| 151 |
+
distractors = record.get("distractors") or []
|
| 152 |
+
if isinstance(distractors, str):
|
| 153 |
+
try:
|
| 154 |
+
distractors = json.loads(distractors)
|
| 155 |
+
except Exception:
|
| 156 |
+
distractors = []
|
| 157 |
+
if not isinstance(distractors, list):
|
| 158 |
+
distractors = []
|
| 159 |
+
|
| 160 |
+
sample = {
|
| 161 |
+
"sample_id": sample_id,
|
| 162 |
+
"domain": str(record.get("domain", "")),
|
| 163 |
+
"scenario": str(record.get("scenario", "")),
|
| 164 |
+
"system_instruction": str(record.get("system_instruction", "")),
|
| 165 |
+
"conversation_json": json.dumps(conversation, ensure_ascii=False),
|
| 166 |
+
"distractors_json": json.dumps(distractors, ensure_ascii=False),
|
| 167 |
+
"conversation_with_distractors_json": json.dumps(record.get("conversation_with_distractors", []), ensure_ascii=False),
|
| 168 |
+
"raw_json": json.dumps(record, ensure_ascii=False),
|
| 169 |
+
}
|
| 170 |
+
|
| 171 |
+
items = []
|
| 172 |
+
for distractor_index, d in enumerate(distractors):
|
| 173 |
+
bot_turn = ""
|
| 174 |
+
distractor_text = ""
|
| 175 |
+
if isinstance(d, dict):
|
| 176 |
+
bot_turn = str(
|
| 177 |
+
d.get("bot turn")
|
| 178 |
+
or d.get("bot_turn")
|
| 179 |
+
or d.get("assistant_turn")
|
| 180 |
+
or d.get("assistant")
|
| 181 |
+
or ""
|
| 182 |
+
)
|
| 183 |
+
distractor_text = str(
|
| 184 |
+
d.get("distractor")
|
| 185 |
+
or d.get("distractor user turn")
|
| 186 |
+
or d.get("user_turn")
|
| 187 |
+
or d.get("user")
|
| 188 |
+
or d.get("text")
|
| 189 |
+
or ""
|
| 190 |
+
)
|
| 191 |
+
else:
|
| 192 |
+
distractor_text = str(d)
|
| 193 |
+
|
| 194 |
+
items.append(
|
| 195 |
+
{
|
| 196 |
+
"item_id": f"{sample_id}::{distractor_index}",
|
| 197 |
+
"sample_id": sample_id,
|
| 198 |
+
"distractor_index": distractor_index,
|
| 199 |
+
"bot_turn": bot_turn,
|
| 200 |
+
"distractor_text": distractor_text,
|
| 201 |
+
}
|
| 202 |
+
)
|
| 203 |
+
return sample, items
|
| 204 |
+
|
| 205 |
+
|
| 206 |
+
def seed_source_index(records: List[Dict[str, Any]]) -> Tuple[pd.DataFrame, pd.DataFrame]:
|
| 207 |
+
samples = []
|
| 208 |
+
items = []
|
| 209 |
+
for idx, record in enumerate(records):
|
| 210 |
+
sample, record_items = expand_record(record, idx)
|
| 211 |
+
samples.append(sample)
|
| 212 |
+
items.extend(record_items)
|
| 213 |
+
return pd.DataFrame(samples), pd.DataFrame(items)
|
| 214 |
+
|
| 215 |
+
|
| 216 |
+
def read_json_file(path: Path) -> Dict[str, Any]:
|
| 217 |
+
with path.open("r", encoding="utf-8") as f:
|
| 218 |
+
return json.load(f)
|
| 219 |
+
|
| 220 |
+
|
| 221 |
+
def load_all_hub_annotations(annotation_repo_id: str) -> pd.DataFrame:
|
| 222 |
+
"""
|
| 223 |
+
Each submission is stored as a separate JSON file, which avoids write conflicts.
|
| 224 |
+
"""
|
| 225 |
+
if annotation_repo_id.startswith("YOUR_ORG/") or not annotation_repo_id.strip():
|
| 226 |
+
return pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
|
| 227 |
+
|
| 228 |
+
cache_dir = cache_annotations_dir()
|
| 229 |
+
file_list = api().list_repo_files(annotation_repo_id, repo_type="dataset")
|
| 230 |
+
ann_files = [f for f in file_list if f.startswith("annotations/") and f.endswith(".json")]
|
| 231 |
+
|
| 232 |
+
rows = []
|
| 233 |
+
for file_path in ann_files:
|
| 234 |
+
try:
|
| 235 |
+
local_path = hf_hub_download(
|
| 236 |
+
repo_id=annotation_repo_id,
|
| 237 |
+
repo_type="dataset",
|
| 238 |
+
filename=file_path,
|
| 239 |
+
token=token(),
|
| 240 |
+
local_dir=str(cache_dir),
|
| 241 |
+
local_dir_use_symlinks=False,
|
| 242 |
+
)
|
| 243 |
+
payload = read_json_file(Path(local_path))
|
| 244 |
+
rows.append(
|
| 245 |
+
{
|
| 246 |
+
"item_id": payload.get("item_id", ""),
|
| 247 |
+
"sample_id": payload.get("sample_id", ""),
|
| 248 |
+
"annotator": payload.get("annotator", ""),
|
| 249 |
+
"labels": payload.get("labels", {}),
|
| 250 |
+
"notes": payload.get("notes", ""),
|
| 251 |
+
"status": payload.get("status", "submitted"),
|
| 252 |
+
"created_at": payload.get("created_at", ""),
|
| 253 |
+
"file_path": file_path,
|
| 254 |
+
}
|
| 255 |
+
)
|
| 256 |
+
except Exception as e:
|
| 257 |
+
rows.append(
|
| 258 |
+
{
|
| 259 |
+
"item_id": "",
|
| 260 |
+
"sample_id": "",
|
| 261 |
+
"annotator": "",
|
| 262 |
+
"labels": {},
|
| 263 |
+
"notes": f"Failed to load {file_path}: {e}",
|
| 264 |
+
"status": "load_error",
|
| 265 |
+
"created_at": "",
|
| 266 |
+
"file_path": file_path,
|
| 267 |
+
}
|
| 268 |
+
)
|
| 269 |
+
|
| 270 |
+
return pd.DataFrame(rows) if rows else pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
|
| 271 |
+
|
| 272 |
+
|
| 273 |
+
def save_draft(annotator: str, payload: Dict[str, Any]) -> Path:
|
| 274 |
+
path = draft_path(annotator)
|
| 275 |
+
path.parent.mkdir(parents=True, exist_ok=True)
|
| 276 |
+
with path.open("w", encoding="utf-8") as f:
|
| 277 |
+
json.dump(payload, f, ensure_ascii=False, indent=2)
|
| 278 |
+
return path
|
| 279 |
+
|
| 280 |
+
|
| 281 |
+
def load_draft(annotator: str) -> Dict[str, Any]:
|
| 282 |
+
path = draft_path(annotator)
|
| 283 |
+
if not path.exists():
|
| 284 |
+
return {}
|
| 285 |
+
try:
|
| 286 |
+
return read_json_file(path)
|
| 287 |
+
except Exception:
|
| 288 |
+
return {}
|
| 289 |
+
|
| 290 |
+
|
| 291 |
+
def build_labels_from_state(prefix: str = "") -> Dict[str, Any]:
|
| 292 |
+
return {
|
| 293 |
+
"distractor_kind": st.session_state.get(f"{prefix}distractor_kind", LABEL_OPTIONS["distractor_kind"][0]),
|
| 294 |
+
"transition_style": st.session_state.get(f"{prefix}transition_style", LABEL_OPTIONS["transition_style"][0]),
|
| 295 |
+
"policy_target": st.session_state.get(f"{prefix}policy_target", []),
|
| 296 |
+
"difficulty": int(st.session_state.get(f"{prefix}difficulty", 3)),
|
| 297 |
+
"realism": int(st.session_state.get(f"{prefix}realism", 3)),
|
| 298 |
+
"assistant_behavior": st.session_state.get(f"{prefix}assistant_behavior", LABEL_OPTIONS["assistant_behavior"][0]),
|
| 299 |
+
"multi_turn_escalation": bool(st.session_state.get(f"{prefix}multi_turn_escalation", False)),
|
| 300 |
+
"rule_followed": bool(st.session_state.get(f"{prefix}rule_followed", True)),
|
| 301 |
+
"needs_review": bool(st.session_state.get(f"{prefix}needs_review", False)),
|
| 302 |
+
"confidence": int(st.session_state.get(f"{prefix}confidence", 3)),
|
| 303 |
+
}
|
| 304 |
+
|
| 305 |
+
|
| 306 |
+
def preview_text(text: str, limit: int = 280) -> str:
|
| 307 |
+
txt = (text or "").strip().replace("\n", " ")
|
| 308 |
+
if len(txt) <= limit:
|
| 309 |
+
return txt
|
| 310 |
+
return txt[:limit - 1] + "…"
|
| 311 |
+
|
| 312 |
+
|
| 313 |
+
def render_turns(turns: List[Dict[str, Any]]) -> None:
|
| 314 |
+
if not turns:
|
| 315 |
+
st.info("No conversation turns found.")
|
| 316 |
+
return
|
| 317 |
+
for i, turn in enumerate(turns, 1):
|
| 318 |
+
role = str(turn.get("role", "unknown")).lower()
|
| 319 |
+
content = str(turn.get("content", "")).strip()
|
| 320 |
+
css_cls = "user" if role == "user" else "assistant" if role in {"assistant", "bot"} else "system"
|
| 321 |
+
st.markdown(
|
| 322 |
+
f"""
|
| 323 |
+
<div class="turn {css_cls}">
|
| 324 |
+
<span class="badge">{role.upper()}</span>
|
| 325 |
+
<span class="smallmono">Turn {i}</span>
|
| 326 |
+
<div style="margin-top:0.35rem; white-space:pre-wrap;">{content.replace(chr(10), '<br>')}</div>
|
| 327 |
+
</div>
|
| 328 |
+
""",
|
| 329 |
+
unsafe_allow_html=True,
|
| 330 |
+
)
|
| 331 |
+
|
| 332 |
+
|
| 333 |
+
def annotation_exists_for_item(df_anns: pd.DataFrame, item_id: str, annotator: str) -> bool:
|
| 334 |
+
if df_anns.empty:
|
| 335 |
+
return False
|
| 336 |
+
sub = df_anns[(df_anns["item_id"] == item_id) & (df_anns["annotator"] == annotator)]
|
| 337 |
+
return not sub.empty
|
| 338 |
+
|
| 339 |
+
|
| 340 |
+
def compute_agreement(df_anns: pd.DataFrame, label_key: str = "assistant_behavior") -> Dict[str, Any]:
|
| 341 |
+
if df_anns.empty:
|
| 342 |
+
return {"paired_items": 0, "raw_agreement": None, "cohen_kappa": None}
|
| 343 |
+
|
| 344 |
+
rows = []
|
| 345 |
+
for _, r in df_anns.iterrows():
|
| 346 |
+
labels = r.get("labels", {}) or {}
|
| 347 |
+
rows.append({"item_id": r["item_id"], "annotator": r["annotator"], label_key: labels.get(label_key)})
|
| 348 |
+
tmp = pd.DataFrame(rows)
|
| 349 |
+
pivot = tmp.pivot_table(index="item_id", columns="annotator", values=label_key, aggfunc="first")
|
| 350 |
+
pivot = pivot.dropna(axis=0, how="any")
|
| 351 |
+
if pivot.shape[0] < 2 or pivot.shape[1] < 2:
|
| 352 |
+
return {"paired_items": int(pivot.shape[0]), "raw_agreement": None, "cohen_kappa": None}
|
| 353 |
+
|
| 354 |
+
from sklearn.metrics import cohen_kappa_score
|
| 355 |
+
|
| 356 |
+
a = pivot.iloc[:, 0].astype(str)
|
| 357 |
+
b = pivot.iloc[:, 1].astype(str)
|
| 358 |
+
return {
|
| 359 |
+
"paired_items": int(pivot.shape[0]),
|
| 360 |
+
"raw_agreement": float((a == b).mean()),
|
| 361 |
+
"cohen_kappa": float(cohen_kappa_score(a, b)),
|
| 362 |
+
}
|
| 363 |
+
|
| 364 |
+
|
| 365 |
+
def push_annotation_to_hub(annotation_repo_id: str, payload: Dict[str, Any]) -> str:
|
| 366 |
+
ensure_repo_exists(annotation_repo_id)
|
| 367 |
+
file_rel_path = annotation_file_name(payload["item_id"], payload["annotator"])
|
| 368 |
+
local_path = LOCAL_DRAFT_DIR / file_rel_path.replace("/", "__")
|
| 369 |
+
local_path.parent.mkdir(parents=True, exist_ok=True)
|
| 370 |
+
with local_path.open("w", encoding="utf-8") as f:
|
| 371 |
+
json.dump(payload, f, ensure_ascii=False, indent=2)
|
| 372 |
+
|
| 373 |
+
api().upload_file(
|
| 374 |
+
path_or_fileobj=str(local_path),
|
| 375 |
+
path_in_repo=file_rel_path,
|
| 376 |
+
repo_id=annotation_repo_id,
|
| 377 |
+
repo_type="dataset",
|
| 378 |
+
token=token(),
|
| 379 |
+
commit_message=f"Add annotation for {payload['item_id']} by {payload['annotator']}",
|
| 380 |
+
)
|
| 381 |
+
return file_rel_path
|
| 382 |
+
|
| 383 |
+
|
| 384 |
+
def get_current_item_id() -> Optional[str]:
|
| 385 |
+
return st.session_state.get("current_item_id")
|
| 386 |
+
|
| 387 |
+
|
| 388 |
+
def set_current_item_id(item_id: Optional[str]) -> None:
|
| 389 |
+
st.session_state["current_item_id"] = item_id
|
| 390 |
+
try:
|
| 391 |
+
st.query_params["item_id"] = item_id or ""
|
| 392 |
+
except Exception:
|
| 393 |
+
pass
|
| 394 |
+
|
| 395 |
+
|
| 396 |
+
def main() -> None:
|
| 397 |
+
st.set_page_config(page_title="LLM Annotation Platform", page_icon="🧭", layout="wide")
|
| 398 |
+
st.markdown(
|
| 399 |
+
"""
|
| 400 |
+
<style>
|
| 401 |
+
.block-container {padding-top: 1rem; padding-bottom: 2rem;}
|
| 402 |
+
.smallmono {font-size: 0.84rem; font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", monospace;}
|
| 403 |
+
.cardbox {
|
| 404 |
+
border: 1px solid rgba(120,120,120,0.22);
|
| 405 |
+
border-radius: 18px;
|
| 406 |
+
padding: 1rem 1rem 0.75rem 1rem;
|
| 407 |
+
background: rgba(255,255,255,0.03);
|
| 408 |
+
}
|
| 409 |
+
.turn {
|
| 410 |
+
border-left: 4px solid rgba(120,120,120,0.45);
|
| 411 |
+
padding: 0.6rem 0.85rem;
|
| 412 |
+
margin: 0.55rem 0;
|
| 413 |
+
border-radius: 0.6rem;
|
| 414 |
+
background: rgba(128,128,128,0.06);
|
| 415 |
+
}
|
| 416 |
+
.turn.user {border-left-color: #8b5cf6;}
|
| 417 |
+
.turn.assistant, .turn.bot {border-left-color: #06b6d4;}
|
| 418 |
+
.turn.system {border-left-color: #f59e0b;}
|
| 419 |
+
.badge {
|
| 420 |
+
display:inline-block; padding:0.18rem 0.5rem; border-radius: 999px;
|
| 421 |
+
background: rgba(120,120,120,0.16); margin-right: 0.35rem; font-size: 0.78rem;
|
| 422 |
+
}
|
| 423 |
+
hr {margin: 0.7rem 0 0.9rem 0;}
|
| 424 |
+
</style>
|
| 425 |
+
""",
|
| 426 |
+
unsafe_allow_html=True,
|
| 427 |
+
)
|
| 428 |
+
|
| 429 |
+
st.title(APP_TITLE)
|
| 430 |
+
st.caption("A Hugging Face–native annotation tool for multi-turn distractors, inter-rater review, and dataset versioning.")
|
| 431 |
+
|
| 432 |
+
if "annotator" not in st.session_state:
|
| 433 |
+
st.session_state["annotator"] = "annotator_1"
|
| 434 |
+
if "current_item_id" not in st.session_state:
|
| 435 |
+
st.session_state["current_item_id"] = None
|
| 436 |
+
if "source_records" not in st.session_state:
|
| 437 |
+
st.session_state["source_records"] = None
|
| 438 |
+
if "source_index" not in st.session_state:
|
| 439 |
+
st.session_state["source_index"] = None
|
| 440 |
+
if "annotations_df" not in st.session_state:
|
| 441 |
+
st.session_state["annotations_df"] = None
|
| 442 |
+
if "draft_loaded" not in st.session_state:
|
| 443 |
+
st.session_state["draft_loaded"] = False
|
| 444 |
+
|
| 445 |
+
with st.sidebar:
|
| 446 |
+
st.header("Workspace")
|
| 447 |
+
annotator = st.text_input("Annotator name", value=st.session_state["annotator"])
|
| 448 |
+
st.session_state["annotator"] = annotator.strip() or "annotator_1"
|
| 449 |
+
|
| 450 |
+
source_repo = st.text_input("Source dataset repo", value=DEFAULT_SOURCE_DATASET)
|
| 451 |
+
source_split = st.text_input("Source split", value=DEFAULT_SOURCE_SPLIT)
|
| 452 |
+
annotation_repo = st.text_input("Annotation dataset repo", value=DEFAULT_ANNOTATION_REPO)
|
| 453 |
+
|
| 454 |
+
st.divider()
|
| 455 |
+
st.caption("HF token is needed for uploads, repo creation, and reading a private annotation repo.")
|
| 456 |
+
st.write("HF token present:", "yes" if token() else "no")
|
| 457 |
+
st.write("Cache:", str(DEFAULT_CACHE_DIR))
|
| 458 |
+
st.write("Drafts:", str(LOCAL_DRAFT_DIR))
|
| 459 |
+
|
| 460 |
+
if st.button("Reload Hub data", use_container_width=True):
|
| 461 |
+
st.session_state["source_records"] = None
|
| 462 |
+
st.session_state["source_index"] = None
|
| 463 |
+
st.session_state["annotations_df"] = None
|
| 464 |
+
st.rerun()
|
| 465 |
+
|
| 466 |
+
page = st.radio("Page", ["Annotate", "Review", "Dashboard", "Export"], index=0)
|
| 467 |
+
|
| 468 |
+
if st.session_state["source_records"] is None:
|
| 469 |
+
with st.spinner("Loading source dataset from the Hub..."):
|
| 470 |
+
source_records = load_source_dataset(source_repo, source_split)
|
| 471 |
+
samples_df, items_df = seed_source_index(source_records)
|
| 472 |
+
st.session_state["source_records"] = source_records
|
| 473 |
+
st.session_state["source_index"] = {"samples_df": samples_df, "items_df": items_df}
|
| 474 |
+
|
| 475 |
+
if st.session_state["annotations_df"] is None:
|
| 476 |
+
with st.spinner("Loading annotations from the annotation dataset repo..."):
|
| 477 |
+
try:
|
| 478 |
+
anns_df = load_all_hub_annotations(annotation_repo)
|
| 479 |
+
except Exception as e:
|
| 480 |
+
anns_df = pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
|
| 481 |
+
st.warning(f"Could not load annotations from Hub yet: {e}")
|
| 482 |
+
st.session_state["annotations_df"] = anns_df
|
| 483 |
+
|
| 484 |
+
samples_df = st.session_state["source_index"]["samples_df"]
|
| 485 |
+
items_df = st.session_state["source_index"]["items_df"]
|
| 486 |
+
anns_df = st.session_state["annotations_df"]
|
| 487 |
+
|
| 488 |
+
if not st.session_state["draft_loaded"]:
|
| 489 |
+
try:
|
| 490 |
+
q_item = st.query_params.get("item_id")
|
| 491 |
+
except Exception:
|
| 492 |
+
q_item = None
|
| 493 |
+
if q_item:
|
| 494 |
+
st.session_state["current_item_id"] = q_item
|
| 495 |
+
draft = load_draft(st.session_state["annotator"])
|
| 496 |
+
if draft.get("current_item_id") and not st.session_state["current_item_id"]:
|
| 497 |
+
st.session_state["current_item_id"] = draft["current_item_id"]
|
| 498 |
+
st.session_state["draft_loaded"] = True
|
| 499 |
+
|
| 500 |
+
my_annotated_item_ids = set(
|
| 501 |
+
anns_df.loc[anns_df["annotator"] == st.session_state["annotator"], "item_id"].dropna().astype(str).tolist()
|
| 502 |
+
) if not anns_df.empty else set()
|
| 503 |
+
|
| 504 |
+
def current_item_row() -> Optional[Dict[str, Any]]:
|
| 505 |
+
item_id = get_current_item_id()
|
| 506 |
+
if not item_id:
|
| 507 |
+
return None
|
| 508 |
+
match = items_df[items_df["item_id"] == item_id]
|
| 509 |
+
if match.empty:
|
| 510 |
+
return None
|
| 511 |
+
row = match.iloc[0].to_dict()
|
| 512 |
+
sample = samples_df[samples_df["sample_id"] == row["sample_id"]]
|
| 513 |
+
if not sample.empty:
|
| 514 |
+
row.update(sample.iloc[0].to_dict())
|
| 515 |
+
return row
|
| 516 |
+
|
| 517 |
+
def queue_df() -> pd.DataFrame:
|
| 518 |
+
return items_df[~items_df["item_id"].astype(str).isin(my_annotated_item_ids)].copy()
|
| 519 |
+
|
| 520 |
+
if page == "Annotate":
|
| 521 |
+
st.subheader("Annotate a distractor item")
|
| 522 |
+
left, right = st.columns([1.05, 0.95], gap="large")
|
| 523 |
+
|
| 524 |
+
with left:
|
| 525 |
+
top_a, top_b, top_c = st.columns([1, 1, 1])
|
| 526 |
+
with top_a:
|
| 527 |
+
if st.button("Claim next item", use_container_width=True):
|
| 528 |
+
q = queue_df()
|
| 529 |
+
if q.empty:
|
| 530 |
+
st.warning("No remaining items in your queue.")
|
| 531 |
+
else:
|
| 532 |
+
set_current_item_id(q.iloc[0]["item_id"])
|
| 533 |
+
st.rerun()
|
| 534 |
+
with top_b:
|
| 535 |
+
if st.button("Reload annotations from Hub", use_container_width=True):
|
| 536 |
+
st.session_state["annotations_df"] = load_all_hub_annotations(annotation_repo)
|
| 537 |
+
st.rerun()
|
| 538 |
+
with top_c:
|
| 539 |
+
if st.button("Clear current", use_container_width=True):
|
| 540 |
+
set_current_item_id(None)
|
| 541 |
+
st.rerun()
|
| 542 |
+
|
| 543 |
+
item = current_item_row()
|
| 544 |
+
if item is None:
|
| 545 |
+
st.info("Claim an item to start. The app keeps a per-annotator queue so multiple people can work in parallel.")
|
| 546 |
+
q = queue_df().head(10)
|
| 547 |
+
if not q.empty:
|
| 548 |
+
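display = q.merge(samples_df[["sample_id", "domain", "scenario"]], on="sample_id", how="left")  # items_df has no domain/scenario columns
|
| 549 |
+
display = display.assign(preview=display["distractor_text"].map(preview_text))[["item_id", "sample_id", "domain", "scenario", "distractor_index", "preview"]]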
|
| 550 |
+
st.dataframe(display, use_container_width=True, hide_index=True)
|
| 551 |
+
return
|
| 552 |
+
|
| 553 |
+
st.markdown(
|
| 554 |
+
f"""
|
| 555 |
+
<div class="cardbox">
|
| 556 |
+
<div><span class="badge">Domain</span> {item.get("domain", "")}</div>
|
| 557 |
+
<div style="margin-top:0.35rem;"><span class="badge">Scenario</span> {item.get("scenario", "")}</div>
|
| 558 |
+
<div style="margin-top:0.35rem;"><span class="badge">Sample</span> <span class="smallmono">{item.get("sample_id", "")}</span></div>
|
| 559 |
+
<div style="margin-top:0.35rem;"><span class="badge">Item</span> <span class="smallmono">{item.get("item_id", "")}</span></div>
|
| 560 |
+
</div>
|
| 561 |
+
""",
|
| 562 |
+
unsafe_allow_html=True,
|
| 563 |
+
)
|
| 564 |
+
st.divider()
|
| 565 |
+
|
| 566 |
+
tabs = st.tabs(["Context", "Distractor", "Existing annotations"])
|
| 567 |
+
with tabs[0]:
|
| 568 |
+
st.markdown("**System instruction**")
|
| 569 |
+
st.code(item.get("system_instruction", ""), language="text")
|
| 570 |
+
st.markdown("**Conversation**")
|
| 571 |
+
render_turns(json.loads(item.get("conversation_json", "[]")))
|
| 572 |
+
with tabs[1]:
|
| 573 |
+
st.markdown("**Previous assistant turn**")
|
| 574 |
+
st.code(item.get("bot_turn", "") or "(missing)", language="text")
|
| 575 |
+
st.markdown("**Distractor user turn**")
|
| 576 |
+
st.code(item.get("distractor_text", "") or "(missing)", language="text")
|
| 577 |
+
with tabs[2]:
|
| 578 |
+
existing = anns_df[anns_df["item_id"] == item["item_id"]].copy()
|
| 579 |
+
if existing.empty:
|
| 580 |
+
st.caption("No annotations yet.")
|
| 581 |
+
else:
|
| 582 |
+
for _, row in existing.iterrows():
|
| 583 |
+
st.write(f"**{row['annotator']}** · {row['status']} · {row['created_at']}")
|
| 584 |
+
st.json(row["labels"])
|
| 585 |
+
if row.get("notes"):
|
| 586 |
+
st.caption(row["notes"])
|
| 587 |
+
st.divider()
|
| 588 |
+
|
| 589 |
+
with right:
|
| 590 |
+
st.markdown("### Annotation form")
|
| 591 |
+
current_draft = load_draft(st.session_state["annotator"])
|
| 592 |
+
draft_labels = current_draft.get("labels", {}) if current_draft else {}
|
| 593 |
+
|
| 594 |
+
with st.form("annotation_form", clear_on_submit=False):
|
| 595 |
+
st.selectbox(
|
| 596 |
+
"Distractor kind",
|
| 597 |
+
LABEL_OPTIONS["distractor_kind"],
|
| 598 |
+
index=LABEL_OPTIONS["distractor_kind"].index(draft_labels.get("distractor_kind", LABEL_OPTIONS["distractor_kind"][0]))
|
| 599 |
+
if draft_labels.get("distractor_kind") in LABEL_OPTIONS["distractor_kind"]
|
| 600 |
+
else 0,
|
| 601 |
+
key="distractor_kind",
|
| 602 |
+
)
|
| 603 |
+
st.selectbox(
|
| 604 |
+
"Transition style",
|
| 605 |
+
LABEL_OPTIONS["transition_style"],
|
| 606 |
+
index=LABEL_OPTIONS["transition_style"].index(draft_labels.get("transition_style", LABEL_OPTIONS["transition_style"][0]))
|
| 607 |
+
if draft_labels.get("transition_style") in LABEL_OPTIONS["transition_style"]
|
| 608 |
+
else 0,
|
| 609 |
+
key="transition_style",
|
| 610 |
+
)
|
| 611 |
+
st.multiselect(
|
| 612 |
+
"Policy target(s)",
|
| 613 |
+
LABEL_OPTIONS["policy_target"],
|
| 614 |
+
default=[p for p in (draft_labels.get("policy_target") or []) if p in LABEL_OPTIONS["policy_target"]],
|
| 615 |
+
key="policy_target",
|
| 616 |
+
)
|
| 617 |
+
c1, c2 = st.columns(2)
|
| 618 |
+
with c1:
|
| 619 |
+
st.slider("Difficulty", 1, 5, value=int(draft_labels.get("difficulty", 3)), key="difficulty")
|
| 620 |
+
st.slider("Realism", 1, 5, value=int(draft_labels.get("realism", 3)), key="realism")
|
| 621 |
+
with c2:
|
| 622 |
+
st.selectbox(
|
| 623 |
+
"Assistant behavior",
|
| 624 |
+
LABEL_OPTIONS["assistant_behavior"],
|
| 625 |
+
index=LABEL_OPTIONS["assistant_behavior"].index(draft_labels.get("assistant_behavior", LABEL_OPTIONS["assistant_behavior"][0]))
|
| 626 |
+
if draft_labels.get("assistant_behavior") in LABEL_OPTIONS["assistant_behavior"]
|
| 627 |
+
else 0,
|
| 628 |
+
key="assistant_behavior",
|
| 629 |
+
)
|
| 630 |
+
st.slider("Confidence", 1, 5, value=int(draft_labels.get("confidence", 3)), key="confidence")
|
| 631 |
+
|
| 632 |
+
st.checkbox(
|
| 633 |
+
"Multi-turn escalation / persistence",
|
| 634 |
+
value=bool(draft_labels.get("multi_turn_escalation", False)),
|
| 635 |
+
key="multi_turn_escalation",
|
| 636 |
+
)
|
| 637 |
+
st.checkbox(
|
| 638 |
+
"Assistant followed the rule",
|
| 639 |
+
value=bool(draft_labels.get("rule_followed", True)),
|
| 640 |
+
key="rule_followed",
|
| 641 |
+
)
|
| 642 |
+
st.checkbox(
|
| 643 |
+
"Borderline / needs review",
|
| 644 |
+
value=bool(draft_labels.get("needs_review", False)),
|
| 645 |
+
key="needs_review",
|
| 646 |
+
)
|
| 647 |
+
notes = st.text_area(
|
| 648 |
+
"Notes",
|
| 649 |
+
value=current_draft.get("notes", ""),
|
| 650 |
+
height=150,
|
| 651 |
+
placeholder="Explain ambiguity, likely disagreement, or policy edge cases.",
|
| 652 |
+
)
|
| 653 |
+
submitted = st.form_submit_button("Submit to Hugging Face", use_container_width=True)
|
| 654 |
+
|
| 655 |
+
c1, c2 = st.columns(2)
|
| 656 |
+
with c1:
|
| 657 |
+
if st.button("Save draft locally", use_container_width=True):
|
| 658 |
+
payload = {
|
| 659 |
+
"current_item_id": item["item_id"],
|
| 660 |
+
"labels": build_labels_from_state(),
|
| 661 |
+
"notes": notes,
|
| 662 |
+
"saved_at": now_iso(),
|
| 663 |
+
}
|
| 664 |
+
path = save_draft(st.session_state["annotator"], payload)
|
| 665 |
+
st.success(f"Draft saved to {path}")
|
| 666 |
+
with c2:
|
| 667 |
+
if st.button("Sync annotation cache", use_container_width=True):
|
| 668 |
+
st.session_state["annotations_df"] = load_all_hub_annotations(annotation_repo)
|
| 669 |
+
st.success("Reloaded annotation index from Hub.")
|
| 670 |
+
|
| 671 |
+
if submitted:
|
| 672 |
+
labels = build_labels_from_state()
|
| 673 |
+
payload = {
|
| 674 |
+
"annotation_id": str(uuid.uuid4()),
|
| 675 |
+
"item_id": item["item_id"],
|
| 676 |
+
"sample_id": item["sample_id"],
|
| 677 |
+
"annotator": st.session_state["annotator"],
|
| 678 |
+
"created_at": now_iso(),
|
| 679 |
+
"status": "submitted",
|
| 680 |
+
"labels": labels,
|
| 681 |
+
"notes": notes,
|
| 682 |
+
"source": {
|
| 683 |
+
"source_dataset_repo": source_repo,
|
| 684 |
+
"source_dataset_split": source_split,
|
| 685 |
+
"domain": item.get("domain", ""),
|
| 686 |
+
"scenario": item.get("scenario", ""),
|
| 687 |
+
"distractor_index": int(item.get("distractor_index", 0)),
|
| 688 |
+
},
|
| 689 |
+
}
|
| 690 |
+
try:
|
| 691 |
+
path_in_repo = push_annotation_to_hub(annotation_repo, payload)
|
| 692 |
+
st.session_state["annotations_df"] = pd.concat(
|
| 693 |
+
[
|
| 694 |
+
anns_df,
|
| 695 |
+
pd.DataFrame(
|
| 696 |
+
[
|
| 697 |
+
{
|
| 698 |
+
"item_id": payload["item_id"],
|
| 699 |
+
"sample_id": payload["sample_id"],
|
| 700 |
+
"annotator": payload["annotator"],
|
| 701 |
+
"labels": payload["labels"],
|
| 702 |
+
"notes": payload["notes"],
|
| 703 |
+
"status": payload["status"],
|
| 704 |
+
"created_at": payload["created_at"],
|
| 705 |
+
"file_path": path_in_repo,
|
| 706 |
+
}
|
| 707 |
+
]
|
| 708 |
+
),
|
| 709 |
+
],
|
| 710 |
+
ignore_index=True,
|
| 711 |
+
)
|
| 712 |
+
save_draft(
|
| 713 |
+
st.session_state["annotator"],
|
| 714 |
+
{
|
| 715 |
+
"current_item_id": item["item_id"],
|
| 716 |
+
"labels": labels,
|
| 717 |
+
"notes": notes,
|
| 718 |
+
"saved_at": now_iso(),
|
| 719 |
+
},
|
| 720 |
+
)
|
| 721 |
+
st.success(f"Submitted to Hugging Face as {path_in_repo}")
|
| 722 |
+
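q = queue_df().loc[lambda d: d["item_id"] != item["item_id"]]  # queue predates this submission, so drop the item just submitted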
|
| 723 |
+
if not q.empty:
|
| 724 |
+
set_current_item_id(q.iloc[0]["item_id"])
|
| 725 |
+
st.rerun()
|
| 726 |
+
except Exception as e:
|
| 727 |
+
st.error(f"Upload failed. Saved locally only. Error: {e}")
|
| 728 |
+
save_draft(
|
| 729 |
+
st.session_state["annotator"],
|
| 730 |
+
{
|
| 731 |
+
"current_item_id": item["item_id"],
|
| 732 |
+
"labels": labels,
|
| 733 |
+
"notes": notes,
|
| 734 |
+
"saved_at": now_iso(),
|
| 735 |
+
},
|
| 736 |
+
)
|
| 737 |
+
|
| 738 |
+
st.caption("Each submission is a separate file in the annotation dataset repo, so multiple annotators can work in parallel without write conflicts.")
|
| 739 |
+
|
+    elif page == "Review":
+        st.subheader("Inter-rater review")
+        multi = (
+            anns_df.groupby("item_id")["annotator"].nunique().reset_index(name="n_annotators")
+            if not anns_df.empty
+            else pd.DataFrame(columns=["item_id", "n_annotators"])
+        )
+        multi = multi[multi["n_annotators"] >= 2] if not multi.empty else multi
+
+        if multi.empty:
+            st.info("No items with at least two annotations yet.")
+        else:
+            selected_item = st.selectbox("Item with multiple annotations", multi["item_id"].tolist())
+            row = items_df[items_df["item_id"] == selected_item].iloc[0].to_dict()
+            sample = samples_df[samples_df["sample_id"] == row["sample_id"]].iloc[0].to_dict()
+            row.update(sample)
+
+            st.markdown("### Context")
+            st.code(row["system_instruction"], language="text")
+            st.code(row["bot_turn"] or "", language="text")
+            st.code(row["distractor_text"] or "", language="text")
+
+            st.markdown("### Annotations")
+            sub = anns_df[anns_df["item_id"] == selected_item].copy()
+            cols = st.columns(min(len(sub), 3)) if len(sub) > 0 else st.columns(1)
+            for idx, (_, ann) in enumerate(sub.iterrows()):
+                with cols[idx % len(cols)]:
+                    st.write(f"**{ann['annotator']}**")
+                    st.caption(f"{ann['status']} · {ann['created_at']}")
+                    st.json(ann["labels"])
+                    if ann.get("notes"):
+                        st.caption(ann["notes"])
+
+            agreement = compute_agreement(sub, label_key="assistant_behavior")
+            c1, c2, c3 = st.columns(3)
+            c1.metric("Paired items", agreement["paired_items"])
+            c2.metric("Raw agreement", f"{agreement['raw_agreement']:.2%}" if agreement["raw_agreement"] is not None else "n/a")
+            c3.metric("Cohen's κ", f"{agreement['cohen_kappa']:.3f}" if agreement["cohen_kappa"] is not None else "n/a")
+
+    elif page == "Dashboard":
+        st.subheader("Dashboard")
+        c1, c2, c3, c4 = st.columns(4)
+        c1.metric("Source samples", len(samples_df))
+        c2.metric("Source items", len(items_df))
+        c3.metric("Annotation files", len(anns_df))
+        c4.metric("My queue", len(queue_df()))
+
+        st.markdown("### Progress by annotator")
+        if anns_df.empty:
+            st.info("No annotations yet.")
+        else:
+            by_ann = anns_df.groupby("annotator")["item_id"].nunique().reset_index(name="annotated_items").sort_values("annotated_items", ascending=False)
+            st.dataframe(by_ann, use_container_width=True, hide_index=True)
+
+        st.markdown("### Progress by domain")
+        joined = anns_df.merge(items_df[["item_id", "domain"]], on="item_id", how="left")
+        by_domain = joined.groupby("domain")["item_id"].nunique().reset_index(name="annotated_items").sort_values("annotated_items", ascending=False)
+        st.dataframe(by_domain, use_container_width=True, hide_index=True)
+
+        st.markdown("### Agreement snapshot")
+        metric = compute_agreement(anns_df, label_key="assistant_behavior")
+        st.write(metric)
+
+        st.markdown("### Recent annotation previews")
+        recent = anns_df.sort_values("created_at", ascending=False).head(20).copy()
+        if "labels" in recent.columns:
+            recent["assistant_behavior"] = recent["labels"].apply(lambda x: x.get("assistant_behavior") if isinstance(x, dict) else None)
+            recent["distractor_kind"] = recent["labels"].apply(lambda x: x.get("distractor_kind") if isinstance(x, dict) else None)
+        st.dataframe(
+            recent[["annotator", "item_id", "status", "created_at", "assistant_behavior", "distractor_kind", "notes"]],
+            use_container_width=True,
+            hide_index=True,
+        )
+
+    else:
+        st.subheader("Export")
+        st.write("Export the merged dataset for downstream analysis or model training.")
+
+        merged = items_df.merge(samples_df, on="sample_id", how="left")
+        if not anns_df.empty:
+            export_df = merged.merge(anns_df[["item_id", "annotator", "labels", "notes", "status", "created_at"]], on="item_id", how="left")
+        else:
+            export_df = merged.copy()
+            export_df["annotator"] = None
+            export_df["labels"] = None
+            export_df["notes"] = None
+            export_df["status"] = None
+            export_df["created_at"] = None
+
+        c1, c2 = st.columns(2)
+        with c1:
+            jsonl = LOCAL_EXPORT_DIR / "annotations_export.jsonl"
+            if st.button("Generate JSONL export", use_container_width=True):
+                with jsonl.open("w", encoding="utf-8") as f:
+                    for _, r in export_df.iterrows():
+                        f.write(json.dumps(r.where(pd.notna(r), None).to_dict(), ensure_ascii=False) + "\n")
+                st.success(f"Wrote {jsonl}")
+                st.download_button("Download JSONL", jsonl.read_text(encoding="utf-8"), file_name=jsonl.name, mime="application/json")
+        with c2:
+            csv = LOCAL_EXPORT_DIR / "annotations_export.csv"
+            if st.button("Generate CSV export", use_container_width=True):
+                export_df.to_csv(csv, index=False)
+                st.success(f"Wrote {csv}")
+                st.download_button("Download CSV", csv.read_text(encoding="utf-8"), file_name=csv.name, mime="text/csv")
+
+        st.markdown("### Repository handoff")
+        st.code(
+            f"Source repo: {source_repo}\nAnnotation repo: {annotation_repo}\nSplit: {source_split}\nAnnotator: {st.session_state['annotator']}",
+            language="text",
+        )
+
+
+if __name__ == "__main__":
+    main()
|
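The Review and Dashboard pages above call `compute_agreement(..., label_key="assistant_behavior")` and read `paired_items`, `raw_agreement`, and `cohen_kappa` from the result. The definition of that helper is not part of this section of the diff, so the following is only a minimal sketch of how such a result could be computed, not the app's actual implementation. The name `agreement_sketch`, the plain list-of-dicts input, and the choice to pool every annotator pair (treating the alphabetically first annotator of each pair as rater A) are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def agreement_sketch(records, label_key):
    """Raw agreement and Cohen's kappa over items labelled by 2+ annotators.

    Hypothetical helper: `records` is an iterable of dicts with the keys
    "item_id", "annotator", and "labels" (a dict), mirroring the columns
    used in the diff. This is NOT the app's compute_agreement.
    """
    # Keep one label per (item, annotator); later records overwrite earlier ones.
    by_item = defaultdict(dict)
    for rec in records:
        labels = rec.get("labels")
        if isinstance(labels, dict) and labels.get(label_key) is not None:
            by_item[rec["item_id"]][rec["annotator"]] = labels[label_key]

    pairs = []  # (label from rater A, label from rater B)
    paired_items = 0
    for per_annotator in by_item.values():
        if len(per_annotator) < 2:
            continue
        paired_items += 1
        # Assumption: pool all annotator pairs; A is the alphabetically first name.
        for a, b in combinations(sorted(per_annotator), 2):
            pairs.append((per_annotator[a], per_annotator[b]))

    if not pairs:
        return {"paired_items": 0, "raw_agreement": None, "cohen_kappa": None}

    n = len(pairs)
    p_o = sum(a == b for a, b in pairs) / n  # observed agreement
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # Chance agreement from the two raters' marginal label frequencies.
    p_e = sum(left[c] * right[c] for c in set(left) | set(right)) / (n * n)
    kappa = None if p_e == 1.0 else (p_o - p_e) / (1.0 - p_e)
    return {"paired_items": paired_items, "raw_agreement": p_o, "cohen_kappa": kappa}
```

With a DataFrame shaped like `anns_df`, this could be called as `agreement_sketch(anns_df.to_dict("records"), "assistant_behavior")`. Returning `None` when there are no pairs, or when chance agreement is exactly 1, matches the `"n/a"` fallbacks in the metric calls above.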
hf-space/hf-space/hf-space/hf-space/hf-space/hf-space/.gitattributes
ADDED
@@ -0,0 +1,35 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
hf-space/hf-space/hf-space/hf-space/hf-space/hf-space/README.md
ADDED
@@ -0,0 +1,10 @@
+---
+title: Llm Annotation Platform
+emoji: 🦀
+colorFrom: indigo
+colorTo: blue
+sdk: docker
+pinned: false
+---
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
hf-space/hf-space/hf-space/hf-space/hf-space/requirements.txt
ADDED
@@ -0,0 +1,5 @@
+streamlit>=1.37
+pandas>=2.2
+datasets>=2.21
+huggingface_hub>=0.24
+scikit-learn>=1.5
hf-space/hf-space/hf-space/hf-space/hf-space/scripts/seed.py
ADDED
@@ -0,0 +1,28 @@
+from __future__ import annotations
+
+import argparse
+
+from datasets import load_dataset
+
+from app import DEFAULT_ANNOTATION_REPO, DEFAULT_SOURCE_DATASET, DEFAULT_SOURCE_SPLIT
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--source", default=DEFAULT_SOURCE_DATASET)
+    parser.add_argument("--split", default=DEFAULT_SOURCE_SPLIT)
+    parser.add_argument("--annotation-repo", default=DEFAULT_ANNOTATION_REPO)
+    parser.add_argument("--limit", type=int, default=0)
+    args = parser.parse_args()
+
+    records = load_dataset(args.source, split=args.split)
+    if args.limit:
+        records = records.select(range(min(len(records), args.limit)))
+
+    print(f"Loaded {len(records)} source records from {args.source}/{args.split}")
+    print(f"Annotation repo: {args.annotation_repo}")
+    print("Open the Streamlit app and submit annotations there.")
+
+
+if __name__ == "__main__":
+    main()
|
hf-space/requirements.txt
CHANGED
@@ -1,5 +1,5 @@
-streamlit>=1.
+streamlit>=1.38
 pandas>=2.2
 datasets>=2.21
 huggingface_hub>=0.24
-
+openai>=1.40