GitHub Actions committed on
Commit da0bed0 · 1 Parent(s): caad2c6

Sync from GitHub

hf-space/.gitignore CHANGED
@@ -2,6 +2,7 @@ __pycache__/
  *.pyc
  .streamlit/
  data/
+ drafts/
  exports/
  .env
  .DS_Store
hf-space/Dockerfile CHANGED
@@ -1,7 +1,6 @@
  FROM python:3.11-slim
 
  WORKDIR /app
-
  COPY . /app
 
  RUN pip install --no-cache-dir -r requirements.txt
hf-space/README.md CHANGED
@@ -1,93 +1,72 @@
1
- ---
2
- title: LLM Annotation Platform
3
- emoji: 🧠
4
- colorFrom: blue
5
- colorTo: purple
6
- sdk: docker
7
- pinned: false
8
- ---
9
-
10
- # LLM Annotation Platform — Hugging Face native
11
-
12
- This version removes the external database layer.
13
-
14
- ## What it uses
15
-
16
- - **Hugging Face Space** for the Streamlit app
17
- - **Hugging Face dataset repo** for the canonical annotation store
18
- - **Hugging Face Storage Bucket** only for persistent local cache / drafts in the Space
19
- - **No Supabase**
20
- - **No separate backend platform**
21
-
22
- Hugging Face Spaces provide ephemeral disk by default, and Hugging Face recommends attaching Storage Buckets to persist data across restarts. Buckets are mounted into the Space container as local volumes.
23
-
24
- ## Repository structure
25
-
26
- ```text
27
- app.py
28
- scripts/seed.py
29
- requirements.txt
30
- README.md
31
- ```
32
-
33
- ## Behavior
34
-
35
- Each annotation is written as its own JSON file into the dataset repository:
36
- ```text
37
- annotations/<annotator>/<timestamp>_<item_id>_<uuid>.json
38
- ```
39
-
40
- That design avoids write conflicts between annotators because each submission is a new file, not an overwrite of a shared database row. Repository files on the Hub are versioned, and the Hub supports uploading files to dataset repositories.
41
-
42
- ## Local run
43
 
44
  ```bash
45
  pip install -r requirements.txt
46
  streamlit run app.py
47
  ```
48
 
49
- ## How to set it up on Hugging Face
50
-
51
- ### 1. Create two dataset repositories
52
-
53
- Create:
54
- - one dataset repo for the **source / seed data**
55
- - one dataset repo for the **annotations**
56
-
57
- Hugging Face dataset repositories are created from the Hub UI, and dataset files plus revision history are stored in the repository.
58
-
59
- ### 2. Create a Space
60
-
61
- Create a **Streamlit** Space and connect it to your GitHub repository. Spaces host apps directly on the Hub and support Streamlit as a built-in SDK.
62
-
63
- ### 3. Attach a Storage Bucket
64
-
65
- Attach a Storage Bucket to the Space and mount it at `/data`.
66
 
67
- This is the only stateful storage used by the app. It stores drafts and cache files and survives restarts. Hugging Face documents Storage Buckets as the recommended persistence mechanism for Spaces.
68
 
69
- ### 4. Add secrets
70
-
71
- In the Space settings, add:
72
- - `HF_TOKEN` — a Hugging Face token with **write** permission
73
  - `SOURCE_DATASET_REPO`
74
- - `SOURCE_DATASET_SPLIT`
 
75
  - `ANNOTATION_REPO_ID`
 
76
 
77
- Hugging Face recommends using Space secrets or environment variables instead of hard-coding sensitive values. A write token is required to create repositories or push content to the Hub.
78
-
79
- ### 5. Deploy
80
 
81
- Commit the repo to GitHub. Once the Space is linked, it will build from the repository, and the app can upload annotation files to the dataset repo using the Hub API. Hugging Face’s Hub client supports `upload_file()` and `create_commit()` for repository writes.
82
 
83
- ## Suggested workflow for your group
84
 
85
- - each person uses a stable annotator name
86
- - each submission creates a new JSON file in the annotation repo
87
- - the Review page shows items with 2+ annotations
88
- - the Dashboard shows per-annotator and per-domain progress
89
- - exports are generated from the merged source + annotation view
90
 
91
- ## Why this is a good fit
92
-
93
- The original source dataset can still be loaded with `datasets.load_dataset(...)`, and the Hugging Face ecosystem is designed for pushing and versioning datasets directly on the Hub. The `datasets` library also provides a `push_to_hub()` path for dataset publishing, while `huggingface_hub` provides lower-level file upload methods when you want more control over file layout.
1
+ # LLM Annotation Platform
2
+
3
+ A simple Streamlit app for collaborative editing of a human-made distractor dataset.
4
+
5
+ ## What it supports
6
+
7
+ - browse source data from a Hugging Face dataset repo or a local JSON/JSONL file
8
+ - load a row by index into an editor
9
+ - create a new blank entry
10
+ - edit:
11
+ - `domain`
12
+ - `scenario`
13
+ - `system_instruction`
14
+ - `conversation`
15
+ - `distractors`
16
+ - `distractors_multiturn`
17
+ - `conversation_with_distractors`
18
+ - mark the entry with a `split` value (`train` / `test`)
19
+ - save drafts in the HF Space bucket path (`/data/drafts`)
20
+ - submit each finished entry as a separate JSON file to a Hugging Face dataset repo
21
+ - optionally ask a local OpenAI-compatible LLM server such as LM Studio to draft one distractor at a time
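Note on the "separate JSON file" bullet above: a minimal sketch of how one file per finished entry could be named and serialized. The `submissions/<annotator>/...` layout and the helper names `safe_name`, `submission_path`, and `serialize_entry` are assumptions for illustration and are not taken from the new app.py. The commented-out upload uses `HfApi.upload_file`, the same call the previous app.py used.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict


def safe_name(value: str, fallback: str) -> str:
    """Keep only characters that are safe in a repo path."""
    cleaned = "".join(
        ch if ch.isalnum() or ch in "-_." else "_" for ch in value.strip().lower()
    )
    return cleaned or fallback


def submission_path(annotator: str) -> str:
    """Build a unique repo path so two submissions never target the same file."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # Hypothetical layout: submissions/<annotator>/<timestamp>_<short uuid>.json
    return f"submissions/{safe_name(annotator, 'annotator')}/{stamp}_{uuid.uuid4().hex[:8]}.json"


def serialize_entry(entry: Dict[str, Any]) -> bytes:
    """Encode one finished entry as UTF-8 JSON, ready to upload."""
    return json.dumps(entry, ensure_ascii=False, indent=2).encode("utf-8")


# Uploading would then be one call (needs network access and a write token):
# from huggingface_hub import HfApi
# HfApi(token=hf_token).upload_file(
#     path_or_fileobj=serialize_entry(entry),
#     path_in_repo=submission_path("alice"),
#     repo_id=annotation_repo_id,
#     repo_type="dataset",
# )
```

Because every path carries a timestamp and a random suffix, concurrent annotators create new files instead of overwriting a shared one.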
22
+
23
+ ## Output shape
24
+
25
+ The app keeps the source structure and adds provenance fields:
26
+
27
+ - `split`
28
+ - `_review_status`
29
+ - `_needs_human_review`
30
+ - `_annotator`
31
+ - `_source_repo`
32
+ - `_source_split`
33
+ - `_source_index`
34
+ - `_created_at`
35
+ - `_updated_at`
36
+
37
+ That means the final file can still be merged into one dataset later.
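Note on the provenance fields listed above: a minimal sketch of how they could be attached to a source row. The function name `with_provenance` and the default `"draft"` status are assumptions; the README names the fields but not their allowed values.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def with_provenance(
    record: Dict[str, Any],
    *,
    annotator: str,
    split: str,
    source_repo: str = "",
    source_split: str = "",
    source_index: Optional[int] = None,
    review_status: str = "draft",  # assumed value, not documented in the README
    needs_human_review: bool = False,
) -> Dict[str, Any]:
    """Return a copy of `record` with the provenance fields added."""
    now = datetime.now(timezone.utc).isoformat()
    out = dict(record)  # shallow copy so the source row is left untouched
    out.update(
        {
            "split": split,
            "_review_status": review_status,
            "_needs_human_review": needs_human_review,
            "_annotator": annotator,
            "_source_repo": source_repo,
            "_source_split": source_split,
            "_source_index": source_index,
            # Keep the original creation time when re-saving an existing entry.
            "_created_at": record.get("_created_at", now),
            "_updated_at": now,
        }
    )
    return out
```

The source keys are preserved as-is, so the per-entry files can later be concatenated and the underscore-prefixed columns dropped to recover the original schema plus `split`.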
38
+
39
+ ## Run locally
 
 
 
40
 
41
  ```bash
42
  pip install -r requirements.txt
43
  streamlit run app.py
44
  ```
45
 
46
+ ## Environment variables
47
 
48
+ Set these in your GitHub repo / HF Space:
49
 
 
 
 
 
50
  - `SOURCE_DATASET_REPO`
51
+ - `SOURCE_DATASET_SPLITS`
52
+ Example: `train,test`
53
  - `ANNOTATION_REPO_ID`
54
+ - `HF_TOKEN`
55
 
56
+ Optional local LLM settings:
57
+ - `LLM_BASE_URL` is entered in the sidebar inside the app
58
+ - `LLM_MODEL` is entered in the sidebar inside the app
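Note on the environment variables above: a minimal sketch of reading them, including the comma-separated `SOURCE_DATASET_SPLITS` value. The helper names `parse_splits` and `read_settings` and the `"train"` fallback are assumptions, not taken from app.py.

```python
import os
from typing import List, Mapping


def parse_splits(raw: str) -> List[str]:
    """Split a comma-separated value such as "train,test" into clean names."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def read_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the settings named in the README; missing values become empty."""
    return {
        "source_repo": env.get("SOURCE_DATASET_REPO", ""),
        # The "train" fallback is an assumption for illustration.
        "source_splits": parse_splits(env.get("SOURCE_DATASET_SPLITS", "train")),
        "annotation_repo": env.get("ANNOTATION_REPO_ID", ""),
        "hf_token": env.get("HF_TOKEN"),
    }
```

Passing the mapping as a parameter keeps the parsing testable without touching the real process environment.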
59
 
60
+ ## HF Space setup
61
 
62
+ Use a Docker Space, mount persistent storage at `/data`, and set the environment variables above. The app stores drafts and submission logs in the bucket path.
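Note on draft storage in the mounted path: a minimal sketch of saving and loading drafts under a directory such as `/data/drafts`. The write-to-temp-then-rename step is a suggested hardening and an assumption here; the helper signatures are hypothetical and differ from the `save_draft` / `load_draft` in the previous app.py.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_draft(draft_dir: Path, name: str, payload: Dict[str, Any]) -> Path:
    """Write a draft via a temp file so an interrupted write cannot truncate it."""
    draft_dir.mkdir(parents=True, exist_ok=True)
    target = draft_dir / f"{name}.json"
    fd, tmp_name = tempfile.mkstemp(dir=draft_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False, indent=2)
    # Rename within the same directory; this replaces any older draft in one step.
    os.replace(tmp_name, target)
    return target


def load_draft(draft_dir: Path, name: str) -> Dict[str, Any]:
    """Return the saved draft, or an empty dict if it is missing or unreadable."""
    path = draft_dir / f"{name}.json"
    try:
        with path.open("r", encoding="utf-8") as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        return {}
```

In the Space this might be called as `save_draft(Path("/data/drafts"), annotator, payload)`, so drafts survive restarts as long as persistent storage is mounted at `/data`.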
63
 
64
+ ## GitHub structure
 
 
 
 
65
 
66
+ ```text
67
+ app.py
68
+ requirements.txt
69
+ README.md
70
+ Dockerfile
71
+ .streamlit/config.toml
72
+ ```
hf-space/hf-space/app.py CHANGED
@@ -3,6 +3,7 @@ from __future__ import annotations
3
  import json
4
  import os
5
  import uuid
 
6
  from datetime import datetime, timezone
7
  from pathlib import Path
8
  from typing import Any, Dict, List, Optional, Tuple
@@ -12,861 +13,1000 @@ import streamlit as st
12
  from datasets import load_dataset
13
  from huggingface_hub import HfApi, hf_hub_download
14
 
 
 
 
 
 
15
  APP_TITLE = "🧭 LLM Annotation Platform"
16
- DEFAULT_SOURCE_DATASET = os.environ.get(
17
- "SOURCE_DATASET_REPO",
18
- "nvidia/CantTalkAboutThis-Topic-Control-Dataset",
19
- )
20
- DEFAULT_SOURCE_SPLIT = os.environ.get("SOURCE_DATASET_SPLIT", "train")
21
- DEFAULT_ANNOTATION_REPO = os.environ.get(
22
- "ANNOTATION_REPO_ID",
23
- "YOUR_ORG/llm-distractor-annotations",
24
- )
25
- DEFAULT_CACHE_DIR = Path(os.environ.get("CACHE_DIR", "/data/hf_annotation_cache"))
26
- DEFAULT_CACHE_DIR.mkdir(parents=True, exist_ok=True)
27
- LOCAL_DRAFT_DIR = Path(os.environ.get("DRAFT_DIR", "/data/hf_annotation_drafts"))
28
- LOCAL_DRAFT_DIR.mkdir(parents=True, exist_ok=True)
29
- LOCAL_EXPORT_DIR = Path(os.environ.get("EXPORT_DIR", "/data/hf_annotation_exports"))
30
- LOCAL_EXPORT_DIR.mkdir(parents=True, exist_ok=True)
31
-
32
- LABEL_OPTIONS = {
33
- "distractor_kind": [
34
- "benign off-topic",
35
- "smooth bridge",
36
- "policy-evasive",
37
- "roleplay / impersonation",
38
- "pressure / persistence",
39
- "urgency / time pressure",
40
- "loophole seeking",
41
- "multi-turn escalation",
42
- "other",
43
- ],
44
- "assistant_behavior": [
45
- "perfect refusal + redirect",
46
- "helpful redirection",
47
- "partial engagement",
48
- "full engagement / derailment",
49
- "over-refusal",
50
- "unclear",
51
- ],
52
- "transition_style": [
53
- "abrupt",
54
- "smooth bridge",
55
- "follow-up clarification",
56
- "rephrasing",
57
- "escalation",
58
- "roleplay",
59
- "ambiguity exploitation",
60
- "other",
61
- ],
62
- "policy_target": [
63
- "medical advice",
64
- "financial advice",
65
- "legal advice",
66
- "competitor discussion",
67
- "politics",
68
- "unsafe content",
69
- "personal data",
70
- "company-specific policy",
71
- "tone / style policy",
72
- "other",
73
- ],
74
  }
75
 
 
 
 
76
 
77
  def now_iso() -> str:
78
- return datetime.now(timezone.utc).isoformat()
79
 
80
 
81
- def token() -> Optional[str]:
82
- return os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACE_HUB_TOKEN")
 
 
 
83
 
84
 
85
- def api() -> HfApi:
86
- return HfApi(token=token())
 
 
 
87
 
88
 
89
- def annotation_file_name(item_id: str, annotator: str) -> str:
90
- safe_annotator = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in annotator.strip().lower()) or "annotator"
91
- safe_item = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in item_id.strip()) or "item"
92
- stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
93
- return f"annotations/{safe_annotator}/{stamp}_{safe_item}_{uuid.uuid4().hex[:8]}.json"
94
 
95
 
96
- def draft_path(annotator: str) -> Path:
97
- safe_annotator = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in annotator.strip().lower()) or "annotator"
98
- return LOCAL_DRAFT_DIR / f"{safe_annotator}.json"
 
 
 
99
 
100
 
101
- def cache_annotations_dir() -> Path:
102
- path = DEFAULT_CACHE_DIR / "annotations_snapshot"
103
- path.mkdir(parents=True, exist_ok=True)
104
- return path
 
105
 
106
 
107
- def ensure_repo_exists(repo_id: str) -> None:
108
- if repo_id.startswith("YOUR_ORG/") or not repo_id.strip():
109
- return
110
- api().create_repo(repo_id=repo_id, repo_type="dataset", private=True, exist_ok=True)
 
 
 
 
 
 
 
 
 
 
111
 
112
 
113
- def load_source_dataset(repo_id: str, split: str) -> List[Dict[str, Any]]:
114
- ds = load_dataset(repo_id, split=split)
115
- return [dict(row) for row in ds]
 
 
 
 
 
 
116
 
117
 
118
- def normalize_turns(turns: Any) -> List[Dict[str, Any]]:
119
- if turns is None:
120
- return []
121
- if isinstance(turns, str):
122
- try:
123
- turns = json.loads(turns)
124
- except Exception:
125
- return []
126
- if not isinstance(turns, list):
127
- return []
128
  out = []
129
- for turn in turns:
130
- if isinstance(turn, dict):
131
- role = turn.get("role") or turn.get("speaker") or turn.get("type") or "unknown"
132
- content = turn.get("content") or turn.get("text") or turn.get("utterance") or ""
133
- out.append({"role": str(role), "content": str(content)})
134
- else:
135
- out.append({"role": "unknown", "content": str(turn)})
136
  return out
137
 
138
 
139
- def safe_sample_id(record: Dict[str, Any], fallback_index: int) -> str:
140
- for key in ("sample_id", "id", "_id", "row_id"):
141
- if record.get(key) not in (None, ""):
142
- return str(record[key])
143
- domain = str(record.get("domain", "sample")).replace(" ", "_")
144
- scenario = str(record.get("scenario", "")).replace(" ", "_")
145
- return f"{domain}-{scenario}-{fallback_index}"
 
 
 
 
 
 
 
146
 
147
 
148
- def expand_record(record: Dict[str, Any], idx: int) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
149
- sample_id = safe_sample_id(record, idx)
150
- conversation = normalize_turns(record.get("conversation"))
151
- distractors = record.get("distractors") or []
152
- if isinstance(distractors, str):
153
- try:
154
- distractors = json.loads(distractors)
155
- except Exception:
156
- distractors = []
157
- if not isinstance(distractors, list):
158
- distractors = []
159
-
160
- sample = {
161
- "sample_id": sample_id,
162
- "domain": str(record.get("domain", "")),
163
- "scenario": str(record.get("scenario", "")),
164
- "system_instruction": str(record.get("system_instruction", "")),
165
- "conversation_json": json.dumps(conversation, ensure_ascii=False),
166
- "distractors_json": json.dumps(distractors, ensure_ascii=False),
167
- "conversation_with_distractors_json": json.dumps(record.get("conversation_with_distractors", []), ensure_ascii=False),
168
- "raw_json": json.dumps(record, ensure_ascii=False),
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
169
  }
 
170
 
171
- items = []
172
- for distractor_index, d in enumerate(distractors):
173
- bot_turn = ""
174
- distractor_text = ""
175
- if isinstance(d, dict):
176
- bot_turn = str(
177
- d.get("bot turn")
178
- or d.get("bot_turn")
179
- or d.get("assistant_turn")
180
- or d.get("assistant")
181
- or ""
182
- )
183
- distractor_text = str(
184
- d.get("distractor")
185
- or d.get("distractor user turn")
186
- or d.get("user_turn")
187
- or d.get("user")
188
- or d.get("text")
189
- or ""
190
- )
191
- else:
192
- distractor_text = str(d)
193
 
194
- items.append(
195
- {
196
- "item_id": f"{sample_id}::{distractor_index}",
197
- "sample_id": sample_id,
198
- "distractor_index": distractor_index,
199
- "bot_turn": bot_turn,
200
- "distractor_text": distractor_text,
201
- }
202
- )
203
- return sample, items
204
 
205
 
206
- def seed_source_index(records: List[Dict[str, Any]]) -> Tuple[pd.DataFrame, pd.DataFrame]:
207
- samples = []
208
- items = []
209
- for idx, record in enumerate(records):
210
- sample, record_items = expand_record(record, idx)
211
- samples.append(sample)
212
- items.extend(record_items)
213
- return pd.DataFrame(samples), pd.DataFrame(items)
214
 
215
 
216
- def read_json_file(path: Path) -> Dict[str, Any]:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
217
  with path.open("r", encoding="utf-8") as f:
218
- return json.load(f)
 
 
 
 
 
 
 
219
 
220
 
221
- def load_all_hub_annotations(annotation_repo_id: str) -> pd.DataFrame:
222
- """
223
- Each submission is stored as a separate JSON file, which avoids write conflicts.
224
- """
225
- if annotation_repo_id.startswith("YOUR_ORG/") or not annotation_repo_id.strip():
226
- return pd.DataFrame(columns=["item_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
 
 
227
 
228
- cache_dir = cache_annotations_dir()
229
- file_list = api().list_repo_files(annotation_repo_id, repo_type="dataset")
230
- ann_files = [f for f in file_list if f.startswith("annotations/") and f.endswith(".json")]
231
 
232
- rows = []
233
- for file_path in ann_files:
234
- try:
235
- local_path = hf_hub_download(
236
- repo_id=annotation_repo_id,
237
- repo_type="dataset",
238
- filename=file_path,
239
- token=token(),
240
- local_dir=str(cache_dir),
241
- local_dir_use_symlinks=False,
242
- )
243
- payload = read_json_file(Path(local_path))
244
- rows.append(
245
- {
246
- "item_id": payload.get("item_id", ""),
247
- "sample_id": payload.get("sample_id", ""),
248
- "annotator": payload.get("annotator", ""),
249
- "labels": payload.get("labels", {}),
250
- "notes": payload.get("notes", ""),
251
- "status": payload.get("status", "submitted"),
252
- "created_at": payload.get("created_at", ""),
253
- "file_path": file_path,
254
- }
255
- )
256
- except Exception as e:
257
- rows.append(
258
- {
259
- "item_id": "",
260
- "sample_id": "",
261
- "annotator": "",
262
- "labels": {},
263
- "notes": f"Failed to load {file_path}: {e}",
264
- "status": "load_error",
265
- "created_at": "",
266
- "file_path": file_path,
267
- }
268
- )
 
 
 
 
 
 
269
 
270
- return pd.DataFrame(rows) if rows else pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
271
 
 
 
 
272
 
273
- def save_draft(annotator: str, payload: Dict[str, Any]) -> Path:
274
- path = draft_path(annotator)
275
- path.parent.mkdir(parents=True, exist_ok=True)
 
 
 
 
276
  with path.open("w", encoding="utf-8") as f:
277
  json.dump(payload, f, ensure_ascii=False, indent=2)
278
  return path
279
 
280
 
281
- def load_draft(annotator: str) -> Dict[str, Any]:
282
- path = draft_path(annotator)
283
  if not path.exists():
284
  return {}
285
  try:
286
- return read_json_file(path)
 
287
  except Exception:
288
  return {}
289
 
290
 
291
- def build_labels_from_state(prefix: str = "") -> Dict[str, Any]:
292
- return {
293
- "distractor_kind": st.session_state.get(f"{prefix}distractor_kind", LABEL_OPTIONS["distractor_kind"][0]),
294
- "transition_style": st.session_state.get(f"{prefix}transition_style", LABEL_OPTIONS["transition_style"][0]),
295
- "policy_target": st.session_state.get(f"{prefix}policy_target", []),
296
- "difficulty": int(st.session_state.get(f"{prefix}difficulty", 3)),
297
- "realism": int(st.session_state.get(f"{prefix}realism", 3)),
298
- "assistant_behavior": st.session_state.get(f"{prefix}assistant_behavior", LABEL_OPTIONS["assistant_behavior"][0]),
299
- "multi_turn_escalation": bool(st.session_state.get(f"{prefix}multi_turn_escalation", False)),
300
- "rule_followed": bool(st.session_state.get(f"{prefix}rule_followed", True)),
301
- "needs_review": bool(st.session_state.get(f"{prefix}needs_review", False)),
302
- "confidence": int(st.session_state.get(f"{prefix}confidence", 3)),
303
- }
304
-
305
 
306
- def preview_text(text: str, limit: int = 280) -> str:
307
- txt = (text or "").strip().replace("\n", " ")
308
- if len(txt) <= limit:
309
- return txt
310
- return txt[:limit - 1] + "…"
311
 
 
 
 
312
 
313
- def render_turns(turns: List[Dict[str, Any]]) -> None:
314
  if not turns:
315
- st.info("No conversation turns found.")
316
- return
317
- for i, turn in enumerate(turns, 1):
318
- role = str(turn.get("role", "unknown")).lower()
319
- content = str(turn.get("content", "")).strip()
320
- css_cls = "user" if role == "user" else "assistant" if role in {"assistant", "bot"} else "system"
321
- st.markdown(
322
- f"""
323
- <div class="turn {css_cls}">
324
- <span class="badge">{role.upper()}</span>
325
- <span class="smallmono">Turn {i}</span>
326
- <div style="margin-top:0.35rem; white-space:pre-wrap;">{content.replace(chr(10), '<br>')}</div>
327
- </div>
328
- """,
329
- unsafe_allow_html=True,
330
- )
 
 
 
 
 
331
 
 
 
 
 
 
 
 
 
 
 
332
 
333
- def annotation_exists_for_item(df_anns: pd.DataFrame, item_id: str, annotator: str) -> bool:
334
- if df_anns.empty:
335
- return False
336
- sub = df_anns[(df_anns["item_id"] == item_id) & (df_anns["annotator"] == annotator)]
337
- return not sub.empty
338
 
 
 
 
 
339
 
340
- def compute_agreement(df_anns: pd.DataFrame, label_key: str = "assistant_behavior") -> Dict[str, Any]:
341
- if df_anns.empty:
342
- return {"paired_items": 0, "raw_agreement": None, "cohen_kappa": None}
343
 
344
- rows = []
345
- for _, r in df_anns.iterrows():
346
- labels = r.get("labels", {}) or {}
347
- rows.append({"item_id": r["item_id"], "annotator": r["annotator"], label_key: labels.get(label_key)})
348
- tmp = pd.DataFrame(rows)
349
- pivot = tmp.pivot_table(index="item_id", columns="annotator", values=label_key, aggfunc="first")
350
- pivot = pivot.dropna(axis=0, how="any")
351
- if pivot.shape[0] < 2 or pivot.shape[1] < 2:
352
- return {"paired_items": int(pivot.shape[0]), "raw_agreement": None, "cohen_kappa": None}
353
-
354
- from sklearn.metrics import cohen_kappa_score
355
-
356
- a = pivot.iloc[:, 0].astype(str)
357
- b = pivot.iloc[:, 1].astype(str)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
358
  return {
359
- "paired_items": int(pivot.shape[0]),
360
- "raw_agreement": float((a == b).mean()),
361
- "cohen_kappa": float(cohen_kappa_score(a, b)),
 
 
 
 
 
 
 
 
 
 
 
362
  }
363
 
364
 
365
- def push_annotation_to_hub(annotation_repo_id: str, payload: Dict[str, Any]) -> str:
366
- ensure_repo_exists(annotation_repo_id)
367
- file_rel_path = annotation_file_name(payload["item_id"], payload["annotator"])
368
- local_path = LOCAL_DRAFT_DIR / file_rel_path.replace("/", "__")
369
- local_path.parent.mkdir(parents=True, exist_ok=True)
370
- with local_path.open("w", encoding="utf-8") as f:
371
- json.dump(payload, f, ensure_ascii=False, indent=2)
372
 
373
- api().upload_file(
374
- path_or_fileobj=str(local_path),
375
- path_in_repo=file_rel_path,
376
- repo_id=annotation_repo_id,
377
- repo_type="dataset",
378
- token=token(),
379
- commit_message=f"Add annotation for {payload['item_id']} by {payload['annotator']}",
380
- )
381
- return file_rel_path
382
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
383
 
384
- def get_current_item_id() -> Optional[str]:
385
- return st.session_state.get("current_item_id")
386
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
387
 
388
- def set_current_item_id(item_id: Optional[str]) -> None:
389
- st.session_state["current_item_id"] = item_id
390
  try:
391
- st.query_params["item_id"] = item_id or ""
392
- except Exception:
393
- pass
 
 
 
 
 
 
 
 
 
 
 
 
 
 
394
 
395
 
396
- def main() -> None:
397
- st.set_page_config(page_title="LLM Annotation Platform", page_icon="🧭", layout="wide")
398
- st.markdown(
399
- """
400
- <style>
401
- .block-container {padding-top: 1rem; padding-bottom: 2rem;}
402
- .smallmono {font-size: 0.84rem; font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", monospace;}
403
- .cardbox {
404
- border: 1px solid rgba(120,120,120,0.22);
405
- border-radius: 18px;
406
- padding: 1rem 1rem 0.75rem 1rem;
407
- background: rgba(255,255,255,0.03);
408
- }
409
- .turn {
410
- border-left: 4px solid rgba(120,120,120,0.45);
411
- padding: 0.6rem 0.85rem;
412
- margin: 0.55rem 0;
413
- border-radius: 0.6rem;
414
- background: rgba(128,128,128,0.06);
415
- }
416
- .turn.user {border-left-color: #8b5cf6;}
417
- .turn.assistant, .turn.bot {border-left-color: #06b6d4;}
418
- .turn.system {border-left-color: #f59e0b;}
419
- .badge {
420
- display:inline-block; padding:0.18rem 0.5rem; border-radius: 999px;
421
- background: rgba(120,120,120,0.16); margin-right: 0.35rem; font-size: 0.78rem;
422
- }
423
- hr {margin: 0.7rem 0 0.9rem 0;}
424
- </style>
425
- """,
426
- unsafe_allow_html=True,
427
- )
428
 
429
- st.title(APP_TITLE)
430
- st.caption("A Hugging Face–native annotation tool for multi-turn distractors, inter-rater review, and dataset versioning.")
431
 
432
- if "annotator" not in st.session_state:
433
- st.session_state["annotator"] = "annotator_1"
434
- if "current_item_id" not in st.session_state:
435
- st.session_state["current_item_id"] = None
436
  if "source_records" not in st.session_state:
437
  st.session_state["source_records"] = None
438
- if "source_index" not in st.session_state:
439
- st.session_state["source_index"] = None
440
- if "annotations_df" not in st.session_state:
441
- st.session_state["annotations_df"] = None
442
- if "draft_loaded" not in st.session_state:
443
- st.session_state["draft_loaded"] = False
444
-
445
- with st.sidebar:
446
- st.header("Workspace")
447
- annotator = st.text_input("Annotator name", value=st.session_state["annotator"])
448
- st.session_state["annotator"] = annotator.strip() or "annotator_1"
449
-
450
- source_repo = st.text_input("Source dataset repo", value=DEFAULT_SOURCE_DATASET)
451
- source_split = st.text_input("Source split", value=DEFAULT_SOURCE_SPLIT)
452
- annotation_repo = st.text_input("Annotation dataset repo", value=DEFAULT_ANNOTATION_REPO)
453
-
454
- st.divider()
455
- st.caption("HF token is needed only for upload / repo creation.")
456
- st.write("HF token present:", "yes" if token() else "no")
457
- st.write("Cache:", str(DEFAULT_CACHE_DIR))
458
- st.write("Drafts:", str(LOCAL_DRAFT_DIR))
459
-
460
- if st.button("Reload Hub data", use_container_width=True):
461
- st.session_state["source_records"] = None
462
- st.session_state["source_index"] = None
463
- st.session_state["annotations_df"] = None
464
- st.rerun()
465
-
466
- page = st.radio("Page", ["Annotate", "Review", "Dashboard", "Export"], index=0)
467
 
468
  if st.session_state["source_records"] is None:
469
- with st.spinner("Loading source dataset from the Hub..."):
470
- source_records = load_source_dataset(source_repo, source_split)
471
- samples_df, items_df = seed_source_index(source_records)
472
- st.session_state["source_records"] = source_records
473
- st.session_state["source_index"] = {"samples_df": samples_df, "items_df": items_df}
474
-
475
- if st.session_state["annotations_df"] is None:
476
- with st.spinner("Loading annotations from the annotation dataset repo..."):
477
  try:
478
- anns_df = load_all_hub_annotations(annotation_repo)
 
 
 
 
 
 
 
 
 
 
479
  except Exception as e:
480
- anns_df = pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
481
- st.warning(f"Could not load annotations from Hub yet: {e}")
482
- st.session_state["annotations_df"] = anns_df
483
-
484
- samples_df = st.session_state["source_index"]["samples_df"]
485
- items_df = st.session_state["source_index"]["items_df"]
486
- anns_df = st.session_state["annotations_df"]
487
-
488
- if not st.session_state["draft_loaded"]:
489
- try:
490
- q_item = st.query_params.get("item_id")
491
- except Exception:
492
- q_item = None
493
- if q_item:
494
- st.session_state["current_item_id"] = q_item
495
- draft = load_draft(st.session_state["annotator"])
496
- if draft.get("current_item_id") and not st.session_state["current_item_id"]:
497
- st.session_state["current_item_id"] = draft["current_item_id"]
498
- st.session_state["draft_loaded"] = True
499
-
500
- my_annotated_item_ids = set(
501
- anns_df.loc[anns_df["annotator"] == st.session_state["annotator"], "item_id"].dropna().astype(str).tolist()
502
- ) if not anns_df.empty else set()
503
-
504
- def current_item_row() -> Optional[Dict[str, Any]]:
505
- item_id = get_current_item_id()
506
- if not item_id:
507
- return None
508
- match = items_df[items_df["item_id"] == item_id]
509
- if match.empty:
510
- return None
511
- row = match.iloc[0].to_dict()
512
- sample = samples_df[samples_df["sample_id"] == row["sample_id"]]
513
- if not sample.empty:
514
- row.update(sample.iloc[0].to_dict())
515
- return row
516
-
517
- def queue_df() -> pd.DataFrame:
518
- return items_df[~items_df["item_id"].astype(str).isin(my_annotated_item_ids)].copy()
519
-
520
- if page == "Annotate":
521
- st.subheader("Annotate a distractor item")
 
 
 
 
 
 
 
 
 
 
 
522
  left, right = st.columns([1.05, 0.95], gap="large")
523
 
524
  with left:
525
- top_a, top_b, top_c = st.columns([1, 1, 1])
526
- with top_a:
527
- if st.button("Claim next item", use_container_width=True):
528
- q = queue_df()
529
- if q.empty:
530
- st.warning("No remaining items in your queue.")
531
- else:
532
- set_current_item_id(q.iloc[0]["item_id"])
533
- st.rerun()
534
- with top_b:
535
- if st.button("Reload annotations from Hub", use_container_width=True):
536
- st.session_state["annotations_df"] = load_all_hub_annotations(annotation_repo)
537
  st.rerun()
538
- with top_c:
539
- if st.button("Clear current", use_container_width=True):
540
- set_current_item_id(None)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
541
  st.rerun()
542
 
543
- item = current_item_row()
544
- if item is None:
545
- st.info("Claim an item to start. The app keeps a per-annotator queue so multiple people can work in parallel.")
546
-
547
- q = queue_df().head(10)
548
-
549
- # DEBUG: inspect actual dataset schema
550
- st.write("Dataset columns:", list(q.columns))
551
-
552
- if not q.empty:
553
-
554
- # Only use columns that actually exist
555
- available_cols = [
556
- c for c in [
557
- "item_id",
558
- "sample_id",
559
- "domain",
560
- "scenario",
561
- "distractor_index"
562
- ]
563
- if c in q.columns
564
- ]
565
-
566
- display = q[available_cols].copy()
567
-
568
- if "distractor_text" in q.columns:
569
- display["preview"] = q["distractor_text"].map(preview_text)
570
-
571
- st.dataframe(display, use_container_width=True, hide_index=True)
572
-
573
- return
574
-
575
- st.markdown(
576
- f"""
577
- <div class="cardbox">
578
- <div><span class="badge">Domain</span> {item.get("domain", "")}</div>
579
- <div style="margin-top:0.35rem;"><span class="badge">Scenario</span> {item.get("scenario", "")}</div>
580
- <div style="margin-top:0.35rem;"><span class="badge">Sample</span> <span class="smallmono">{item.get("sample_id", "")}</span></div>
581
- <div style="margin-top:0.35rem;"><span class="badge">Item</span> <span class="smallmono">{item.get("item_id", "")}</span></div>
582
- </div>
583
- """,
584
- unsafe_allow_html=True,
585
  )
586
- st.divider()
587
-
588
- tabs = st.tabs(["Context", "Distractor", "Existing annotations"])
589
- with tabs[0]:
590
- st.markdown("**System instruction**")
591
- st.code(item.get("system_instruction", ""), language="text")
592
- st.markdown("**Conversation**")
593
- render_turns(json.loads(item.get("conversation_json", "[]")))
594
- with tabs[1]:
595
- st.markdown("**Previous assistant turn**")
596
- st.code(item.get("bot_turn", "") or "(missing)", language="text")
597
- st.markdown("**Distractor user turn**")
598
- st.code(item.get("distractor_text", "") or "(missing)", language="text")
599
- with tabs[2]:
600
- existing = anns_df[anns_df["item_id"] == item["item_id"]].copy()
601
- if existing.empty:
602
- st.caption("No annotations yet.")
603
- else:
604
- for _, row in existing.iterrows():
605
- st.write(f"**{row['annotator']}** · {row['status']} · {row['created_at']}")
606
- st.json(row["labels"])
607
- if row.get("notes"):
608
- st.caption(row["notes"])
609
- st.divider()
610
-
611
- with right:
612
- st.markdown("### Annotation form")
613
- current_draft = load_draft(st.session_state["annotator"])
614
- draft_labels = current_draft.get("labels", {}) if current_draft else {}
615
-
616
- with st.form("annotation_form", clear_on_submit=False):
617
- st.selectbox(
618
- "Distractor kind",
619
- LABEL_OPTIONS["distractor_kind"],
620
- index=LABEL_OPTIONS["distractor_kind"].index(draft_labels.get("distractor_kind", LABEL_OPTIONS["distractor_kind"][0]))
621
- if draft_labels.get("distractor_kind") in LABEL_OPTIONS["distractor_kind"]
622
- else 0,
623
- key="distractor_kind",
624
- )
625
- st.selectbox(
626
- "Transition style",
627
- LABEL_OPTIONS["transition_style"],
628
- index=LABEL_OPTIONS["transition_style"].index(draft_labels.get("transition_style", LABEL_OPTIONS["transition_style"][0]))
629
- if draft_labels.get("transition_style") in LABEL_OPTIONS["transition_style"]
630
- else 0,
631
- key="transition_style",
632
- )
633
- st.multiselect(
634
- "Policy target(s)",
635
- LABEL_OPTIONS["policy_target"],
636
- default=draft_labels.get("policy_target", []),
637
- key="policy_target",
638
- )
639
- c1, c2 = st.columns(2)
640
- with c1:
641
- st.slider("Difficulty", 1, 5, value=int(draft_labels.get("difficulty", 3)), key="difficulty")
642
- st.slider("Realism", 1, 5, value=int(draft_labels.get("realism", 3)), key="realism")
643
- with c2:
644
- st.selectbox(
645
- "Assistant behavior",
646
- LABEL_OPTIONS["assistant_behavior"],
647
- index=LABEL_OPTIONS["assistant_behavior"].index(draft_labels.get("assistant_behavior", LABEL_OPTIONS["assistant_behavior"][0]))
648
- if draft_labels.get("assistant_behavior") in LABEL_OPTIONS["assistant_behavior"]
649
- else 0,
650
- key="assistant_behavior",
651
  )
652
- st.slider("Confidence", 1, 5, value=int(draft_labels.get("confidence", 3)), key="confidence")
 
653
 
654
- st.checkbox(
655
- "Multi-turn escalation / persistence",
656
- value=bool(draft_labels.get("multi_turn_escalation", False)),
657
- key="multi_turn_escalation",
658
- )
659
- st.checkbox(
660
- "Assistant followed the rule",
661
- value=bool(draft_labels.get("rule_followed", True)),
662
- key="rule_followed",
663
- )
664
- st.checkbox(
665
- "Borderline / needs review",
666
- value=bool(draft_labels.get("needs_review", False)),
667
- key="needs_review",
668
- )
669
- notes = st.text_area(
670
- "Notes",
671
- value=current_draft.get("notes", ""),
672
- height=150,
673
- placeholder="Explain ambiguity, likely disagreement, or policy edge cases.",
674
- )
675
- submitted = st.form_submit_button("Submit to Hugging Face", use_container_width=True)
676
 
677
- c1, c2 = st.columns(2)
 
678
  with c1:
679
- if st.button("Save draft locally", use_container_width=True):
680
- payload = {
681
- "current_item_id": item["item_id"],
682
- "labels": build_labels_from_state(),
683
- "notes": notes,
684
- "saved_at": now_iso(),
685
- }
686
- path = save_draft(st.session_state["annotator"], payload)
687
- st.success(f"Draft saved to {path}")
688
  with c2:
689
- if st.button("Sync annotation cache", use_container_width=True):
690
- st.session_state["annotations_df"] = load_all_hub_annotations(annotation_repo)
691
- st.success("Reloaded annotation index from Hub.")
692
-
693
- if submitted:
694
- labels = build_labels_from_state()
695
- payload = {
696
- "annotation_id": str(uuid.uuid4()),
697
- "item_id": item["item_id"],
698
- "sample_id": item["sample_id"],
699
- "annotator": st.session_state["annotator"],
700
- "created_at": now_iso(),
701
- "status": "submitted",
702
- "labels": labels,
703
- "notes": notes,
704
- "source": {
705
- "source_dataset_repo": source_repo,
706
- "source_dataset_split": source_split,
707
- "domain": item.get("domain", ""),
708
- "scenario": item.get("scenario", ""),
709
- "distractor_index": int(item.get("distractor_index", 0)),
710
- },
711
- }
 
 
712
  try:
713
- path_in_repo = push_annotation_to_hub(annotation_repo, payload)
714
- st.session_state["annotations_df"] = pd.concat(
715
- [
716
- anns_df,
717
- pd.DataFrame(
718
- [
719
- {
720
- "item_id": payload["item_id"],
721
- "sample_id": payload["sample_id"],
722
- "annotator": payload["annotator"],
723
- "labels": payload["labels"],
724
- "notes": payload["notes"],
725
- "status": payload["status"],
726
- "created_at": payload["created_at"],
727
- "file_path": path_in_repo,
728
- }
729
- ]
730
- ),
731
- ],
732
- ignore_index=True,
733
- )
734
- save_draft(
735
- st.session_state["annotator"],
736
- {
737
- "current_item_id": item["item_id"],
738
- "labels": labels,
739
- "notes": notes,
740
- "saved_at": now_iso(),
741
- },
742
- )
743
- st.success(f"Submitted to Hugging Face as {path_in_repo}")
744
- q = queue_df()
745
- if not q.empty:
746
- set_current_item_id(q.iloc[0]["item_id"])
747
- st.rerun()
748
  except Exception as e:
749
- st.error(f"Upload failed. Saved locally only. Error: {e}")
750
- save_draft(
751
- st.session_state["annotator"],
752
- {
753
- "current_item_id": item["item_id"],
754
- "labels": labels,
755
- "notes": notes,
756
- "saved_at": now_iso(),
757
- },
758
- )
759
-
760
- st.caption("Each submission is a separate file in the annotation dataset repo, so multiple annotators can work in parallel without write conflicts.")
761
 
762
- elif page == "Review":
763
- st.subheader("Inter-rater review")
764
- multi = (
765
- anns_df.groupby("item_id")["annotator"].nunique().reset_index(name="n_annotators")
766
- if not anns_df.empty
767
- else pd.DataFrame(columns=["item_id", "n_annotators"])
768
- )
769
- multi = multi[multi["n_annotators"] >= 2] if not multi.empty else multi
770
-
771
- if multi.empty:
772
- st.info("No items with at least two annotations yet.")
773
- else:
774
- selected_item = st.selectbox("Item with multiple annotations", multi["item_id"].tolist())
775
- row = items_df[items_df["item_id"] == selected_item].iloc[0].to_dict()
776
- sample = samples_df[samples_df["sample_id"] == row["sample_id"]].iloc[0].to_dict()
777
- row.update(sample)
778
-
779
- st.markdown("### Context")
780
- st.code(row["system_instruction"], language="text")
781
- st.code(row["bot_turn"] or "", language="text")
782
- st.code(row["distractor_text"] or "", language="text")
783
-
784
- st.markdown("### Annotations")
785
- sub = anns_df[anns_df["item_id"] == selected_item].copy()
786
- cols = st.columns(min(len(sub), 3)) if len(sub) > 0 else st.columns(1)
787
- for idx, (_, ann) in enumerate(sub.iterrows()):
788
- with cols[idx % len(cols)]:
789
- st.write(f"**{ann['annotator']}**")
790
- st.caption(f"{ann['status']} · {ann['created_at']}")
791
- st.json(ann["labels"])
792
- if ann.get("notes"):
793
- st.caption(ann["notes"])
794
-
795
- agreement = compute_agreement(sub, label_key="assistant_behavior")
796
- c1, c2, c3 = st.columns(3)
797
- c1.metric("Paired items", agreement["paired_items"])
798
- c2.metric("Raw agreement", f"{agreement['raw_agreement']:.2%}" if agreement["raw_agreement"] is not None else "n/a")
799
- c3.metric("Cohen's κ", f"{agreement['cohen_kappa']:.3f}" if agreement["cohen_kappa"] is not None else "n/a")
800
-
801
- elif page == "Dashboard":
802
- st.subheader("Dashboard")
803
- c1, c2, c3, c4 = st.columns(4)
804
- c1.metric("Source samples", len(samples_df))
805
- c2.metric("Source items", len(items_df))
806
- c3.metric("Annotation files", len(anns_df))
807
- c4.metric("My queue", len(queue_df()))
808
-
809
- st.markdown("### Progress by annotator")
810
- if anns_df.empty:
811
- st.info("No annotations yet.")
812
- else:
813
- by_ann = anns_df.groupby("annotator")["item_id"].nunique().reset_index(name="annotated_items").sort_values("annotated_items", ascending=False)
814
- st.dataframe(by_ann, use_container_width=True, hide_index=True)
815
-
816
- st.markdown("### Progress by domain")
817
- joined = anns_df.merge(items_df[["item_id", "domain"]], on="item_id", how="left")
818
- by_domain = joined.groupby("domain")["item_id"].nunique().reset_index(name="annotated_items").sort_values("annotated_items", ascending=False)
819
- st.dataframe(by_domain, use_container_width=True, hide_index=True)
820
-
821
- st.markdown("### Agreement snapshot")
822
- metric = compute_agreement(anns_df, label_key="assistant_behavior")
823
- st.write(metric)
824
-
825
- st.markdown("### Recent annotation previews")
826
- recent = anns_df.sort_values("created_at", ascending=False).head(20).copy()
827
- if "labels" in recent.columns:
828
- recent["assistant_behavior"] = recent["labels"].apply(lambda x: x.get("assistant_behavior") if isinstance(x, dict) else None)
829
- recent["distractor_kind"] = recent["labels"].apply(lambda x: x.get("distractor_kind") if isinstance(x, dict) else None)
830
- st.dataframe(
831
- recent[["annotator", "item_id", "status", "created_at", "assistant_behavior", "distractor_kind", "notes"]],
832
- use_container_width=True,
833
- hide_index=True,
834
- )
835
 
836
  else:
837
- st.subheader("Export")
838
- st.write("Export the merged dataset for downstream analysis or model training.")
 
839
 
840
- merged = items_df.merge(samples_df, on="sample_id", how="left")
841
- if not anns_df.empty:
842
- export_df = merged.merge(anns_df[["item_id", "annotator", "labels", "notes", "status", "created_at"]], on="item_id", how="left")
843
- else:
844
- export_df = merged.copy()
845
- export_df["annotator"] = None
846
- export_df["labels"] = None
847
- export_df["notes"] = None
848
- export_df["status"] = None
849
- export_df["created_at"] = None
850
-
851
- c1, c2 = st.columns(2)
852
  with c1:
853
- jsonl = LOCAL_EXPORT_DIR / "annotations_export.jsonl"
854
- if st.button("Generate JSONL export", use_container_width=True):
855
- with jsonl.open("w", encoding="utf-8") as f:
856
- for _, r in export_df.iterrows():
857
- f.write(json.dumps(r.where(pd.notna(r), None).to_dict(), ensure_ascii=False) + "\n")
858
- st.success(f"Wrote {jsonl}")
859
- st.download_button("Download JSONL", jsonl.read_text(encoding="utf-8"), file_name=jsonl.name, mime="application/json")
860
  with c2:
861
- csv = LOCAL_EXPORT_DIR / "annotations_export.csv"
862
- if st.button("Generate CSV export", use_container_width=True):
863
- export_df.to_csv(csv, index=False)
864
- st.success(f"Wrote {csv}")
865
- st.download_button("Download CSV", csv.read_text(encoding="utf-8"), file_name=csv.name, mime="text/csv")
866
 
867
- st.markdown("### Repository handoff")
868
  st.code(
869
- f"Source repo: {source_repo}\nAnnotation repo: {annotation_repo}\nSplit: {source_split}\nAnnotator: {st.session_state['annotator']}",
 
 
 
870
  language="text",
871
  )
872
 
 
3
  import json
4
  import os
5
  import uuid
6
+ from dataclasses import dataclass
7
  from datetime import datetime, timezone
8
  from pathlib import Path
9
  from typing import Any, Dict, List, Optional, Tuple
 
13
  from datasets import load_dataset
14
  from huggingface_hub import HfApi, hf_hub_download
15
 
16
+ try:
17
+ from openai import OpenAI
18
+ except Exception: # optional
19
+ OpenAI = None
20
+
21
  APP_TITLE = "🧭 LLM Annotation Platform"
22
+ DEFAULT_SOURCE_REPO = os.environ.get("SOURCE_DATASET_REPO", "nvidia/CantTalkAboutThis-Topic-Control-Dataset")
23
+ DEFAULT_SOURCE_SPLITS = os.environ.get("SOURCE_DATASET_SPLITS", "train,test")
24
+ DEFAULT_ANNOTATION_REPO = os.environ.get("ANNOTATION_REPO_ID", "YOUR_USERNAME/llm-distractor-annotations")
25
+ DEFAULT_HF_TOKEN = os.environ.get("HF_TOKEN", os.environ.get("HUGGINGFACE_HUB_TOKEN", ""))
26
+
27
+ CACHE_DIR = Path(os.environ.get("CACHE_DIR", "/data/cache"))
28
+ DRAFT_DIR = Path(os.environ.get("DRAFT_DIR", "/data/drafts"))
29
+ EXPORT_DIR = Path(os.environ.get("EXPORT_DIR", "/data/exports"))
30
+ CACHE_DIR.mkdir(parents=True, exist_ok=True)
31
+ DRAFT_DIR.mkdir(parents=True, exist_ok=True)
32
+ EXPORT_DIR.mkdir(parents=True, exist_ok=True)
33
+
34
+ TACTICS = [
35
+ "rephrasing the request",
36
+ "claiming urgency",
37
+ "just a general explanation",
38
+ "flattery + redirect",
39
+ "hypothetical loophole",
40
+ "persisting after refusal",
41
+ "false framing",
42
+ "other",
43
+ ]
44
+
45
+ TURN_ROLES = ["user", "assistant", "system", "tool"]
46
+
47
+ DEFAULT_OUTPUT_TEMPLATE = {
48
+ "domain": "",
49
+ "scenario": "",
50
+ "system_instruction": "",
51
+ "conversation": [],
52
+ "distractors": [],
53
+ "distractors_multiturn": [],
54
+ "conversation_with_distractors": [],
55
+ "split": "train",
56
+ "_review_status": "draft",
57
+ "_needs_human_review": True,
58
  }
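Because the template above holds list values, the way a blank draft is copied from it matters. The following is a minimal sketch of the difference between a shallow and a deep copy; `TEMPLATE` is a trimmed, hypothetical stand-in for `DEFAULT_OUTPUT_TEMPLATE`, not the real constant.

```python
import copy

# Trimmed, hypothetical stand-in for the real template.
TEMPLATE = {"domain": "", "conversation": [], "split": "train"}

# A shallow copy duplicates the top-level dict but shares the inner list.
shallow = dict(TEMPLATE)
shallow["conversation"].append({"role": "user", "content": "hi"})
leaked = len(TEMPLATE["conversation"])  # the template's own list grew too

TEMPLATE["conversation"].clear()

# A deep copy duplicates the inner list as well.
deep = copy.deepcopy(TEMPLATE)
deep["conversation"].append({"role": "user", "content": "hi"})
isolated = len(TEMPLATE["conversation"])  # the template is untouched

print(leaked, isolated)
```

Copying the lists per draft keeps one annotator's edits from leaking into the next blank draft created in the same process.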
59
 
60
+ # ---------------------------------------------------------
61
+ # Small utilities
62
+ # ---------------------------------------------------------
63
 
64
  def now_iso() -> str:
65
+ return datetime.now(timezone.utc).isoformat(timespec="seconds")
66
 
67
 
68
+ def slugify(text: str, default: str = "item") -> str:
69
+ text = (text or "").strip().lower()
70
+ text = re.sub(r"[^a-z0-9]+", "-", text)
71
+ text = text.strip("-")
72
+ return text or default
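To illustrate what `slugify` produces, here is a standalone sketch. It re-declares the function with an explicit `import re`; the hunk does not show that import, so treat it as an assumption about the file header.

```python
import re


def slugify(text: str, default: str = "item") -> str:
    # Lowercase, collapse each run of non-alphanumerics into one hyphen,
    # then trim hyphens from both ends. Empty results fall back to `default`.
    text = (text or "").strip().lower()
    text = re.sub(r"[^a-z0-9]+", "-", text)
    text = text.strip("-")
    return text or default


print(slugify("Annotator #1 / Banking"))
print(slugify("   "))
print(slugify(None, "annotator"))
```

Note that the character class is ASCII-only, so accented or non-Latin characters are treated as separators rather than preserved.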
73
 
74
 
75
+ def safe_json_loads(value: str, fallback: Any) -> Any:
76
+ try:
77
+ return json.loads(value)
78
+ except Exception:
79
+ return fallback
80
 
81
 
82
+ def pretty_json(value: Any) -> str:
83
+ return json.dumps(value, ensure_ascii=False, indent=2)
 
 
 
84
 
85
 
86
+ def row_to_dict(row: Any) -> Dict[str, Any]:
+     if isinstance(row, pd.Series):
+         return row.to_dict()
+     return dict(row)


+ def series_get(record: Dict[str, Any], *keys: str, default: Any = "") -> Any:
+     for key in keys:
+         if key in record and record[key] not in (None, ""):
+             return record[key]
+     return default
99
 
100
 
101
+ def ensure_list_of_dicts(value: Any) -> List[Dict[str, Any]]:
102
+ if value is None:
103
+ return []
104
+ if isinstance(value, str):
105
+ value = safe_json_loads(value, [])
106
+ if not isinstance(value, list):
107
+ return []
108
+ out = []
109
+ for item in value:
110
+ if isinstance(item, dict):
111
+ out.append(item)
112
+ else:
113
+ out.append({"value": str(item)})
114
+ return out
115
 
116
 
117
+ def ensure_turns(value: Any) -> List[Dict[str, str]]:
118
+ turns = ensure_list_of_dicts(value)
119
+ out = []
120
+ for t in turns:
121
+ out.append({
122
+ "role": str(t.get("role", "user")),
123
+ "content": str(t.get("content", t.get("text", ""))),
124
+ })
125
+ return out
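The two normalizers above can be read as one pipeline: parse a JSON string if needed, coerce to a list of dicts, then reduce each dict to `role` and `content`. The sketch below condenses that into a single hypothetical function, `to_turns`, which is not part of the commit.

```python
import json
from typing import Any, Dict, List


def to_turns(value: Any) -> List[Dict[str, str]]:
    """Hypothetical condensed version of ensure_list_of_dicts + ensure_turns."""
    if isinstance(value, str):
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            value = []
    if not isinstance(value, list):
        return []
    turns = []
    for item in value:
        if not isinstance(item, dict):
            item = {"value": str(item)}  # same wrapper key as the original
        turns.append({
            "role": str(item.get("role", "user")),
            "content": str(item.get("content", item.get("text", ""))),
        })
    return turns


print(to_turns('[{"role": "assistant", "text": "Hello"}]'))
print(to_turns([{"content": "Hi"}]))
print(to_turns(["loose string"]))
```

One consequence worth knowing: a bare string item ends up with empty `content`, because it is wrapped under the key `value`, which the turn normalizer does not read.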
126
 
127
 
128
+ def normalize_conversation(raw: Any) -> List[Dict[str, str]]:
129
+ return ensure_turns(raw)
130
+
131
+
132
+ def normalize_distractors(raw: Any) -> List[Dict[str, str]]:
133
+ items = ensure_list_of_dicts(raw)
 
 
 
 
134
  out = []
135
+ for d in items:
136
+ out.append({
137
+ "bot_turn": str(d.get("bot_turn", d.get("bot turn", ""))),
138
+ "distractor": str(d.get("distractor", d.get("user_turn", d.get("content", "")))),
139
+ })
 
 
140
  return out
141
 
142
 
143
+ def normalize_multiturn(raw: Any) -> List[Dict[str, Any]]:
144
+ items = ensure_list_of_dicts(raw)
145
+ out = []
146
+ for d in items:
147
+ turns = d.get("turns", [])
148
+ if isinstance(turns, str):
149
+ turns = safe_json_loads(turns, [])
150
+ out.append({
151
+ "off_topic_subject": str(d.get("off_topic_subject", "")),
152
+ "tactic_used": str(d.get("tactic_used", "")),
153
+ "bot_turn": str(d.get("bot_turn", d.get("bot turn", ""))),
154
+ "turns_json": pretty_json(ensure_turns(turns)) if turns else "[]",
155
+ })
156
+ return out
157
 
158
 
159
+ def build_conversation_with_distractors(conversation: List[Dict[str, str]], multiturn: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
160
+ """
161
+ Simple automatic build:
162
+ - keep the base conversation as the first item
163
+ - add a variant conversation for each multiturn distractor by appending the user turns
164
+ after the matching bot_turn when possible.
165
+ """
166
+ if not conversation:
167
+ return []
168
+
169
+ variants = [{"variant": "base", "conversation": conversation}]
170
+ for idx, d in enumerate(multiturn):
171
+ turns = safe_json_loads(d.get("turns_json", "[]"), [])
172
+ if not isinstance(turns, list):
173
+ turns = []
174
+ bot_turn = str(d.get("bot_turn", "")).strip()
175
+
176
+ conv = []
177
+ inserted = False
178
+ for turn in conversation:
179
+ conv.append(turn)
180
+ if not inserted and bot_turn and turn.get("role", "").lower() == "assistant" and turn.get("content", "").strip() == bot_turn:
181
+ conv.extend(ensure_turns(turns))
182
+ inserted = True
183
+ if not inserted:
184
+ # Fallback: append to end
185
+ conv.extend(ensure_turns(turns))
186
+ variants.append({
187
+ "variant": f"distractor_{idx+1}",
188
+ "conversation": conv,
189
+ })
190
+ return variants
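The anchoring rule described in the docstring (insert after the matching assistant turn, otherwise append at the end) can be sketched in isolation. `insert_after_bot_turn` is a hypothetical helper written for this illustration; the sample conversation is invented.

```python
from typing import Dict, List

Turn = Dict[str, str]


def insert_after_bot_turn(conversation: List[Turn], bot_turn: str, extra: List[Turn]) -> List[Turn]:
    """Splice `extra` after the first assistant turn whose content equals `bot_turn`."""
    out: List[Turn] = []
    inserted = False
    target = bot_turn.strip()
    for turn in conversation:
        out.append(turn)
        is_match = (
            bool(target)
            and turn.get("role", "").lower() == "assistant"
            and turn.get("content", "").strip() == target
        )
        if not inserted and is_match:
            out.extend(extra)
            inserted = True
    if not inserted:
        out.extend(extra)  # fallback: no anchor found, append at the end
    return out


base = [
    {"role": "user", "content": "I need to reset my card PIN."},
    {"role": "assistant", "content": "Sure, I can help with that."},
    {"role": "user", "content": "Thanks."},
]
extra = [{"role": "user", "content": "By the way, who will win the election?"}]

anchored = insert_after_bot_turn(base, "Sure, I can help with that.", extra)
fallback = insert_after_bot_turn(base, "no such turn", extra)
print([t["content"] for t in anchored])
print([t["content"] for t in fallback])
```

The match is exact after stripping whitespace, so a `bot_turn` that was paraphrased or truncated silently takes the fallback path and lands at the end of the conversation.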
191
+
192
+
193
+ def record_from_inputs(
194
+ domain: str,
195
+ scenario: str,
196
+ system_instruction: str,
197
+ conversation: List[Dict[str, str]],
198
+ distractors: List[Dict[str, str]],
199
+ multiturn: List[Dict[str, Any]],
200
+ conversation_with_distractors: Any,
201
+ split: str,
202
+ review_status: str,
203
+ needs_review: bool,
204
+ source_split: str = "",
205
+ source_index: Optional[int] = None,
206
+ source_repo: str = "",
207
+ annotator: str = "",
208
+ ) -> Dict[str, Any]:
209
+ record = {
210
+ "domain": domain.strip(),
211
+ "scenario": scenario.strip(),
212
+ "system_instruction": system_instruction.strip(),
213
+ "conversation": conversation,
214
+ "distractors": distractors,
215
+ "distractors_multiturn": multiturn,
216
+ "conversation_with_distractors": conversation_with_distractors,
217
+ "split": split,
218
+ "_review_status": review_status,
219
+ "_needs_human_review": needs_review,
220
+ "_annotator": annotator,
221
+ "_source_repo": source_repo,
222
+ "_source_split": source_split,
223
+ "_source_index": source_index,
224
+ "_created_at": now_iso(),
225
+ "_updated_at": now_iso(),
226
  }
227
+ return record
228
 
229
 
230
+ def record_to_exportable(record: Dict[str, Any]) -> Dict[str, Any]:
231
+ out = dict(record)
232
+ # keep the same top-level structure as the source file, but preserve provenance
233
+ return out
 
 
 
 
 
 
234
 
235
 
236
+ # ---------------------------------------------------------
237
+ # Data loading
238
+ # ---------------------------------------------------------
239
+
240
+ @st.cache_data(show_spinner=False)
241
+ def load_hf_split(repo_id: str, split: str) -> List[Dict[str, Any]]:
242
+ ds = load_dataset(repo_id, split=split)
243
+ return [dict(r) for r in ds]
244
 
245
 
246
+ @st.cache_data(show_spinner=False)
247
+ def load_hf_all_splits(repo_id: str, splits_csv: str) -> List[Dict[str, Any]]:
248
+ all_rows: List[Dict[str, Any]] = []
249
+ for split in [s.strip() for s in splits_csv.split(",") if s.strip()]:
250
+ try:
251
+ rows = load_hf_split(repo_id, split)
252
+ for i, row in enumerate(rows):
253
+ row = dict(row)
254
+ row.setdefault("split", split)
255
+ row.setdefault("_source_split", split)
256
+ row.setdefault("_source_index", i)
257
+ row.setdefault("_source_repo", repo_id)
258
+ all_rows.append(row)
259
+ except Exception:
260
+ continue
261
+ return all_rows
262
+
263
+
264
+ def load_local_json(path: Path, split_default: str = "train") -> List[Dict[str, Any]]:
265
+ if path.suffix.lower() == ".jsonl":
266
+ rows = []
267
+ with path.open("r", encoding="utf-8") as f:
268
+ for line in f:
269
+ if line.strip():
270
+ rows.append(json.loads(line))
271
+ for i, row in enumerate(rows):
272
+ row.setdefault("split", split_default)
273
+ row.setdefault("_source_split", split_default)
274
+ row.setdefault("_source_index", i)
275
+ return rows
276
  with path.open("r", encoding="utf-8") as f:
277
+ data = json.load(f)
278
+ if not isinstance(data, list):
279
+ raise ValueError("Local JSON must contain a list of records.")
280
+ for i, row in enumerate(data):
281
+ row.setdefault("split", split_default)
282
+ row.setdefault("_source_split", split_default)
283
+ row.setdefault("_source_index", i)
284
+ return data
285
 
286
 
287
+ def coerce_source_records(raw_records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
288
+ out = []
289
+ for i, r in enumerate(raw_records):
290
+ rec = dict(r)
291
+ rec.setdefault("split", rec.get("_source_split", "train"))
292
+ rec.setdefault("_source_index", i)
293
+ out.append(rec)
294
+ return out
295
 
 
 
 
296
 
297
+ # ---------------------------------------------------------
298
+ # HF persistence
299
+ # ---------------------------------------------------------
300
+
301
+ def hf_client() -> HfApi:
302
+ return HfApi(token=DEFAULT_HF_TOKEN or None)
303
+
304
+
305
+ def ensure_annotation_repo(repo_id: str) -> None:
306
+ if not repo_id or repo_id.startswith("YOUR_"):
307
+ return
308
+ hf_client().create_repo(repo_id=repo_id, repo_type="dataset", private=True, exist_ok=True)
309
+
310
+
311
+ def upload_record_to_hf(repo_id: str, record: Dict[str, Any], annotator: str) -> str:
312
+ ensure_annotation_repo(repo_id)
313
+ stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
314
+ safe_name = slugify(f"{annotator}-{record.get('domain','')}-{record.get('scenario','')}", "entry")
315
+ filename = f"entries/{slugify(annotator, 'annotator')}/{stamp}_{safe_name}_{uuid.uuid4().hex[:8]}.json"
316
+
317
+ tmp_dir = DRAFT_DIR / "_tmp_uploads"
318
+ tmp_dir.mkdir(parents=True, exist_ok=True)
319
+ tmp_file = tmp_dir / f"{uuid.uuid4().hex}.json"
320
+ with tmp_file.open("w", encoding="utf-8") as f:
321
+ json.dump(record_to_exportable(record), f, ensure_ascii=False, indent=2)
322
+
323
+ hf_client().upload_file(
324
+ path_or_fileobj=str(tmp_file),
325
+ path_in_repo=filename,
326
+ repo_id=repo_id,
327
+ repo_type="dataset",
328
+ commit_message=f"Add annotation entry by {annotator}",
329
+ )
330
+ return filename
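The upload function writes every submission to a fresh path, which is what lets several annotators push without overwriting each other. Here is a network-free sketch of just the naming step; `entry_path` is a hypothetical helper, and `slugify` is re-declared so the block runs on its own.

```python
import re
import uuid
from datetime import datetime, timezone


def slugify(text: str, default: str = "item") -> str:
    text = re.sub(r"[^a-z0-9]+", "-", (text or "").strip().lower()).strip("-")
    return text or default


def entry_path(annotator: str, domain: str, scenario: str) -> str:
    """Hypothetical helper mirroring the naming used by the upload function."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    safe_name = slugify(f"{annotator}-{domain}-{scenario}", "entry")
    suffix = uuid.uuid4().hex[:8]  # random part keeps same-second uploads distinct
    return f"entries/{slugify(annotator, 'annotator')}/{stamp}_{safe_name}_{suffix}.json"


first = entry_path("Jane Doe", "banking", "card reset")
second = entry_path("Jane Doe", "banking", "card reset")
print(first)
print(first != second)
```

The timestamp has one-second resolution, so the random suffix is the part that actually separates two uploads of the same record made in quick succession.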
331
+
332
+
333
+ def list_uploaded_files(repo_id: str) -> List[str]:
334
+ if not repo_id or repo_id.startswith("YOUR_"):
335
+ return []
336
+ try:
337
+ return hf_client().list_repo_files(repo_id, repo_type="dataset")
338
+ except Exception:
339
+ return []
340
 
 
341
 
342
+ # ---------------------------------------------------------
343
+ # Local drafts / state
344
+ # ---------------------------------------------------------
345
 
346
+ def annotator_draft_path(annotator: str) -> Path:
+     safe = slugify(annotator, "annotator")
+     return DRAFT_DIR / f"{safe}.json"
+
+
+ def save_draft_local(annotator: str, payload: Dict[str, Any]) -> Path:
+     path = annotator_draft_path(annotator)
      with path.open("w", encoding="utf-8") as f:
          json.dump(payload, f, ensure_ascii=False, indent=2)
      return path


+ def load_draft_local(annotator: str) -> Dict[str, Any]:
+     path = annotator_draft_path(annotator)
      if not path.exists():
          return {}
      try:
+         with path.open("r", encoding="utf-8") as f:
+             return json.load(f)
      except Exception:
          return {}
367
 
368
 
369
+ def append_submission_index(entry: Dict[str, Any]) -> None:
370
+ idx = DRAFT_DIR / "submissions_index.jsonl"
371
+ with idx.open("a", encoding="utf-8") as f:
372
+ f.write(json.dumps(entry, ensure_ascii=False) + "\n")
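The submissions index is an append-only JSONL file: one JSON object per line. The diff only shows the writer, so the reader below (`read_index`) is a hypothetical counterpart, and the sample entries are invented for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def append_entry(index_path: Path, entry: Dict[str, Any]) -> None:
    # Append mode means earlier entries are never rewritten.
    with index_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")


def read_index(index_path: Path) -> List[Dict[str, Any]]:
    """Hypothetical reader: parse every non-empty line as one JSON object."""
    if not index_path.exists():
        return []
    rows = []
    with index_path.open("r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                rows.append(json.loads(line))
    return rows


with tempfile.TemporaryDirectory() as tmp:
    idx = Path(tmp) / "submissions_index.jsonl"
    append_entry(idx, {"annotator": "annotator_1", "path": "entries/a/1.json"})
    append_entry(idx, {"annotator": "annotator_2", "path": "entries/b/2.json"})
    rows = read_index(idx)

print(len(rows), rows[0]["annotator"])
```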
373
 
 
 
 
 
 
374
 
375
+ # ---------------------------------------------------------
376
+ # Editing helpers
377
+ # ---------------------------------------------------------
378
 
379
+ def df_from_turns(turns: List[Dict[str, str]]) -> pd.DataFrame:
380
  if not turns:
381
+ return pd.DataFrame([{"role": "user", "content": ""}])
382
+ return pd.DataFrame(turns)
383
+
384
+
385
+ def turns_from_df(df: pd.DataFrame) -> List[Dict[str, str]]:
386
+ if df is None or df.empty:
387
+ return []
388
+ out = []
389
+ for _, row in df.iterrows():
390
+ role = str(row.get("role", "")).strip()
391
+ content = str(row.get("content", "")).strip()
392
+ if role or content:
393
+ out.append({"role": role or "user", "content": content})
394
+ return out
395
+
396
+
397
+ def df_from_simple_distractors(items: List[Dict[str, str]]) -> pd.DataFrame:
398
+ if not items:
399
+ return pd.DataFrame([{"bot_turn": "", "distractor": ""}])
400
+ return pd.DataFrame(items)
401
+
402
 
403
+ def simple_distractors_from_df(df: pd.DataFrame) -> List[Dict[str, str]]:
404
+ if df is None or df.empty:
405
+ return []
406
+ out = []
407
+ for _, row in df.iterrows():
408
+ bot_turn = str(row.get("bot_turn", "")).strip()
409
+ distractor = str(row.get("distractor", "")).strip()
410
+ if bot_turn or distractor:
411
+ out.append({"bot_turn": bot_turn, "distractor": distractor})
412
+ return out
413
 
 
 
 
 
 
414
 
415
+ def df_from_multiturn(items: List[Dict[str, Any]]) -> pd.DataFrame:
416
+ if not items:
417
+ return pd.DataFrame([{"off_topic_subject": "", "tactic_used": TACTICS[0], "bot_turn": "", "turns_json": "[]"}])
418
+ return pd.DataFrame(items)
419
 
 
 
 
420
 
421
+ def multiturn_from_df(df: pd.DataFrame) -> List[Dict[str, Any]]:
422
+ if df is None or df.empty:
423
+ return []
424
+ out = []
425
+ for _, row in df.iterrows():
426
+ subject = str(row.get("off_topic_subject", "")).strip()
427
+ tactic = str(row.get("tactic_used", "")).strip()
428
+ bot_turn = str(row.get("bot_turn", "")).strip()
429
+ turns_json = str(row.get("turns_json", "[]")).strip()
430
+ turns = safe_json_loads(turns_json, [])
431
+ if isinstance(turns, list):
432
+ turns = ensure_turns(turns)
433
+ else:
434
+ turns = []
435
+ if subject or bot_turn or turns:
436
+ out.append({
437
+ "off_topic_subject": subject,
438
+ "tactic_used": tactic,
439
+ "bot_turn": bot_turn,
440
+ "turns_json": pretty_json(turns) if turns else "[]",
441
+ })
442
+ return out
443
+
444
+
445
+ def normalize_draft_from_record(record: Dict[str, Any], source_repo: str = "", source_split: str = "", source_index: Optional[int] = None) -> Dict[str, Any]:
446
+ conversation = normalize_conversation(record.get("conversation"))
447
+ distractors = normalize_distractors(record.get("distractors"))
448
+ multiturn = normalize_multiturn(record.get("distractors_multiturn"))
449
+ convwd = record.get("conversation_with_distractors", [])
450
+ if not isinstance(convwd, list):
451
+ convwd = []
452
+ if not convwd and multiturn:
453
+ convwd = build_conversation_with_distractors(conversation, multiturn)
454
  return {
455
+ "domain": str(series_get(record, "domain", default="")),
456
+ "scenario": str(series_get(record, "scenario", default="")),
457
+ "system_instruction": str(series_get(record, "system_instruction", default="")),
458
+ "conversation": conversation,
459
+ "distractors": distractors,
460
+ "distractors_multiturn": multiturn,
461
+ "conversation_with_distractors": convwd,
462
+ "split": str(series_get(record, "split", "_source_split", default="train")),
463
+ "_review_status": str(series_get(record, "_review_status", default="draft")),
464
+ "_needs_human_review": bool(record.get("_needs_human_review", True)),
465
+ "_source_repo": source_repo or str(series_get(record, "_source_repo", default="")),
466
+ "_source_split": source_split or str(series_get(record, "_source_split", default="")),
467
+ "_source_index": source_index if source_index is not None else record.get("_source_index"),
468
+ "_annotator": str(series_get(record, "_annotator", default="")),
469
  }
470
 
471
 
472
+ def make_blank_draft() -> Dict[str, Any]:
473
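+ # Copy list values as well, so editing a draft never mutates the shared template.
+ return {k: (list(v) if isinstance(v, list) else v) for k, v in DEFAULT_OUTPUT_TEMPLATE.items()}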
 
 
 
 
 
474
 
475
 
476
+ def generate_llm_distractor_draft(draft: Dict[str, Any], base_url: str, model: str, mode: str = "simple") -> Optional[Dict[str, Any]]:
477
+ if OpenAI is None:
478
+ st.error("The openai package is not installed.")
479
+ return None
480
+
481
+ client = OpenAI(base_url=base_url, api_key="lm-studio")
482
+ convo = draft.get("conversation", [])
483
+ sysinst = draft.get("system_instruction", "")
484
+ domain = draft.get("domain", "")
485
+ scenario = draft.get("scenario", "")
486
+
487
+ if mode == "simple":
488
+ prompt = f"""
489
+ You are helping create a human-made distractor dataset for a task-oriented assistant.
490
 
491
+ Domain: {domain}
492
+ Scenario: {scenario}
493
 
494
+ System instruction:
495
+ {sysinst}
496
+
497
+ Conversation:
498
+ {json.dumps(convo, ensure_ascii=False, indent=2)}
499
+
500
+ Write ONE realistic off-topic distractor pair:
501
+ - bot_turn: exact assistant turn from the conversation to anchor after
502
+ - distractor: the user's off-topic message
503
+
504
+ Return only valid JSON with keys bot_turn and distractor.
505
+ """
506
+ else:
507
+ prompt = f"""
508
+ You are helping create a human-made multi-turn distractor dataset for a task-oriented assistant.
509
+
510
+ Domain: {domain}
511
+ Scenario: {scenario}
512
+
513
+ System instruction:
514
+ {sysinst}
515
+
516
+ Conversation:
517
+ {json.dumps(convo, ensure_ascii=False, indent=2)}
518
+
519
+ Write ONE multi-turn distractor item:
520
+ - off_topic_subject
521
+ - tactic_used
522
+ - bot_turn
523
+ - turns: a JSON list of 3-5 turns that starts with a user off-topic request and escalates politely after refusals.
524
+
525
+ Return only valid JSON with keys off_topic_subject, tactic_used, bot_turn, turns.
526
+ """
527
 
 
 
528
  try:
529
+ response = client.chat.completions.create(
530
+ model=model,
531
+ messages=[
532
+ {"role": "system", "content": "Return valid JSON only."},
533
+ {"role": "user", "content": prompt},
534
+ ],
535
+ temperature=0.8,
536
+ max_tokens=1500,
537
+ )
538
+ raw = response.choices[0].message.content.strip()
539
+ if raw.startswith("```"):
540
+ raw = raw.strip("`")
541
+ raw = raw.replace("json\n", "", 1)
542
+ return json.loads(raw)
543
+ except Exception as e:
544
+ st.error(f"Local LLM generation failed: {e}")
545
+ return None
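Local models often wrap their JSON in a Markdown code fence, sometimes with a language tag and sometimes without. The function above handles the common case with string stripping. As an alternative sketch, a regex can remove one optional surrounding fence before parsing. `extract_json` is a hypothetical helper, not part of the commit.

```python
import json
import re
from typing import Any, Optional

# One optional leading fence with an optional language tag, and one trailing fence.
_FENCE_RE = re.compile(r"^```[a-zA-Z0-9_-]*\s*\n(.*?)\n?```\s*$", re.DOTALL)


def extract_json(raw: str) -> Optional[Any]:
    """Return the parsed JSON payload, or None if the text is not valid JSON."""
    text = raw.strip()
    match = _FENCE_RE.match(text)
    if match:
        text = match.group(1).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


fenced = '```json\n{"bot_turn": "Hi", "distractor": "What is the weather?"}\n```'
print(extract_json(fenced))
print(extract_json('```\n[1, 2]\n```'))
print(extract_json('{"a": 1}'))
print(extract_json("not json"))
```

Returning `None` instead of raising keeps the caller's error handling in one place; whether that suits the app depends on how it surfaces failures to the annotator.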
546
 
547
 
548
+ # ---------------------------------------------------------
549
+ # UI components
550
+ # ---------------------------------------------------------
551
 
552
+ def render_preview_df(records: List[Dict[str, Any]], split_filter: str, search_text: str = "") -> pd.DataFrame:
553
+ rows = []
554
+ search_text = search_text.lower().strip()
555
+ for i, r in enumerate(records):
556
+ if split_filter and split_filter != "All" and str(r.get("split", r.get("_source_split", ""))) != split_filter:
557
+ continue
558
+ domain = str(series_get(r, "domain", default=""))
559
+ scenario = str(series_get(r, "scenario", default=""))
560
+ if search_text:
561
+ joined = " ".join([domain, scenario, str(series_get(r, "system_instruction", default=""))]).lower()
562
+ if search_text not in joined:
563
+ continue
564
+ convo = normalize_conversation(r.get("conversation"))
565
+ preview = ""
566
+ if convo:
567
+ for t in reversed(convo):
568
+ if str(t.get("role", "")).lower() == "user":
569
+ preview = str(t.get("content", ""))
570
+ break
571
+ if not preview:
572
+ preview = str(convo[-1].get("content", ""))
573
+ rows.append({
574
+ "#": i,
575
+ "split": str(r.get("split", r.get("_source_split", ""))),
576
+ "domain": domain,
577
+ "scenario": scenario,
578
+ "conversation_preview": (preview[:120] + "…") if len(preview) > 120 else preview,
579
+ "distractor_count": len(r.get("distractors", [])) if isinstance(r.get("distractors"), list) else 0,
580
+ "multi_count": len(r.get("distractors_multiturn", [])) if isinstance(r.get("distractors_multiturn"), list) else 0,
581
+ })
582
+ return pd.DataFrame(rows)
583
+
584
+
585
+ def current_source_record(records: List[Dict[str, Any]], idx: int) -> Optional[Dict[str, Any]]:
586
+ if idx < 0 or idx >= len(records):
587
+ return None
588
+ return records[idx]
589
+
590
+
591
+ def clean_editor_df(df: pd.DataFrame) -> pd.DataFrame:
592
+ if df is None:
593
+ return pd.DataFrame()
594
+ df = df.copy()
595
+ for col in df.columns:
596
+ df[col] = df[col].fillna("")
597
+ return df
598
+
599
+
600
+ # ---------------------------------------------------------
601
+ # App
602
+ # ---------------------------------------------------------
603
 
604
+ def main() -> None:
605
+ st.set_page_config(page_title=APP_TITLE, page_icon="🧭", layout="wide")
606
+ st.title(APP_TITLE)
607
+ st.caption("Simple collaborative editor for human-made distractor datasets.")
608
+
609
+ # Session defaults
610
+ for key, default in [
611
+ ("annotator", "annotator_1"),
612
+ ("source_mode", "HF dataset"),
613
+ ("source_repo", DEFAULT_SOURCE_REPO),
614
+ ("source_splits", DEFAULT_SOURCE_SPLITS),
615
+ ("annotation_repo", DEFAULT_ANNOTATION_REPO),
616
+ ("source_file_name", ""),
617
+ ("source_row_idx", 0),
618
+ ("draft", make_blank_draft()),
619
+ ("draft_source_idx", None),
620
+ ("draft_source_split", "train"),
621
+ ("draft_mode", "new"),
622
+ ("last_saved_message", ""),
623
+ ("llm_base_url", "http://localhost:1234/v1"),
624
+ ("llm_model", "gemma-4-e2b-it"),
625
+ ("llm_mode", "simple"),
626
+ ]:
627
+ if key not in st.session_state:
628
+ st.session_state[key] = default
629
+
630
+ # Sidebar
631
+ st.sidebar.header("Workspace")
632
+ st.session_state["annotator"] = st.sidebar.text_input("Annotator name", value=st.session_state["annotator"])
633
+ st.session_state["source_mode"] = st.sidebar.radio("Source mode", ["HF dataset", "Upload local JSON/JSONL"], index=0 if st.session_state["source_mode"] == "HF dataset" else 1)
634
+ st.session_state["source_repo"] = st.sidebar.text_input("Source dataset repo", value=st.session_state["source_repo"])
635
+ st.session_state["source_splits"] = st.sidebar.text_input("Source splits (comma-separated)", value=st.session_state["source_splits"])
636
+ st.session_state["annotation_repo"] = st.sidebar.text_input("Annotation dataset repo", value=st.session_state["annotation_repo"])
637
+ st.sidebar.divider()
638
+ st.session_state["llm_base_url"] = st.sidebar.text_input("Local LLM base URL", value=st.session_state["llm_base_url"])
639
+ st.session_state["llm_model"] = st.sidebar.text_input("Local LLM model", value=st.session_state["llm_model"])
640
+ st.session_state["llm_mode"] = st.sidebar.selectbox("LLM generation mode", ["simple", "multiturn"], index=0 if st.session_state["llm_mode"] == "simple" else 1)
641
+ st.sidebar.caption("For LM Studio / OpenAI-compatible local servers, keep the base URL like http://localhost:1234/v1.")
642
+ st.sidebar.divider()
643
+
644
+ uploaded_file = None
645
+ if st.session_state["source_mode"] == "Upload local JSON/JSONL":
646
+ uploaded_file = st.sidebar.file_uploader("Upload source file", type=["json", "jsonl"])
647
+ if uploaded_file is not None:
648
+ st.session_state["source_file_name"] = uploaded_file.name
649
+
650
+ page = st.sidebar.radio("Page", ["Browse", "Edit / Create", "Drafts", "Export / Sync"], index=0)
651
+ st.sidebar.caption(f"HF token present: {'yes' if DEFAULT_HF_TOKEN else 'no'}")
652
+ st.sidebar.caption(f"Draft folder: {DRAFT_DIR}")
653
+
654
+ # Load source records
655
  if "source_records" not in st.session_state:
656
  st.session_state["source_records"] = None
657
 
658
  if st.session_state["source_records"] is None:
659
+ with st.spinner("Loading source data..."):
660
  try:
661
+ if st.session_state["source_mode"] == "HF dataset":
662
+ records = load_hf_all_splits(st.session_state["source_repo"], st.session_state["source_splits"])
663
+ else:
664
+ if uploaded_file is not None:
665
+ suffix = Path(uploaded_file.name).suffix.lower()
666
+ tmp_path = DRAFT_DIR / f"uploaded_source{suffix}"
667
+ tmp_path.write_bytes(uploaded_file.getbuffer())
668
+ records = load_local_json(tmp_path)
669
+ else:
670
+ records = []
671
+ st.session_state["source_records"] = coerce_source_records(records)
672
  except Exception as e:
673
+ st.session_state["source_records"] = []
674
+ st.error(f"Could not load source data: {e}")
675
+
676
+ records: List[Dict[str, Any]] = st.session_state["source_records"] or []
677
+
678
+ if page == "Browse":
679
+ st.subheader("Browse source dataset")
680
+ split_choices = ["All"] + sorted({str(r.get("split", r.get("_source_split", ""))) for r in records if str(r.get("split", r.get("_source_split", "")))} )
681
+ col1, col2 = st.columns([1, 1])
682
+ with col1:
683
+ split_filter = st.selectbox("Filter split", split_choices, index=0)
684
+ with col2:
685
+ search_text = st.text_input("Search text (domain / scenario / instruction)", value="")
686
+ preview_df = render_preview_df(records, split_filter, search_text)
687
+ st.write(f"Rows loaded: {len(preview_df)}")
688
+ st.dataframe(preview_df, use_container_width=True, hide_index=True)
689
+
690
+ if preview_df.empty:
691
+ st.info("No rows match the current filter.")
692
+ else:
693
+ picked = st.number_input("Pick row position in the table above (0-based)", min_value=0, max_value=max(0, len(preview_df) - 1), value=0, step=1)
694
+ if st.button("Load selected row into editor"):
695
+ selected_global_idx = int(preview_df.iloc[int(picked)]["#"])
696
+ st.session_state["draft_mode"] = "clone"
697
+ st.session_state["draft_source_idx"] = selected_global_idx
698
+ st.session_state["draft_source_split"] = str(records[selected_global_idx].get("split", records[selected_global_idx].get("_source_split", "train")))
699
+ st.session_state["draft"] = normalize_draft_from_record(
700
+ records[selected_global_idx],
701
+ source_repo=st.session_state["source_repo"],
702
+ source_split=st.session_state["draft_source_split"],
703
+ source_index=selected_global_idx,
704
+ )
705
+ st.success(f"Loaded row {selected_global_idx} into the editor.")
706
+ st.rerun()
707
+
708
+ st.markdown("### Record inspector")
709
+ if preview_df.empty:
710
+ st.stop()
711
+ idx = int(preview_df.iloc[int(picked)]["#"])
712
+ rec = records[idx]
713
+ st.json({
714
+ "domain": rec.get("domain", ""),
715
+ "scenario": rec.get("scenario", ""),
716
+ "split": rec.get("split", rec.get("_source_split", "")),
717
+ "keys": list(rec.keys()),
718
+ })
719
+ st.markdown("**Conversation preview**")
720
+ st.code(pretty_json(rec.get("conversation", [])), language="json")
721
+ st.markdown("**Distractors preview**")
722
+ st.code(pretty_json(rec.get("distractors", [])), language="json")
723
+
724
+ elif page == "Edit / Create":
725
+ st.subheader("Create or edit an entry")
726
  left, right = st.columns([1.05, 0.95], gap="large")
727
 
728
  with left:
729
+ c1, c2, c3 = st.columns([1, 1, 1])
730
+ with c1:
731
+ if st.button("New blank entry"):
732
+ st.session_state["draft_mode"] = "new"
733
+ st.session_state["draft_source_idx"] = None
734
+ st.session_state["draft"] = make_blank_draft()
735
+ st.success("Blank entry created.")
736
  st.rerun()
737
+ with c2:
738
+ if st.button("Reset draft from source row"):
739
+ idx = st.session_state.get("draft_source_idx")
740
+ if idx is not None and 0 <= idx < len(records):
741
+ st.session_state["draft"] = normalize_draft_from_record(
742
+ records[idx],
743
+ source_repo=st.session_state["source_repo"],
744
+ source_split=str(records[idx].get("split", records[idx].get("_source_split", "train"))),
745
+ source_index=idx,
746
+ )
747
+ st.success("Draft reset from source row.")
748
+ else:
749
+ st.warning("No source row selected.")
750
+ with c3:
751
+ if st.button("Auto-build conversation_with_distractors"):
752
+ d = st.session_state["draft"]
753
+ d["conversation_with_distractors"] = build_conversation_with_distractors(d.get("conversation", []), d.get("distractors_multiturn", []))
754
+ st.session_state["draft"] = d
755
+ st.success("Built conversation_with_distractors.")
756
  st.rerun()
757
 
758
+ st.markdown("### Source row")
759
+ row_idx = st.number_input(
760
+ "Source row index",
761
+ min_value=0,
762
+ max_value=max(0, len(records) - 1),
763
+ value=int(st.session_state.get("draft_source_idx") or 0),
764
+ step=1,
765
  )
766
+ source_split_guess = ""
767
+ if records:
768
+ source_split_guess = str(records[int(row_idx)].get("split", records[int(row_idx)].get("_source_split", "train")))
769
+ st.write("Detected source split:", source_split_guess or "n/a")
770
+ if st.button("Load this row"):
771
+ idx = int(row_idx)
772
+ if 0 <= idx < len(records):
773
+ st.session_state["draft_mode"] = "clone"
774
+ st.session_state["draft_source_idx"] = idx
775
+ st.session_state["draft_source_split"] = str(records[idx].get("split", records[idx].get("_source_split", "train")))
776
+ st.session_state["draft"] = normalize_draft_from_record(
777
+ records[idx],
778
+ source_repo=st.session_state["source_repo"],
779
+ source_split=st.session_state["draft_source_split"],
780
+ source_index=idx,
781
  )
782
+ st.success(f"Loaded source row {idx}.")
783
+ st.rerun()
784
 
785
+ draft = st.session_state["draft"]
786
+
787
+ top1, top2, top3 = st.columns(3)
788
+ with top1:
789
+ draft["split"] = st.selectbox("Entry split", ["train", "test"], index=0 if str(draft.get("split", "train")) == "train" else 1)
790
+ with top2:
791
+ draft["_review_status"] = st.selectbox("Review status", ["draft", "approved", "failed"], index=["draft", "approved", "failed"].index(str(draft.get("_review_status", "draft"))))
792
+ with top3:
793
+ draft["_needs_human_review"] = st.checkbox("Needs human review", value=bool(draft.get("_needs_human_review", True)))
794
+
795
+ draft["domain"] = st.text_input("Domain", value=str(draft.get("domain", "")))
796
+ draft["scenario"] = st.text_input("Scenario", value=str(draft.get("scenario", "")))
797
+ draft["system_instruction"] = st.text_area("System instruction", value=str(draft.get("system_instruction", "")), height=180)
798
+
799
+ st.markdown("#### Conversation")
800
+ conv_df = clean_editor_df(pd.DataFrame(draft.get("conversation", [{"role": "user", "content": ""}])))
801
+ conv_df = st.data_editor(
802
+ conv_df,
803
+ num_rows="dynamic",
804
+ use_container_width=True,
805
+ column_config={
806
+ "role": st.column_config.SelectboxColumn("role", options=TURN_ROLES, required=True),
807
+ "content": st.column_config.TextColumn("content", required=True),
808
+ },
809
+ hide_index=True,
810
+ key="conversation_editor",
811
+ )
812
+ draft["conversation"] = turns_from_df(conv_df)
813
+ if st.button("Clear conversation"):
814
+ draft["conversation"] = []
815
+ st.session_state["draft"] = draft
816
+ st.rerun()
817
+
818
+ st.markdown("#### Simple distractors")
819
+ simple_df = clean_editor_df(pd.DataFrame(draft.get("distractors", [{"bot_turn": "", "distractor": ""}])))
820
+ simple_df = st.data_editor(
821
+ simple_df,
822
+ num_rows="dynamic",
823
+ use_container_width=True,
824
+ column_config={
825
+ "bot_turn": st.column_config.TextColumn("bot_turn"),
826
+ "distractor": st.column_config.TextColumn("distractor"),
827
+ },
828
+ hide_index=True,
829
+ key="simple_distractors_editor",
830
+ )
831
+ draft["distractors"] = simple_distractors_from_df(simple_df)
832
+ if st.button("Clear simple distractors"):
833
+ draft["distractors"] = []
834
+ st.session_state["draft"] = draft
835
+ st.rerun()
836
+
837
+ st.markdown("#### Multi-turn distractors")
838
+ multi_df = clean_editor_df(pd.DataFrame(draft.get("distractors_multiturn", [{"off_topic_subject": "", "tactic_used": TACTICS[0], "bot_turn": "", "turns_json": "[]"}])))
839
+ multi_df = st.data_editor(
840
+ multi_df,
841
+ num_rows="dynamic",
842
+ use_container_width=True,
843
+ column_config={
844
+ "off_topic_subject": st.column_config.TextColumn("off_topic_subject"),
845
+ "tactic_used": st.column_config.SelectboxColumn("tactic_used", options=TACTICS, required=False),
846
+ "bot_turn": st.column_config.TextColumn("bot_turn"),
847
+ "turns_json": st.column_config.TextColumn("turns_json", help="JSON list of turns, e.g. [{\"role\":\"user\",\"content\":\"...\"}]"),
848
+ },
849
+ hide_index=True,
850
+ key="multi_distractors_editor",
851
+ )
852
+ draft["distractors_multiturn"] = multiturn_from_df(multi_df)
853
+ if st.button("Clear multi-turn distractors"):
854
+ draft["distractors_multiturn"] = []
855
+ st.session_state["draft"] = draft
856
+ st.rerun()
857
+
858
+ st.markdown("#### Conversation with distractors")
859
+ if st.button("Auto-generate conversation_with_distractors from current draft"):
860
+ draft["conversation_with_distractors"] = build_conversation_with_distractors(draft.get("conversation", []), draft.get("distractors_multiturn", []))
861
+ st.session_state["draft"] = draft
862
+ convwd_text = st.text_area(
863
+ "conversation_with_distractors (JSON)",
864
+ value=pretty_json(draft.get("conversation_with_distractors", [])),
865
+ height=200,
866
+ )
867
+ if st.button("Apply conversation_with_distractors JSON"):
868
+ draft["conversation_with_distractors"] = safe_json_loads(convwd_text, [])
869
+ st.session_state["draft"] = draft
870
 
871
+ st.markdown("#### Quick LLM assist")
872
+ c1, c2 = st.columns([1, 1])
873
  with c1:
874
+ if st.button("Generate draft with local LLM"):
875
+ out = generate_llm_distractor_draft(
876
+ draft,
877
+ base_url=st.session_state["llm_base_url"],
878
+ model=st.session_state["llm_model"],
879
+ mode=st.session_state["llm_mode"],
880
+ )
881
+ if out:
882
+ if st.session_state["llm_mode"] == "simple":
883
+ draft.setdefault("distractors", [])
884
+ draft["distractors"].append({
885
+ "bot_turn": out.get("bot_turn", ""),
886
+ "distractor": out.get("distractor", ""),
887
+ })
888
+ else:
889
+ draft.setdefault("distractors_multiturn", [])
890
+ draft["distractors_multiturn"].append({
891
+ "off_topic_subject": out.get("off_topic_subject", ""),
892
+ "tactic_used": out.get("tactic_used", ""),
893
+ "bot_turn": out.get("bot_turn", ""),
894
+ "turns_json": pretty_json(ensure_turns(out.get("turns", []))),
895
+ })
896
+ st.session_state["draft"] = draft
897
+ st.success("LLM draft inserted into the editor.")
898
+ st.rerun()
899
  with c2:
900
+ st.caption("This calls a local OpenAI-compatible server such as LM Studio.")
901
+
902
+ st.markdown("#### Save / submit")
903
+ if st.button("Save draft locally"):
904
+ draft["_annotator"] = st.session_state["annotator"]
905
+ draft["_updated_at"] = now_iso()
906
+ path = save_draft_local(st.session_state["annotator"], draft)
907
+ st.success(f"Draft saved: {path}")
908
+ if st.button("Submit current entry to HF repo"):
909
+ final_record = record_from_inputs(
910
+ domain=draft.get("domain", ""),
911
+ scenario=draft.get("scenario", ""),
912
+ system_instruction=draft.get("system_instruction", ""),
913
+ conversation=draft.get("conversation", []),
914
+ distractors=draft.get("distractors", []),
915
+ multiturn=draft.get("distractors_multiturn", []),
916
+ conversation_with_distractors=draft.get("conversation_with_distractors", []),
917
+ split=str(draft.get("split", "train")),
918
+ review_status=str(draft.get("_review_status", "draft")),
919
+ needs_review=bool(draft.get("_needs_human_review", True)),
920
+ source_split=st.session_state.get("draft_source_split", ""),
921
+ source_index=st.session_state.get("draft_source_idx"),
922
+ source_repo=st.session_state["source_repo"],
923
+ annotator=st.session_state["annotator"],
924
+ )
925
  try:
926
+ filename = upload_record_to_hf(st.session_state["annotation_repo"], final_record, st.session_state["annotator"])
927
+ append_submission_index({
928
+ "annotator": st.session_state["annotator"],
929
+ "uploaded_file": filename,
930
+ "split": final_record.get("split", ""),
931
+ "domain": final_record.get("domain", ""),
932
+ "scenario": final_record.get("scenario", ""),
933
+ "created_at": now_iso(),
934
+ })
935
+ save_draft_local(st.session_state["annotator"], draft)
936
+ st.success(f"Submitted to HF as {filename}")
937
  except Exception as e:
938
+ st.error(f"HF upload failed: {e}")
939
+ st.warning("The entry was not uploaded. Click 'Save draft locally' to keep it in the bucket.")
940
 
941
+ with right:
942
+ st.markdown("### Current draft preview")
943
+ st.json(st.session_state["draft"])
944
+ st.markdown("### Quick notes")
945
+ st.write("The output keeps the same top-level structure as the source file and adds provenance fields such as split, annotator, and source index.")
946
+ st.write("You can edit each cell directly in the tables, add rows dynamically, and clear whole sections with the buttons on the left.")
947
+
948
+ elif page == "Drafts":
949
+ st.subheader("Drafts and submissions")
950
+ draft = load_draft_local(st.session_state["annotator"])
951
+ c1, c2 = st.columns([1, 1])
952
+ with c1:
953
+ st.markdown("### Saved local draft")
954
+ if draft:
955
+ st.json(draft)
956
+ else:
957
+ st.info("No draft saved for this annotator yet.")
958
+ with c2:
959
+ st.markdown("### Submission index")
960
+ idx_file = DRAFT_DIR / "submissions_index.jsonl"
961
+ if idx_file.exists():
962
+ lines = idx_file.read_text(encoding="utf-8").splitlines()
963
+ rows = [json.loads(x) for x in lines if x.strip()]
964
+ st.dataframe(pd.DataFrame(rows), use_container_width=True, hide_index=True)
965
+ else:
966
+ st.info("No submissions recorded yet.")
967
 
968
  else:
969
+ st.subheader("Export / Sync")
970
+ st.write("Export current source + drafts as a merged JSONL or CSV, or inspect HF uploads.")
971
+ current_draft = st.session_state.get("draft", make_blank_draft())
972
 
973
+ # Build a merged dataset view from source records plus local draft if populated
974
+ merged = [dict(r) for r in records]
975
+ if current_draft and current_draft.get("domain") and current_draft.get("scenario"):
976
+ merged.append(record_to_exportable(current_draft))
977
+
978
+ c1, c2, c3 = st.columns(3)
979
  with c1:
980
+ if st.button("Write merged JSONL export"):
981
+ path = EXPORT_DIR / "merged_dataset.jsonl"
982
+ with path.open("w", encoding="utf-8") as f:
983
+ for r in merged:
984
+ f.write(json.dumps(r, ensure_ascii=False) + "\n")
985
+ st.success(f"Wrote {path}")
986
+ st.download_button("Download merged JSONL", data=path.read_text(encoding="utf-8"), file_name=path.name, mime="application/json")
987
  with c2:
988
+ if st.button("Write merged CSV export"):
989
+ path = EXPORT_DIR / "merged_dataset.csv"
990
+ pd.json_normalize(merged).to_csv(path, index=False)
991
+ st.success(f"Wrote {path}")
992
+ st.download_button("Download merged CSV", data=path.read_text(encoding="utf-8"), file_name=path.name, mime="text/csv")
993
+ with c3:
994
+ if st.button("Refresh HF file list"):
995
+ st.rerun()
996
+
997
+ st.markdown("### Uploaded files in annotation repo")
998
+ files = list_uploaded_files(st.session_state["annotation_repo"])
999
+ if files:
1000
+ st.dataframe(pd.DataFrame({"file": files}), use_container_width=True, hide_index=True)
1001
+ else:
1002
+ st.info("No repository files listed yet, or repo is not configured.")
1003
 
1004
+ st.markdown("### Repository settings to remember")
1005
  st.code(
1006
+ f"SOURCE_DATASET_REPO={st.session_state['source_repo']}\n"
1007
+ f"SOURCE_DATASET_SPLITS={st.session_state['source_splits']}\n"
1008
+ f"ANNOTATION_REPO_ID={st.session_state['annotation_repo']}\n"
1009
+ f"HF_TOKEN={'set' if DEFAULT_HF_TOKEN else 'missing'}",
1010
  language="text",
1011
  )
1012
 
hf-space/hf-space/hf-space/app.py CHANGED
@@ -543,11 +543,33 @@ def main() -> None:
543
  item = current_item_row()
544
  if item is None:
545
  st.info("Claim an item to start. The app keeps a per-annotator queue so multiple people can work in parallel.")
 
546
  q = queue_df().head(10)
547
  if not q.empty:
548
- display = q[["item_id", "sample_id", "domain", "scenario", "distractor_index"]].copy()
549
- display["preview"] = q["distractor_text"].map(preview_text)
550
  st.dataframe(display, use_container_width=True, hide_index=True)
 
551
  return
552
 
553
  st.markdown(
 
543
  item = current_item_row()
544
  if item is None:
545
  st.info("Claim an item to start. The app keeps a per-annotator queue so multiple people can work in parallel.")
546
+
547
  q = queue_df().head(10)
548
+
549
+ # DEBUG: inspect actual dataset schema
550
+ st.write("Dataset columns:", list(q.columns))
551
+
552
  if not q.empty:
553
+
554
+ # Only use columns that actually exist
555
+ available_cols = [
556
+ c for c in [
557
+ "item_id",
558
+ "sample_id",
559
+ "domain",
560
+ "scenario",
561
+ "distractor_index"
562
+ ]
563
+ if c in q.columns
564
+ ]
565
+
566
+ display = q[available_cols].copy()
567
+
568
+ if "distractor_text" in q.columns:
569
+ display["preview"] = q["distractor_text"].map(preview_text)
570
+
571
  st.dataframe(display, use_container_width=True, hide_index=True)
572
+
573
  return
574
 
575
  st.markdown(
hf-space/hf-space/hf-space/hf-space/README.md CHANGED
@@ -1,3 +1,12 @@
1
  # LLM Annotation Platform — Hugging Face native
2
 
3
  This version removes the external database layer.
 
1
+ ---
2
+ title: LLM Annotation Platform
3
+ emoji: 🧠
4
+ colorFrom: blue
5
+ colorTo: purple
6
+ sdk: docker
7
+ pinned: false
8
+ ---
9
+
10
  # LLM Annotation Platform — Hugging Face native
11
 
12
  This version removes the external database layer.
hf-space/hf-space/hf-space/hf-space/hf-space/.env.example ADDED
@@ -0,0 +1,7 @@
1
+ SOURCE_DATASET_REPO=nvidia/CantTalkAboutThis-Topic-Control-Dataset
2
+ SOURCE_DATASET_SPLIT=train
3
+ ANNOTATION_REPO_ID=YOUR_ORG/llm-distractor-annotations
4
+ HF_TOKEN=
5
+ CACHE_DIR=/data/hf_annotation_cache
6
+ DRAFT_DIR=/data/hf_annotation_drafts
7
+ EXPORT_DIR=/data/hf_annotation_exports
hf-space/hf-space/hf-space/hf-space/hf-space/.github/workflows/sync-to-hf.yml ADDED
@@ -0,0 +1,35 @@
1
+ name: Sync to Hugging Face Space
2
+
3
+ on:
4
+ push:
5
+ branches:
6
+ - main
7
+
8
+ jobs:
9
+ sync-to-hub:
10
+ runs-on: ubuntu-latest
11
+
12
+ steps:
13
+ - name: Checkout repository
14
+ uses: actions/checkout@v4
15
+ with:
16
+ lfs: true
17
+
18
+ - name: Push to Hugging Face
19
+ env:
20
+ HF_TOKEN: ${{ secrets.HF_TOKEN }}
21
+ run: |
22
+ git config --global user.email "github-actions@github.com"
23
+ git config --global user.name "GitHub Actions"
24
+
25
+ git clone https://user:$HF_TOKEN@huggingface.co/spaces/keepingLLMontrack/llm-annotation-platform hf-space
26
+
27
+ rsync -av --exclude '.git' --exclude 'hf-space' ./ hf-space/
28
+
29
+ cd hf-space
30
+
31
+ git add .
32
+
33
+ git commit -m "Sync from GitHub" || echo "No changes to commit"
34
+
35
+ git push
hf-space/hf-space/hf-space/hf-space/hf-space/.gitignore ADDED
@@ -0,0 +1,7 @@
1
+ __pycache__/
2
+ *.pyc
3
+ .streamlit/
4
+ data/
5
+ exports/
6
+ .env
7
+ .DS_Store
hf-space/hf-space/hf-space/hf-space/hf-space/Dockerfile ADDED
@@ -0,0 +1,11 @@
1
+ FROM python:3.11-slim
2
+
3
+ WORKDIR /app
4
+
5
+ COPY . /app
6
+
7
+ RUN pip install --no-cache-dir -r requirements.txt
8
+
9
+ EXPOSE 7860
10
+
11
+ CMD ["streamlit", "run", "app.py", "--server.port", "7860", "--server.address", "0.0.0.0"]
hf-space/hf-space/hf-space/hf-space/hf-space/README.md CHANGED
@@ -1,10 +1,84 @@
1
- ---
2
- title: Llm Annotation Platform
3
- emoji: 🦀
4
- colorFrom: indigo
5
- colorTo: blue
6
- sdk: docker
7
- pinned: false
8
- ---
9
-
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
+ # LLM Annotation Platform — Hugging Face native
2
+
3
+ This version removes the external database layer.
4
+
5
+ ## What it uses
6
+
7
+ - **Hugging Face Space** for the Streamlit app
8
+ - **Hugging Face dataset repo** for the canonical annotation store
9
+ - **Hugging Face Storage Bucket** only for persistent local cache / drafts in the Space
10
+ - **No Supabase**
11
+ - **No separate backend platform**
12
+
13
+ Hugging Face Spaces provide ephemeral disk by default, and Hugging Face recommends attaching Storage Buckets to persist data across restarts. Buckets are mounted into the Space container as local volumes.
14
+
15
+ ## Repository structure
16
+
17
+ ```text
18
+ app.py
19
+ scripts/seed.py
20
+ requirements.txt
21
+ README.md
22
+ ```
23
+
24
+ ## Behavior
25
+
26
+ Each annotation is written as its own JSON file into the dataset repository:
27
+ ```text
28
+ annotations/<annotator>/<timestamp>_<item_id>_<uuid>.json
29
+ ```
30
+
31
+ That design avoids write conflicts between annotators because each submission is a new file, not an overwrite of a shared database row. Repository files on the Hub are versioned, and the Hub supports uploading files to dataset repositories.
32
+
33
+ ## Local run
34
+
35
+ ```bash
36
+ pip install -r requirements.txt
37
+ streamlit run app.py
38
+ ```
39
+
40
+ ## How to set it up on Hugging Face
41
+
42
+ ### 1. Create two dataset repositories
43
+
44
+ Create:
45
+ - one dataset repo for the **source / seed data**
46
+ - one dataset repo for the **annotations**
47
+
48
+ Hugging Face dataset repositories are created from the Hub UI, and dataset files plus revision history are stored in the repository.
49
+
50
+ ### 2. Create a Space
51
+
52
+ Create a **Docker** Space and connect it to your GitHub repository. Spaces host apps directly on the Hub; this repository declares `sdk: docker` in the README front matter and starts Streamlit from the `Dockerfile`.
53
+
54
+ ### 3. Attach a Storage Bucket
55
+
56
+ Attach a Storage Bucket to the Space and mount it at `/data`.
57
+
58
+ This is the only stateful storage used by the app. It stores drafts and cache files and survives restarts. Hugging Face documents Storage Buckets as the recommended persistence mechanism for Spaces.
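
A minimal sketch of how the working directories could be resolved so the same code runs in the Space (bucket mounted at `/data`) and on a machine where `/data` is not writable. The helper name `resolve_dir` and the temp-directory fallback are assumptions for illustration and are not part of this repository; the `DRAFT_DIR` variable name and its default come from `.env.example`.

```python
import os
import tempfile
from pathlib import Path


def resolve_dir(env_name: str, default: str) -> Path:
    """Return a writable directory: env var first, then the default path.

    Hypothetical helper. If neither location can be created or written to
    (for example /data outside the Space), fall back to the system temp dir.
    """
    for candidate in (os.environ.get(env_name), default):
        if not candidate:
            continue
        path = Path(candidate)
        try:
            path.mkdir(parents=True, exist_ok=True)
        except OSError:
            continue  # cannot create it here; try the next candidate
        if os.access(path, os.W_OK):
            return path
    fallback = Path(tempfile.gettempdir()) / Path(default).name
    fallback.mkdir(parents=True, exist_ok=True)
    return fallback


if __name__ == "__main__":
    # In the Space this would normally print /data/hf_annotation_drafts.
    print(resolve_dir("DRAFT_DIR", "/data/hf_annotation_drafts"))
```

With a fallback like this, importing the app locally would not fail just because the bucket mount is absent; drafts written to the fallback location are of course not persistent.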
59
+
60
+ ### 4. Add secrets
61
+
62
+ In the Space settings, add:
63
+ - `HF_TOKEN` — a Hugging Face token with **write** permission
64
+ - `SOURCE_DATASET_REPO`
65
+ - `SOURCE_DATASET_SPLIT`
66
+ - `ANNOTATION_REPO_ID`
67
+
68
+ Hugging Face recommends using Space secrets or environment variables instead of hard-coding sensitive values. A write token is required to create repositories or push content to the Hub.
69
+
70
+ ### 5. Deploy
71
+
72
+ Commit the repo to GitHub. Once the Space is linked, it will build from the repository, and the app can upload annotation files to the dataset repo using the Hub API. Hugging Face’s Hub client supports `upload_file()` and `create_commit()` for repository writes.
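
A hedged sketch of the one-file-per-submission upload described above. The function name `upload_annotation` and the `item_id` record field are assumptions for illustration. The `api` argument is expected to behave like `huggingface_hub.HfApi`, and only its `upload_file` method is used, which keeps the sketch runnable without network access.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict


def _safe(name: str, fallback: str) -> str:
    """Keep only characters that are unproblematic in a repo path."""
    cleaned = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in name.strip())
    return cleaned or fallback


def upload_annotation(api: Any, repo_id: str, annotator: str, record: Dict[str, Any]) -> str:
    """Upload one annotation as a new JSON file and return its path in the repo."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    item = _safe(str(record.get("item_id", "")), "item")
    who = _safe(annotator.lower(), "annotator")
    # A random suffix keeps paths distinct even within the same second.
    path_in_repo = f"annotations/{who}/{stamp}_{item}_{uuid.uuid4().hex[:8]}.json"
    payload = json.dumps(record, ensure_ascii=False, indent=2).encode("utf-8")
    api.upload_file(
        path_or_fileobj=payload,  # HfApi.upload_file accepts raw bytes
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        repo_type="dataset",
        commit_message=f"Add annotation for {item} by {who}",
    )
    return path_in_repo
```

In the Space, `api` would typically be `HfApi(token=os.environ["HF_TOKEN"])`. Because every call writes a fresh path, two annotators submitting the same item never overwrite each other's file.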
73
+
74
+ ## Suggested workflow for your group
75
+
76
+ - each person uses a stable annotator name
77
+ - each submission creates a new JSON file in the annotation repo
78
+ - the Review page shows items with 2+ annotations
79
+ - the Dashboard shows per-annotator and per-domain progress
80
+ - exports are generated from the merged source + annotation view
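
As an illustration of the "2+ annotations" rule in the workflow above, the selection could be sketched as follows. This is an assumption-laden example, not the app's code: the field names `item_id` and `annotator` are hypothetical, and the sketch counts distinct annotators rather than raw submissions.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def items_ready_for_review(
    annotations: Iterable[Dict[str, Any]], min_annotators: int = 2
) -> List[str]:
    """Return item_ids annotated by at least `min_annotators` distinct people."""
    annotators_by_item = defaultdict(set)
    for ann in annotations:
        item_id = ann.get("item_id")
        annotator = ann.get("annotator")
        if item_id and annotator:
            annotators_by_item[item_id].add(annotator)
    return sorted(
        item_id
        for item_id, names in annotators_by_item.items()
        if len(names) >= min_annotators
    )
```

Counting distinct annotators means a single person resubmitting the same item does not push it into review on their own.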
81
+
82
+ ## Why this is a good fit
83
+
84
+ The original source dataset can still be loaded with `datasets.load_dataset(...)`, and the Hugging Face ecosystem is designed for pushing and versioning datasets directly on the Hub. The `datasets` library also provides a `push_to_hub()` path for dataset publishing, while `huggingface_hub` provides lower-level file upload methods when you want more control over file layout.
hf-space/hf-space/hf-space/hf-space/hf-space/app.py ADDED
@@ -0,0 +1,853 @@
1
+ from __future__ import annotations
2
+
3
+ import json
4
+ import os
5
+ import uuid
6
+ from datetime import datetime, timezone
7
+ from pathlib import Path
8
+ from typing import Any, Dict, List, Optional, Tuple
9
+
10
+ import pandas as pd
11
+ import streamlit as st
12
+ from datasets import load_dataset
13
+ from huggingface_hub import HfApi, hf_hub_download
14
+
15
+ APP_TITLE = "🧭 LLM Annotation Platform"
16
+ DEFAULT_SOURCE_DATASET = os.environ.get(
17
+ "SOURCE_DATASET_REPO",
18
+ "nvidia/CantTalkAboutThis-Topic-Control-Dataset",
19
+ )
20
+ DEFAULT_SOURCE_SPLIT = os.environ.get("SOURCE_DATASET_SPLIT", "train")
21
+ DEFAULT_ANNOTATION_REPO = os.environ.get(
22
+ "ANNOTATION_REPO_ID",
23
+ "YOUR_ORG/llm-distractor-annotations",
24
+ )
25
+ DEFAULT_CACHE_DIR = Path(os.environ.get("CACHE_DIR", "/data/hf_annotation_cache"))
26
+ DEFAULT_CACHE_DIR.mkdir(parents=True, exist_ok=True)
27
+ LOCAL_DRAFT_DIR = Path(os.environ.get("DRAFT_DIR", "/data/hf_annotation_drafts"))
28
+ LOCAL_DRAFT_DIR.mkdir(parents=True, exist_ok=True)
29
+ LOCAL_EXPORT_DIR = Path(os.environ.get("EXPORT_DIR", "/data/hf_annotation_exports"))
30
+ LOCAL_EXPORT_DIR.mkdir(parents=True, exist_ok=True)
31
+
32
+ LABEL_OPTIONS = {
33
+ "distractor_kind": [
34
+ "benign off-topic",
35
+ "smooth bridge",
36
+ "policy-evasive",
37
+ "roleplay / impersonation",
38
+ "pressure / persistence",
39
+ "urgency / time pressure",
40
+ "loophole seeking",
41
+ "multi-turn escalation",
42
+ "other",
43
+ ],
44
+ "assistant_behavior": [
45
+ "perfect refusal + redirect",
46
+ "helpful redirection",
47
+ "partial engagement",
48
+ "full engagement / derailment",
49
+ "over-refusal",
50
+ "unclear",
51
+ ],
52
+ "transition_style": [
53
+ "abrupt",
54
+ "smooth bridge",
55
+ "follow-up clarification",
56
+ "rephrasing",
57
+ "escalation",
58
+ "roleplay",
59
+ "ambiguity exploitation",
60
+ "other",
61
+ ],
62
+ "policy_target": [
63
+ "medical advice",
64
+ "financial advice",
65
+ "legal advice",
66
+ "competitor discussion",
67
+ "politics",
68
+ "unsafe content",
69
+ "personal data",
70
+ "company-specific policy",
71
+ "tone / style policy",
72
+ "other",
73
+ ],
74
+ }
75
+
76
+
77
+ def now_iso() -> str:
78
+ return datetime.now(timezone.utc).isoformat()
79
+
80
+
81
+ def token() -> Optional[str]:
82
+ return os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACE_HUB_TOKEN")
83
+
84
+
85
+ def api() -> HfApi:
86
+ return HfApi(token=token())
87
+
88
+
89
+ def annotation_file_name(item_id: str, annotator: str) -> str:
90
+ safe_annotator = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in annotator.strip().lower()) or "annotator"
91
+ safe_item = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in item_id.strip()) or "item"
92
+ stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
93
+ return f"annotations/{safe_annotator}/{stamp}_{safe_item}_{uuid.uuid4().hex[:8]}.json"
94
+
95
+
96
+ def draft_path(annotator: str) -> Path:
97
+ safe_annotator = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in annotator.strip().lower()) or "annotator"
98
+ return LOCAL_DRAFT_DIR / f"{safe_annotator}.json"
99
+
100
+
101
+ def cache_annotations_dir() -> Path:
102
+ path = DEFAULT_CACHE_DIR / "annotations_snapshot"
103
+ path.mkdir(parents=True, exist_ok=True)
104
+ return path
105
+
106
+
107
+ def ensure_repo_exists(repo_id: str) -> None:
108
+ if repo_id.startswith("YOUR_ORG/") or not repo_id.strip():
109
+ return
110
+ api().create_repo(repo_id=repo_id, repo_type="dataset", private=True, exist_ok=True)
111
+
112
+
113
+ def load_source_dataset(repo_id: str, split: str) -> List[Dict[str, Any]]:
114
+ ds = load_dataset(repo_id, split=split)
115
+ return [dict(row) for row in ds]
116
+
117
+
118
+ def normalize_turns(turns: Any) -> List[Dict[str, Any]]:
119
+ if turns is None:
120
+ return []
121
+ if isinstance(turns, str):
122
+ try:
123
+ turns = json.loads(turns)
124
+ except Exception:
125
+ return []
126
+ if not isinstance(turns, list):
127
+ return []
128
+ out = []
129
+ for turn in turns:
130
+ if isinstance(turn, dict):
131
+ role = turn.get("role") or turn.get("speaker") or turn.get("type") or "unknown"
132
+ content = turn.get("content") or turn.get("text") or turn.get("utterance") or ""
133
+ out.append({"role": str(role), "content": str(content)})
134
+ else:
135
+ out.append({"role": "unknown", "content": str(turn)})
136
+ return out
137
+
138
+
139
+ def safe_sample_id(record: Dict[str, Any], fallback_index: int) -> str:
140
+ for key in ("sample_id", "id", "_id", "row_id"):
141
+ if record.get(key) not in (None, ""):
142
+ return str(record[key])
143
+ domain = str(record.get("domain", "sample")).replace(" ", "_")
144
+ scenario = str(record.get("scenario", "")).replace(" ", "_")
145
+ return f"{domain}-{scenario}-{fallback_index}"
146
+
147
+
148
+ def expand_record(record: Dict[str, Any], idx: int) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
149
+ sample_id = safe_sample_id(record, idx)
150
+ conversation = normalize_turns(record.get("conversation"))
151
+ distractors = record.get("distractors") or []
152
+ if isinstance(distractors, str):
153
+ try:
154
+ distractors = json.loads(distractors)
155
+ except Exception:
156
+ distractors = []
157
+ if not isinstance(distractors, list):
158
+ distractors = []
159
+
160
+ sample = {
161
+ "sample_id": sample_id,
162
+ "domain": str(record.get("domain", "")),
163
+ "scenario": str(record.get("scenario", "")),
164
+ "system_instruction": str(record.get("system_instruction", "")),
165
+ "conversation_json": json.dumps(conversation, ensure_ascii=False),
166
+ "distractors_json": json.dumps(distractors, ensure_ascii=False),
167
+ "conversation_with_distractors_json": json.dumps(record.get("conversation_with_distractors", []), ensure_ascii=False),
168
+ "raw_json": json.dumps(record, ensure_ascii=False),
169
+ }
170
+
171
+ items = []
172
+ for distractor_index, d in enumerate(distractors):
173
+ bot_turn = ""
174
+ distractor_text = ""
175
+ if isinstance(d, dict):
176
+ bot_turn = str(
177
+ d.get("bot turn")
178
+ or d.get("bot_turn")
179
+ or d.get("assistant_turn")
180
+ or d.get("assistant")
181
+ or ""
182
+ )
183
+ distractor_text = str(
184
+ d.get("distractor")
185
+ or d.get("distractor user turn")
186
+ or d.get("user_turn")
187
+ or d.get("user")
188
+ or d.get("text")
189
+ or ""
190
+ )
191
+ else:
192
+ distractor_text = str(d)
193
+
194
+ items.append(
195
+ {
196
+ "item_id": f"{sample_id}::{distractor_index}",
197
+ "sample_id": sample_id,
198
+ "distractor_index": distractor_index,
199
+ "bot_turn": bot_turn,
200
+ "distractor_text": distractor_text,
201
+ }
202
+ )
203
+ return sample, items
204
+
205
+
206
+ def seed_source_index(records: List[Dict[str, Any]]) -> Tuple[pd.DataFrame, pd.DataFrame]:
207
+ samples = []
208
+ items = []
209
+ for idx, record in enumerate(records):
210
+ sample, record_items = expand_record(record, idx)
211
+ samples.append(sample)
212
+ items.extend(record_items)
213
+ return pd.DataFrame(samples), pd.DataFrame(items)
214
+
215
+
216
+ def read_json_file(path: Path) -> Dict[str, Any]:
217
+ with path.open("r", encoding="utf-8") as f:
218
+ return json.load(f)
219
+
220
+
221
+ def load_all_hub_annotations(annotation_repo_id: str) -> pd.DataFrame:
222
+ """
223
+ Each submission is stored as a separate JSON file, which avoids write conflicts.
224
+ """
225
+ if annotation_repo_id.startswith("YOUR_ORG/") or not annotation_repo_id.strip():
226
+ return pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
227
+
228
+ cache_dir = cache_annotations_dir()
229
+ file_list = api().list_repo_files(annotation_repo_id, repo_type="dataset")
230
+ ann_files = [f for f in file_list if f.startswith("annotations/") and f.endswith(".json")]
231
+
232
+ rows = []
233
+ for file_path in ann_files:
234
+ try:
235
+ local_path = hf_hub_download(
236
+ repo_id=annotation_repo_id,
237
+ repo_type="dataset",
238
+ filename=file_path,
239
+ token=token(),
240
+ local_dir=str(cache_dir),
241
+ local_dir_use_symlinks=False,
242
+ )
243
+ payload = read_json_file(Path(local_path))
244
+ rows.append(
245
+ {
246
+ "item_id": payload.get("item_id", ""),
247
+ "sample_id": payload.get("sample_id", ""),
248
+ "annotator": payload.get("annotator", ""),
249
+ "labels": payload.get("labels", {}),
250
+ "notes": payload.get("notes", ""),
251
+ "status": payload.get("status", "submitted"),
252
+ "created_at": payload.get("created_at", ""),
253
+ "file_path": file_path,
254
+ }
255
+ )
256
+ except Exception as e:
257
+ rows.append(
258
+ {
259
+ "item_id": "",
260
+ "sample_id": "",
261
+ "annotator": "",
262
+ "labels": {},
263
+ "notes": f"Failed to load {file_path}: {e}",
264
+ "status": "load_error",
265
+ "created_at": "",
266
+ "file_path": file_path,
267
+ }
268
+ )
269
+
270
+ return pd.DataFrame(rows) if rows else pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
271
+
272
+
273
+ def save_draft(annotator: str, payload: Dict[str, Any]) -> Path:
274
+ path = draft_path(annotator)
275
+ path.parent.mkdir(parents=True, exist_ok=True)
276
+ with path.open("w", encoding="utf-8") as f:
277
+ json.dump(payload, f, ensure_ascii=False, indent=2)
278
+ return path
279
+
280
+
281
+ def load_draft(annotator: str) -> Dict[str, Any]:
282
+ path = draft_path(annotator)
283
+ if not path.exists():
284
+ return {}
285
+ try:
286
+ return read_json_file(path)
287
+ except Exception:
288
+ return {}
289
+
290
+
291
+ def build_labels_from_state(prefix: str = "") -> Dict[str, Any]:
292
+ return {
293
+ "distractor_kind": st.session_state.get(f"{prefix}distractor_kind", LABEL_OPTIONS["distractor_kind"][0]),
294
+ "transition_style": st.session_state.get(f"{prefix}transition_style", LABEL_OPTIONS["transition_style"][0]),
295
+ "policy_target": st.session_state.get(f"{prefix}policy_target", []),
296
+ "difficulty": int(st.session_state.get(f"{prefix}difficulty", 3)),
297
+ "realism": int(st.session_state.get(f"{prefix}realism", 3)),
298
+ "assistant_behavior": st.session_state.get(f"{prefix}assistant_behavior", LABEL_OPTIONS["assistant_behavior"][0]),
299
+ "multi_turn_escalation": bool(st.session_state.get(f"{prefix}multi_turn_escalation", False)),
300
+ "rule_followed": bool(st.session_state.get(f"{prefix}rule_followed", True)),
301
+ "needs_review": bool(st.session_state.get(f"{prefix}needs_review", False)),
302
+ "confidence": int(st.session_state.get(f"{prefix}confidence", 3)),
303
+ }
304
+
305
+
306
+ def preview_text(text: str, limit: int = 280) -> str:
307
+ txt = (text or "").strip().replace("\n", " ")
308
+ if len(txt) <= limit:
309
+ return txt
310
+ return txt[:limit - 1] + "…"
311
+
312
+
313
+ def render_turns(turns: List[Dict[str, Any]]) -> None:
314
+ if not turns:
315
+ st.info("No conversation turns found.")
316
+ return
317
+ for i, turn in enumerate(turns, 1):
318
+ role = str(turn.get("role", "unknown")).lower()
319
+ content = str(turn.get("content", "")).strip()
320
+ css_cls = "user" if role == "user" else "assistant" if role in {"assistant", "bot"} else "system"
321
+ st.markdown(
322
+ f"""
323
+ <div class="turn {css_cls}">
324
+ <span class="badge">{role.upper()}</span>
325
+ <span class="smallmono">Turn {i}</span>
326
+ <div style="margin-top:0.35rem; white-space:pre-wrap;">{content.replace(chr(10), '<br>')}</div>
327
+ </div>
328
+ """,
329
+ unsafe_allow_html=True,
330
+ )
331
+
332
+
333
+ def annotation_exists_for_item(df_anns: pd.DataFrame, item_id: str, annotator: str) -> bool:
334
+ if df_anns.empty:
335
+ return False
336
+ sub = df_anns[(df_anns["item_id"] == item_id) & (df_anns["annotator"] == annotator)]
337
+ return not sub.empty
338
+
339
+
340
+ def compute_agreement(df_anns: pd.DataFrame, label_key: str = "assistant_behavior") -> Dict[str, Any]:
341
+ if df_anns.empty:
342
+ return {"paired_items": 0, "raw_agreement": None, "cohen_kappa": None}
343
+
344
+ rows = []
345
+ for _, r in df_anns.iterrows():
346
+ labels = r.get("labels", {}) or {}
347
+ rows.append({"item_id": r["item_id"], "annotator": r["annotator"], label_key: labels.get(label_key)})
348
+ tmp = pd.DataFrame(rows)
349
+ pivot = tmp.pivot_table(index="item_id", columns="annotator", values=label_key, aggfunc="first")
350
+ pivot = pivot.dropna(axis=0, how="any")
351
+ if pivot.shape[0] < 2 or pivot.shape[1] < 2:
352
+ return {"paired_items": int(pivot.shape[0]), "raw_agreement": None, "cohen_kappa": None}
353
+
354
+ from sklearn.metrics import cohen_kappa_score
355
+
356
+ a = pivot.iloc[:, 0].astype(str)
357
+ b = pivot.iloc[:, 1].astype(str)
358
+ return {
359
+ "paired_items": int(pivot.shape[0]),
360
+ "raw_agreement": float((a == b).mean()),
361
+ "cohen_kappa": float(cohen_kappa_score(a, b)),
362
+ }
363
+
364
+
365
+ def push_annotation_to_hub(annotation_repo_id: str, payload: Dict[str, Any]) -> str:
366
+ ensure_repo_exists(annotation_repo_id)
367
+ file_rel_path = annotation_file_name(payload["item_id"], payload["annotator"])
368
+ local_path = LOCAL_DRAFT_DIR / file_rel_path.replace("/", "__")
369
+ local_path.parent.mkdir(parents=True, exist_ok=True)
370
+ with local_path.open("w", encoding="utf-8") as f:
371
+ json.dump(payload, f, ensure_ascii=False, indent=2)
372
+
373
+ api().upload_file(
374
+ path_or_fileobj=str(local_path),
375
+ path_in_repo=file_rel_path,
376
+ repo_id=annotation_repo_id,
377
+ repo_type="dataset",
378
+ token=token(),
379
+ commit_message=f"Add annotation for {payload['item_id']} by {payload['annotator']}",
380
+ )
381
+ return file_rel_path
382
+
383
+
384
+ def get_current_item_id() -> Optional[str]:
385
+ return st.session_state.get("current_item_id")
386
+
387
+
388
+ def set_current_item_id(item_id: Optional[str]) -> None:
389
+ st.session_state["current_item_id"] = item_id
390
+ try:
391
+ st.query_params["item_id"] = item_id or ""
392
+ except Exception:
393
+ pass
394
+
395
+
396
+ def main() -> None:
397
+ st.set_page_config(page_title="LLM Annotation Platform", page_icon="🧭", layout="wide")
398
+ st.markdown(
399
+ """
400
+ <style>
401
+ .block-container {padding-top: 1rem; padding-bottom: 2rem;}
402
+ .smallmono {font-size: 0.84rem; font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", monospace;}
403
+ .cardbox {
404
+ border: 1px solid rgba(120,120,120,0.22);
405
+ border-radius: 18px;
406
+ padding: 1rem 1rem 0.75rem 1rem;
407
+ background: rgba(255,255,255,0.03);
408
+ }
409
+ .turn {
410
+ border-left: 4px solid rgba(120,120,120,0.45);
411
+ padding: 0.6rem 0.85rem;
412
+ margin: 0.55rem 0;
413
+ border-radius: 0.6rem;
414
+ background: rgba(128,128,128,0.06);
415
+ }
416
+ .turn.user {border-left-color: #8b5cf6;}
417
+ .turn.assistant, .turn.bot {border-left-color: #06b6d4;}
418
+ .turn.system {border-left-color: #f59e0b;}
419
+ .badge {
420
+ display:inline-block; padding:0.18rem 0.5rem; border-radius: 999px;
421
+ background: rgba(120,120,120,0.16); margin-right: 0.35rem; font-size: 0.78rem;
422
+ }
423
+ hr {margin: 0.7rem 0 0.9rem 0;}
424
+ </style>
425
+ """,
426
+ unsafe_allow_html=True,
427
+ )
428
+
429
+ st.title(APP_TITLE)
430
+ st.caption("A Hugging Face–native annotation tool for multi-turn distractors, inter-rater review, and dataset versioning.")
431
+
432
+ if "annotator" not in st.session_state:
433
+ st.session_state["annotator"] = "annotator_1"
434
+ if "current_item_id" not in st.session_state:
435
+ st.session_state["current_item_id"] = None
436
+ if "source_records" not in st.session_state:
437
+ st.session_state["source_records"] = None
438
+ if "source_index" not in st.session_state:
439
+ st.session_state["source_index"] = None
440
+ if "annotations_df" not in st.session_state:
441
+ st.session_state["annotations_df"] = None
442
+ if "draft_loaded" not in st.session_state:
443
+ st.session_state["draft_loaded"] = False
444
+
445
+ with st.sidebar:
446
+ st.header("Workspace")
447
+ annotator = st.text_input("Annotator name", value=st.session_state["annotator"])
448
+ st.session_state["annotator"] = annotator.strip() or "annotator_1"
449
+
450
+ source_repo = st.text_input("Source dataset repo", value=DEFAULT_SOURCE_DATASET)
451
+ source_split = st.text_input("Source split", value=DEFAULT_SOURCE_SPLIT)
452
+ annotation_repo = st.text_input("Annotation dataset repo", value=DEFAULT_ANNOTATION_REPO)
453
+
454
+ st.divider()
455
+ st.caption("HF token is needed for uploads, repo creation, and reading private repos.")
456
+ st.write("HF token present:", "yes" if token() else "no")
457
+ st.write("Cache:", str(DEFAULT_CACHE_DIR))
458
+ st.write("Drafts:", str(LOCAL_DRAFT_DIR))
459
+
460
+ if st.button("Reload Hub data", use_container_width=True):
461
+ st.session_state["source_records"] = None
462
+ st.session_state["source_index"] = None
463
+ st.session_state["annotations_df"] = None
464
+ st.rerun()
465
+
466
+ page = st.radio("Page", ["Annotate", "Review", "Dashboard", "Export"], index=0)
467
+
468
+ if st.session_state["source_records"] is None:
469
+ with st.spinner("Loading source dataset from the Hub..."):
470
+ source_records = load_source_dataset(source_repo, source_split)
471
+ samples_df, items_df = seed_source_index(source_records)
472
+ st.session_state["source_records"] = source_records
473
+ st.session_state["source_index"] = {"samples_df": samples_df, "items_df": items_df}
474
+
475
+ if st.session_state["annotations_df"] is None:
476
+ with st.spinner("Loading annotations from the annotation dataset repo..."):
477
+ try:
478
+ anns_df = load_all_hub_annotations(annotation_repo)
479
+ except Exception as e:
480
+ anns_df = pd.DataFrame(columns=["item_id", "sample_id", "annotator", "labels", "notes", "status", "created_at", "file_path"])
481
+ st.warning(f"Could not load annotations from Hub yet: {e}")
482
+ st.session_state["annotations_df"] = anns_df
483
+
484
+ samples_df = st.session_state["source_index"]["samples_df"]
485
+ items_df = st.session_state["source_index"]["items_df"]
486
+ anns_df = st.session_state["annotations_df"]
487
+
488
+ if not st.session_state["draft_loaded"]:
489
+ try:
490
+ q_item = st.query_params.get("item_id")
491
+ except Exception:
492
+ q_item = None
493
+ if q_item:
494
+ st.session_state["current_item_id"] = q_item
495
+ draft = load_draft(st.session_state["annotator"])
496
+ if draft.get("current_item_id") and not st.session_state["current_item_id"]:
497
+ st.session_state["current_item_id"] = draft["current_item_id"]
498
+ st.session_state["draft_loaded"] = True
499
+
500
+ my_annotated_item_ids = set(
501
+ anns_df.loc[anns_df["annotator"] == st.session_state["annotator"], "item_id"].dropna().astype(str).tolist()
502
+ ) if not anns_df.empty else set()
503
+
504
+ def current_item_row() -> Optional[Dict[str, Any]]:
505
+ item_id = get_current_item_id()
506
+ if not item_id:
507
+ return None
508
+ match = items_df[items_df["item_id"] == item_id]
509
+ if match.empty:
510
+ return None
511
+ row = match.iloc[0].to_dict()
512
+ sample = samples_df[samples_df["sample_id"] == row["sample_id"]]
513
+ if not sample.empty:
514
+ row.update(sample.iloc[0].to_dict())
515
+ return row
516
+
517
+ def queue_df() -> pd.DataFrame:
518
+ return items_df[~items_df["item_id"].astype(str).isin(my_annotated_item_ids)].copy()
519
+
520
+ if page == "Annotate":
521
+ st.subheader("Annotate a distractor item")
522
+ left, right = st.columns([1.05, 0.95], gap="large")
523
+
524
+ with left:
525
+ top_a, top_b, top_c = st.columns([1, 1, 1])
526
+ with top_a:
527
+ if st.button("Claim next item", use_container_width=True):
528
+ q = queue_df()
529
+ if q.empty:
530
+ st.warning("No remaining items in your queue.")
531
+ else:
532
+ set_current_item_id(q.iloc[0]["item_id"])
533
+ st.rerun()
534
+ with top_b:
535
+ if st.button("Reload annotations from Hub", use_container_width=True):
536
+ st.session_state["annotations_df"] = load_all_hub_annotations(annotation_repo)
537
+ st.rerun()
538
+ with top_c:
539
+ if st.button("Clear current", use_container_width=True):
540
+ set_current_item_id(None)
541
+ st.rerun()
542
+
543
+ item = current_item_row()
544
+ if item is None:
545
+ st.info("Claim an item to start. The app keeps a per-annotator queue so multiple people can work in parallel.")
546
+ q = queue_df().head(10)
547
+ if not q.empty:
548
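+ q_ctx = q.merge(samples_df[["sample_id", "domain", "scenario"]].drop_duplicates("sample_id"), on="sample_id", how="left")  # items_df has no domain/scenario columns
549
+ display = q_ctx[["item_id", "sample_id", "domain", "scenario", "distractor_index"]].assign(preview=q_ctx["distractor_text"].map(preview_text))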
550
+ st.dataframe(display, use_container_width=True, hide_index=True)
551
+ return
552
+
553
+ st.markdown(
554
+ f"""
555
+ <div class="cardbox">
556
+ <div><span class="badge">Domain</span> {item.get("domain", "")}</div>
557
+ <div style="margin-top:0.35rem;"><span class="badge">Scenario</span> {item.get("scenario", "")}</div>
558
+ <div style="margin-top:0.35rem;"><span class="badge">Sample</span> <span class="smallmono">{item.get("sample_id", "")}</span></div>
559
+ <div style="margin-top:0.35rem;"><span class="badge">Item</span> <span class="smallmono">{item.get("item_id", "")}</span></div>
560
+ </div>
561
+ """,
562
+ unsafe_allow_html=True,
563
+ )
564
+ st.divider()
565
+
566
+ tabs = st.tabs(["Context", "Distractor", "Existing annotations"])
567
+ with tabs[0]:
568
+ st.markdown("**System instruction**")
569
+ st.code(item.get("system_instruction", ""), language="text")
570
+ st.markdown("**Conversation**")
571
+ render_turns(json.loads(item.get("conversation_json", "[]")))
572
+ with tabs[1]:
573
+ st.markdown("**Previous assistant turn**")
574
+ st.code(item.get("bot_turn", "") or "(missing)", language="text")
575
+ st.markdown("**Distractor user turn**")
576
+ st.code(item.get("distractor_text", "") or "(missing)", language="text")
577
+ with tabs[2]:
578
+ existing = anns_df[anns_df["item_id"] == item["item_id"]].copy()
579
+ if existing.empty:
580
+ st.caption("No annotations yet.")
581
+ else:
582
+ for _, row in existing.iterrows():
583
+ st.write(f"**{row['annotator']}** · {row['status']} · {row['created_at']}")
584
+ st.json(row["labels"])
585
+ if row.get("notes"):
586
+ st.caption(row["notes"])
587
+ st.divider()
588
+
589
+ with right:
590
+ st.markdown("### Annotation form")
591
+ current_draft = load_draft(st.session_state["annotator"])
592
+ draft_labels = current_draft.get("labels", {}) if current_draft else {}
593
+
594
+ with st.form("annotation_form", clear_on_submit=False):
595
+ st.selectbox(
596
+ "Distractor kind",
597
+ LABEL_OPTIONS["distractor_kind"],
598
+ index=LABEL_OPTIONS["distractor_kind"].index(draft_labels.get("distractor_kind", LABEL_OPTIONS["distractor_kind"][0]))
599
+ if draft_labels.get("distractor_kind") in LABEL_OPTIONS["distractor_kind"]
600
+ else 0,
601
+ key="distractor_kind",
602
+ )
603
+ st.selectbox(
604
+ "Transition style",
605
+ LABEL_OPTIONS["transition_style"],
606
+ index=LABEL_OPTIONS["transition_style"].index(draft_labels.get("transition_style", LABEL_OPTIONS["transition_style"][0]))
607
+ if draft_labels.get("transition_style") in LABEL_OPTIONS["transition_style"]
608
+ else 0,
609
+ key="transition_style",
610
+ )
611
+ st.multiselect(
612
+ "Policy target(s)",
613
+ LABEL_OPTIONS["policy_target"],
614
+ default=draft_labels.get("policy_target", []),
615
+ key="policy_target",
616
+ )
617
+ c1, c2 = st.columns(2)
618
+ with c1:
619
+ st.slider("Difficulty", 1, 5, value=int(draft_labels.get("difficulty", 3)), key="difficulty")
620
+ st.slider("Realism", 1, 5, value=int(draft_labels.get("realism", 3)), key="realism")
621
+ with c2:
622
+ st.selectbox(
623
+ "Assistant behavior",
624
+ LABEL_OPTIONS["assistant_behavior"],
625
+ index=LABEL_OPTIONS["assistant_behavior"].index(draft_labels.get("assistant_behavior", LABEL_OPTIONS["assistant_behavior"][0]))
626
+ if draft_labels.get("assistant_behavior") in LABEL_OPTIONS["assistant_behavior"]
627
+ else 0,
628
+ key="assistant_behavior",
629
+ )
630
+ st.slider("Confidence", 1, 5, value=int(draft_labels.get("confidence", 3)), key="confidence")
631
+
632
+ st.checkbox(
633
+ "Multi-turn escalation / persistence",
634
+ value=bool(draft_labels.get("multi_turn_escalation", False)),
635
+ key="multi_turn_escalation",
636
+ )
637
+ st.checkbox(
638
+ "Assistant followed the rule",
639
+ value=bool(draft_labels.get("rule_followed", True)),
640
+ key="rule_followed",
641
+ )
642
+ st.checkbox(
643
+ "Borderline / needs review",
644
+ value=bool(draft_labels.get("needs_review", False)),
645
+ key="needs_review",
646
+ )
647
+ notes = st.text_area(
648
+ "Notes",
649
+ value=current_draft.get("notes", ""),
650
+ height=150,
651
+ placeholder="Explain ambiguity, likely disagreement, or policy edge cases.",
652
+ )
653
+ submitted = st.form_submit_button("Submit to Hugging Face", use_container_width=True)
654
+
655
+ c1, c2 = st.columns(2)
656
+ with c1:
657
+ if st.button("Save draft locally", use_container_width=True):
658
+ payload = {
659
+ "current_item_id": item["item_id"],
660
+ "labels": build_labels_from_state(),
661
+ "notes": notes,
662
+ "saved_at": now_iso(),
663
+ }
664
+ path = save_draft(st.session_state["annotator"], payload)
665
+ st.success(f"Draft saved to {path}")
666
+ with c2:
667
+ if st.button("Sync annotation cache", use_container_width=True):
668
+ st.session_state["annotations_df"] = load_all_hub_annotations(annotation_repo)
669
+ st.success("Reloaded annotation index from Hub.")
670
+
671
+ if submitted:
672
+ labels = build_labels_from_state()
673
+ payload = {
674
+ "annotation_id": str(uuid.uuid4()),
675
+ "item_id": item["item_id"],
676
+ "sample_id": item["sample_id"],
677
+ "annotator": st.session_state["annotator"],
678
+ "created_at": now_iso(),
679
+ "status": "submitted",
680
+ "labels": labels,
681
+ "notes": notes,
682
+ "source": {
683
+ "source_dataset_repo": source_repo,
684
+ "source_dataset_split": source_split,
685
+ "domain": item.get("domain", ""),
686
+ "scenario": item.get("scenario", ""),
687
+ "distractor_index": int(item.get("distractor_index", 0)),
688
+ },
689
+ }
690
+ try:
691
+ path_in_repo = push_annotation_to_hub(annotation_repo, payload)
692
+ st.session_state["annotations_df"] = pd.concat(
693
+ [
694
+ anns_df,
695
+ pd.DataFrame(
696
+ [
697
+ {
698
+ "item_id": payload["item_id"],
699
+ "sample_id": payload["sample_id"],
700
+ "annotator": payload["annotator"],
701
+ "labels": payload["labels"],
702
+ "notes": payload["notes"],
703
+ "status": payload["status"],
704
+ "created_at": payload["created_at"],
705
+ "file_path": path_in_repo,
706
+ }
707
+ ]
708
+ ),
709
+ ],
710
+ ignore_index=True,
711
+ )
712
+ save_draft(
713
+ st.session_state["annotator"],
714
+ {
715
+ "current_item_id": item["item_id"],
716
+ "labels": labels,
717
+ "notes": notes,
718
+ "saved_at": now_iso(),
719
+ },
720
+ )
721
+ st.success(f"Submitted to Hugging Face as {path_in_repo}")
722
+ q = queue_df()
723
+ if not q.empty:
724
+ set_current_item_id(q.iloc[0]["item_id"])
725
+ st.rerun()
726
+ except Exception as e:
727
+ st.error(f"Upload failed. Saved locally only. Error: {e}")
728
+ save_draft(
729
+ st.session_state["annotator"],
730
+ {
731
+ "current_item_id": item["item_id"],
732
+ "labels": labels,
733
+ "notes": notes,
734
+ "saved_at": now_iso(),
735
+ },
736
+ )
737
+
738
+ st.caption("Each submission is a separate file in the annotation dataset repo, so multiple annotators can work in parallel without write conflicts.")
739
+
740
+ elif page == "Review":
741
+ st.subheader("Inter-rater review")
742
+ multi = (
743
+ anns_df.groupby("item_id")["annotator"].nunique().reset_index(name="n_annotators")
744
+ if not anns_df.empty
745
+ else pd.DataFrame(columns=["item_id", "n_annotators"])
746
+ )
747
+ multi = multi[multi["n_annotators"] >= 2] if not multi.empty else multi
748
+
749
+ if multi.empty:
750
+ st.info("No items with at least two annotations yet.")
751
+ else:
752
+ selected_item = st.selectbox("Item with multiple annotations", multi["item_id"].tolist())
753
+ row = items_df[items_df["item_id"] == selected_item].iloc[0].to_dict()
754
+ sample = samples_df[samples_df["sample_id"] == row["sample_id"]].iloc[0].to_dict()
755
+ row.update(sample)
756
+
757
+ st.markdown("### Context")
758
+ st.code(row["system_instruction"], language="text")
759
+ st.code(row["bot_turn"] or "", language="text")
760
+ st.code(row["distractor_text"] or "", language="text")
761
+
762
+ st.markdown("### Annotations")
763
+ sub = anns_df[anns_df["item_id"] == selected_item].copy()
764
+ cols = st.columns(min(len(sub), 3)) if len(sub) > 0 else st.columns(1)
765
+ for idx, (_, ann) in enumerate(sub.iterrows()):
766
+ with cols[idx % len(cols)]:
767
+ st.write(f"**{ann['annotator']}**")
768
+ st.caption(f"{ann['status']} · {ann['created_at']}")
769
+ st.json(ann["labels"])
770
+ if ann.get("notes"):
771
+ st.caption(ann["notes"])
772
+
773
+ agreement = compute_agreement(anns_df, label_key="assistant_behavior")  # a single item never yields two paired rows, so score the full set
774
+ c1, c2, c3 = st.columns(3)
775
+ c1.metric("Paired items", agreement["paired_items"])
776
+ c2.metric("Raw agreement", f"{agreement['raw_agreement']:.2%}" if agreement["raw_agreement"] is not None else "n/a")
777
+ c3.metric("Cohen's κ", f"{agreement['cohen_kappa']:.3f}" if agreement["cohen_kappa"] is not None else "n/a")
778
+
779
+ elif page == "Dashboard":
780
+ st.subheader("Dashboard")
781
+ c1, c2, c3, c4 = st.columns(4)
782
+ c1.metric("Source samples", len(samples_df))
783
+ c2.metric("Source items", len(items_df))
784
+ c3.metric("Annotation files", len(anns_df))
785
+ c4.metric("My queue", len(queue_df()))
786
+
787
+ st.markdown("### Progress by annotator")
788
+ if anns_df.empty:
789
+ st.info("No annotations yet.")
790
+ else:
791
+ by_ann = anns_df.groupby("annotator")["item_id"].nunique().reset_index(name="annotated_items").sort_values("annotated_items", ascending=False)
792
+ st.dataframe(by_ann, use_container_width=True, hide_index=True)
793
+
794
+ st.markdown("### Progress by domain")
795
+ joined = anns_df.merge(samples_df[["sample_id", "domain"]].drop_duplicates("sample_id"), on="sample_id", how="left")  # domain lives in samples_df, not items_df
796
+ by_domain = joined.groupby("domain")["item_id"].nunique().reset_index(name="annotated_items").sort_values("annotated_items", ascending=False)
797
+ st.dataframe(by_domain, use_container_width=True, hide_index=True)
798
+
799
+ st.markdown("### Agreement snapshot")
800
+ metric = compute_agreement(anns_df, label_key="assistant_behavior")
801
+ st.write(metric)
802
+
803
+ st.markdown("### Recent annotation previews")
804
+ recent = anns_df.sort_values("created_at", ascending=False).head(20).copy()
805
+ if "labels" in recent.columns:
806
+ recent["assistant_behavior"] = recent["labels"].apply(lambda x: x.get("assistant_behavior") if isinstance(x, dict) else None)
807
+ recent["distractor_kind"] = recent["labels"].apply(lambda x: x.get("distractor_kind") if isinstance(x, dict) else None)
808
+ st.dataframe(
809
+ recent[["annotator", "item_id", "status", "created_at", "assistant_behavior", "distractor_kind", "notes"]],
810
+ use_container_width=True,
811
+ hide_index=True,
812
+ )
813
+
814
+ else:
815
+ st.subheader("Export")
816
+ st.write("Export the merged dataset for downstream analysis or model training.")
817
+
818
+ merged = items_df.merge(samples_df, on="sample_id", how="left")
819
+ if not anns_df.empty:
820
+ export_df = merged.merge(anns_df[["item_id", "annotator", "labels", "notes", "status", "created_at"]], on="item_id", how="left")
821
+ else:
822
+ export_df = merged.copy()
823
+ export_df["annotator"] = None
824
+ export_df["labels"] = None
825
+ export_df["notes"] = None
826
+ export_df["status"] = None
827
+ export_df["created_at"] = None
828
+
829
+ c1, c2 = st.columns(2)
830
+ with c1:
831
+ jsonl = LOCAL_EXPORT_DIR / "annotations_export.jsonl"
832
+ if st.button("Generate JSONL export", use_container_width=True):
833
+ with jsonl.open("w", encoding="utf-8") as f:
834
+ for _, r in export_df.iterrows():
835
+ f.write(json.dumps(r.where(pd.notna(r), None).to_dict(), ensure_ascii=False) + "\n")
836
+ st.success(f"Wrote {jsonl}")
837
+ st.download_button("Download JSONL", jsonl.read_text(encoding="utf-8"), file_name=jsonl.name, mime="application/json")
838
+ with c2:
839
+ csv = LOCAL_EXPORT_DIR / "annotations_export.csv"
840
+ if st.button("Generate CSV export", use_container_width=True):
841
+ export_df.to_csv(csv, index=False)
842
+ st.success(f"Wrote {csv}")
843
+ st.download_button("Download CSV", csv.read_text(encoding="utf-8"), file_name=csv.name, mime="text/csv")
844
+
845
+ st.markdown("### Repository handoff")
846
+ st.code(
847
+ f"Source repo: {source_repo}\nAnnotation repo: {annotation_repo}\nSplit: {source_split}\nAnnotator: {st.session_state['annotator']}",
848
+ language="text",
849
+ )
850
+
851
+
852
+ if __name__ == "__main__":
853
+ main()
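Review note: `render_turns` and the item card in `main()` interpolate dataset text straight into HTML that is rendered with `unsafe_allow_html=True`. If the source dataset can contain markup, escaping it first is probably safer. The following is only a minimal sketch of that idea; `turn_html` is a hypothetical helper name, not part of this commit, and it reuses the CSS class names from the diff above.

```python
import html


def turn_html(role: str, content: str, index: int) -> str:
    """Build the HTML for one conversation turn with untrusted text escaped.

    Hypothetical helper: mirrors the markup used by render_turns, but runs
    role and content through html.escape before interpolation.
    """
    role_l = role.lower()
    css_cls = "user" if role_l == "user" else "assistant" if role_l in {"assistant", "bot"} else "system"
    safe_role = html.escape(role.upper())
    # Escape first, then turn newlines into <br> so the inserted tags survive.
    safe_content = html.escape(content.strip()).replace("\n", "<br>")
    return (
        f'<div class="turn {css_cls}">'
        f'<span class="badge">{safe_role}</span>'
        f'<span class="smallmono">Turn {index}</span>'
        f'<div style="margin-top:0.35rem;">{safe_content}</div>'
        "</div>"
    )
```

The returned string could be passed to `st.markdown(..., unsafe_allow_html=True)` in place of the inline f-string; the same escaping would apply to the `domain`, `scenario`, and id fields shown in the card box.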
hf-space/hf-space/hf-space/hf-space/hf-space/hf-space/.gitattributes ADDED
@@ -0,0 +1,35 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
hf-space/hf-space/hf-space/hf-space/hf-space/hf-space/README.md ADDED
@@ -0,0 +1,10 @@
1
+ ---
2
+ title: Llm Annotation Platform
3
+ emoji: 🦀
4
+ colorFrom: indigo
5
+ colorTo: blue
6
+ sdk: docker
7
+ pinned: false
8
+ ---
9
+
10
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
hf-space/hf-space/hf-space/hf-space/hf-space/requirements.txt ADDED
@@ -0,0 +1,5 @@
1
+ streamlit>=1.37
2
+ pandas>=2.2
3
+ datasets>=2.21
4
+ huggingface_hub>=0.24
5
+ scikit-learn>=1.5
hf-space/hf-space/hf-space/hf-space/hf-space/scripts/seed.py ADDED
@@ -0,0 +1,28 @@
1
+ from __future__ import annotations
2
+
3
+ import argparse
4
+
5
+ from datasets import load_dataset
6
+
7
+ from app import DEFAULT_ANNOTATION_REPO, DEFAULT_SOURCE_DATASET, DEFAULT_SOURCE_SPLIT
8
+
9
+
10
+ def main() -> None:
11
+ parser = argparse.ArgumentParser()
12
+ parser.add_argument("--source", default=DEFAULT_SOURCE_DATASET)
13
+ parser.add_argument("--split", default=DEFAULT_SOURCE_SPLIT)
14
+ parser.add_argument("--annotation-repo", default=DEFAULT_ANNOTATION_REPO)
15
+ parser.add_argument("--limit", type=int, default=0)
16
+ args = parser.parse_args()
17
+
18
+ records = load_dataset(args.source, split=args.split)
19
+ if args.limit:
20
+ records = records.select(range(min(len(records), args.limit)))
21
+
22
+ print(f"Loaded {len(records)} source records from {args.source}/{args.split}")
23
+ print(f"Annotation repo: {args.annotation_repo}")
24
+ print("Open the Streamlit app and submit annotations there.")
25
+
26
+
27
+ if __name__ == "__main__":
28
+ main()
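Review note: as written, `seed.py` only reports the number of source records, while the app queues one annotation item per distractor (`item_id` is `<sample_id>::<distractor_index>`). A seed script might also report the item count. The sketch below assumes the same `distractors` field layout that `expand_record` handles in app.py (a list, or a JSON-encoded list); `count_items` is a hypothetical name.

```python
import json
from typing import Any, Dict, Iterable


def count_items(records: Iterable[Dict[str, Any]]) -> int:
    """Count annotation items, i.e. one per distractor across all records."""
    total = 0
    for record in records:
        distractors = record.get("distractors") or []
        if isinstance(distractors, str):
            # Some rows may store the list as a JSON string.
            try:
                distractors = json.loads(distractors)
            except json.JSONDecodeError:
                distractors = []
        if isinstance(distractors, list):
            total += len(distractors)
    return total
```

Since iterating a `datasets.Dataset` yields plain dict rows, `count_items(records)` could be printed next to the existing record count.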
hf-space/requirements.txt CHANGED
@@ -1,5 +1,5 @@
1
- streamlit>=1.37
2
  pandas>=2.2
3
  datasets>=2.21
4
  huggingface_hub>=0.24
5
- scikit-learn>=1.5
 
1
+ streamlit>=1.38
2
  pandas>=2.2
3
  datasets>=2.21
4
  huggingface_hub>=0.24
5
+ openai>=1.40
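Review note: this hunk removes `scikit-learn`, but `compute_agreement` in the app.py diff above imports `cohen_kappa_score` from `sklearn.metrics` at call time. If that helper is still used by this Space, the Review and Dashboard pages would likely raise `ImportError` once two annotators overlap. One option is to keep the dependency; another is a small dependency-free fallback. The sketch below is an illustrative implementation of unweighted Cohen's kappa, not the project's code, and `cohen_kappa` is a hypothetical name.

```python
from collections import Counter
from typing import Optional, Sequence


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> Optional[float]:
    """Unweighted Cohen's kappa for two equal-length label sequences.

    Returns None when the inputs are empty, differ in length, or when
    chance agreement is 1 (kappa is undefined in that case).
    """
    if len(a) != len(b) or len(a) == 0:
        return None
    n = len(a)
    # Observed agreement: share of items where both raters chose the same label.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement from each rater's marginal label frequencies.
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in ca.keys() | cb.keys()) / (n * n)
    if p_e == 1.0:
        return None
    return (p_o - p_e) / (1.0 - p_e)
```

In `compute_agreement`, the two pivot columns could be passed as `list(a)` and `list(b)`; the `None` case maps onto the "n/a" display already used for missing metrics.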