
title: Distractor Annotation Tool
emoji: 🎯
colorFrom: purple
colorTo: indigo
sdk: streamlit
sdk_version: 1.36.0
app_file: app.py
pinned: false

🎯 Distractor Annotation Tool

Collaborative annotation GUI for the MSc NLP research project "Keeping LLMs on Track in Task-Oriented Dialogue".

Create a new dataset at huggingface.co/new-dataset, set its visibility to private, and note the repo ID (e.g. yourgroup/distractor-annotations).
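If you would rather create the private dataset repo from a script than through the web form, a minimal sketch using `huggingface_hub` could look like the following. The helper name `ensure_private_dataset` and its `api` parameter are illustrative assumptions, not part of this project; the `api` argument exists only so the function can be exercised without network access.

```python
def ensure_private_dataset(repo_id, token=None, api=None):
    """Create a private dataset repo on the Hub if it does not already exist.

    `ensure_private_dataset` is a hypothetical helper for illustration.
    """
    if api is None:
        # Imported lazily so the sketch can be read and tested without the library.
        from huggingface_hub import HfApi
        api = HfApi(token=token)
    # exist_ok=True makes the call safe to repeat on later runs.
    return api.create_repo(
        repo_id=repo_id,
        repo_type="dataset",
        private=True,
        exist_ok=True,
    )
```

Calling `ensure_private_dataset("yourgroup/distractor-annotations", token=...)` with a write-access token should leave you with the same repo ID you would otherwise note down from the web form.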

In your Space → Settings → Repository secrets, add:

| Secret | Value |
| --- | --- |
| `HF_TOKEN` | Your HF token with write access |
| `ANNOTATIONS_REPO_ID` | e.g. `yourgroup/distractor-annotations` |
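Space secrets are exposed to the running app as environment variables. As a rough sketch of how an app might read the two secrets above and report which ones are missing (the function name `load_config` and the constant `REQUIRED_SECRETS` are assumptions for illustration, not taken from `app.py`):

```python
import os

# The two secrets listed in the table above.
REQUIRED_SECRETS = ("HF_TOKEN", "ANNOTATIONS_REPO_ID")


def load_config(env=None):
    """Return (config, missing): the secret values and the names left unset.

    `load_config` is a hypothetical helper; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    config = {name: env.get(name, "").strip() for name in REQUIRED_SECRETS}
    # Treat empty or whitespace-only values as missing.
    missing = [name for name, value in config.items() if not value]
    return config, missing
```

A check like this can run at startup so a misconfigured Space fails with a clear message rather than at the first write to the annotations repo.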

In GitHub → Settings → Secrets and variables → Actions, add the same HF_TOKEN.

In .github/workflows/sync_to_hf.yml, replace YOUR_HF_USERNAME and YOUR_SPACE_NAME with your actual values.
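The workflow file is not reproduced here. A common pattern for syncing a GitHub repository to a Space is to push to a Git remote that embeds the username, token, and Space name, and the two placeholders most likely feed into a URL of that shape. The sketch below only illustrates how such a URL is assembled; `space_remote_url` is a hypothetical helper, and the actual workflow may be structured differently.

```python
def space_remote_url(username, space_name, token):
    """Build the authenticated Git remote URL for a Hugging Face Space.

    Hypothetical helper: `username` and `space_name` stand in for
    YOUR_HF_USERNAME and YOUR_SPACE_NAME, `token` for the HF_TOKEN secret.
    """
    if not (username and space_name and token):
        raise ValueError("username, space_name and token are all required")
    return f"https://{username}:{token}@huggingface.co/spaces/{username}/{space_name}"
```

Because the token is part of the URL, it should come from the GitHub Actions secret at run time and never be committed to the repository.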

On first run, go to the Dashboard and click Import Seed Data to populate the shared repo with the group's initial entries.

| Page | Purpose |
| --- | --- |
| 🏠 Dashboard | Stats overview, seed import, config check |
| 📚 Browse | Explore the base nvidia dataset and seed entries |
| ✏️ Annotate | Create multi-turn distractor entries |
| 👥 Annotations | View, edit, review all group work |
| 💬 Test LLM | Send distractors to a live LLM, judge if it gets distracted |

Each annotation follows the nvidia/CantTalkAboutThis schema, extended with: