metadata
title: Aesthetic Annotators
emoji: 🎨
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false

Aesthetic Annotators

Public-URL labeling server for the AestheticMCQ dataset. Each new visitor is automatically issued an anonymous annotator id on their first request. Each session labels up to AAMCQ_PER_ANNOTATOR_CAP items (default 20), pulled breadth-first from the pool so that every item receives one label before any item receives a second.
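The breadth-first policy can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Space's actual code: `next_item`, the `(annotator_id, item_id)` label pairs, and the in-memory list are hypothetical stand-ins for whatever the server stores in SQLite, and the sketch ignores AAMCQ_LABELS_PER_ITEM.

```python
from collections import defaultdict


def next_item(item_ids, labels, annotator_id, cap=20):
    """Pick the next item for an annotator, or None when the session is done.

    item_ids: ordered list of item ids in the pool (hypothetical shape).
    labels:   list of (annotator_id, item_id) pairs recorded so far.
    cap:      per-session limit, mirroring AAMCQ_PER_ANNOTATOR_CAP.
    """
    # Items this annotator has already labeled.
    seen = {item for ann, item in labels if ann == annotator_id}
    if len(seen) >= cap:
        return None  # session cap reached: "all done"

    # Number of labels each item has received from anyone.
    counts = defaultdict(int)
    for _, item in labels:
        counts[item] += 1

    candidates = [i for i in item_ids if i not in seen]
    if not candidates:
        return None  # nothing left that this annotator has not seen

    # Least-labeled first; min() keeps pool order among ties.
    return min(candidates, key=lambda i: counts[i])
```

Because the item with the fewest labels always wins, no item can collect a second label while an unlabeled item is still available to the requesting annotator. Ties are broken by pool order here, which is an arbitrary choice made for the sketch.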

Configuration

Space secrets:

| name | required | default | notes |
|---|---|---|---|
| HF_TOKEN | yes | | write scope on the companion dataset repo |
| AAMCQ_DATASET_REPO | no | lanczos/aesthetic-annotators | source of images + mcq + label backups |
| AAMCQ_PER_ANNOTATOR_CAP | no | 20 | items per session before "all done" |
| AAMCQ_LABELS_PER_ITEM | no | 3 | target labels per item |
| AAMCQ_BACKUP_INTERVAL | no | 60 | SQLite → dataset repo push interval (seconds) |
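Space secrets are exposed to the container as environment variables, so the settings above could be read roughly as follows. The `Settings` dataclass and `load_settings` helper are hypothetical names used only for illustration. The variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    hf_token: str
    dataset_repo: str
    per_annotator_cap: int
    labels_per_item: int
    backup_interval: int  # seconds


def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    Defaults mirror the table above; HF_TOKEN has no default and is required.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is required (write scope on the dataset repo)")
    return Settings(
        hf_token=token,
        dataset_repo=env.get("AAMCQ_DATASET_REPO", "lanczos/aesthetic-annotators"),
        per_annotator_cap=int(env.get("AAMCQ_PER_ANNOTATOR_CAP", "20")),
        labels_per_item=int(env.get("AAMCQ_LABELS_PER_ITEM", "3")),
        backup_interval=int(env.get("AAMCQ_BACKUP_INTERVAL", "60")),
    )
```

Passing the mapping explicitly, for example `load_settings({"HF_TOKEN": "..."})`, lets the defaults be checked without touching the real environment.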

Data flow

  1. On boot, the Space pulls images/*.png, mcq_unlabeled.jsonl, and any prior labels/annotations.sqlite from the dataset repo.
  2. Annotators land on the root URL; the page's JS calls POST /api/register, and the server mints a fresh anon_* id and token, which the browser caches in localStorage.
  3. /api/task hands out the least-labeled item the annotator hasn't seen.
  4. Every AAMCQ_BACKUP_INTERVAL seconds the server pushes the SQLite database back to labels/annotations.sqlite in the dataset repo, so a Space restart or sleep loses at most one backup interval's worth of labels.
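The periodic backup in step 4 might look something like the sketch below. It is not the Space's actual implementation: `snapshot`, `push_to_hub`, and `backup_loop` are hypothetical helpers. Copying the live file through SQLite's online backup API is an assumption, made here to avoid uploading a half-written database. The upload uses `HfApi.upload_file` from `huggingface_hub`, imported lazily so the rest of the sketch runs without that package installed.

```python
import sqlite3
import threading


def snapshot(db_path, snapshot_path):
    """Copy the live database to snapshot_path via SQLite's online backup API."""
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(snapshot_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def push_to_hub(snapshot_path, repo_id, token):
    """Upload the snapshot to labels/annotations.sqlite in the dataset repo."""
    from huggingface_hub import HfApi  # lazy import; needs huggingface_hub installed

    HfApi(token=token).upload_file(
        path_or_fileobj=snapshot_path,
        path_in_repo="labels/annotations.sqlite",
        repo_id=repo_id,
        repo_type="dataset",
    )


def backup_loop(db_path, snapshot_path, push, interval, stop):
    """Snapshot and push every `interval` seconds until `stop` is set.

    push: callable taking the snapshot path (e.g. a wrapper around push_to_hub).
    stop: threading.Event used to end the loop cleanly.
    """
    # Event.wait returns False on timeout, so each timeout triggers one backup.
    while not stop.wait(interval):
        snapshot(db_path, snapshot_path)
        push(snapshot_path)
```

A server could start this in a daemon thread, for example `threading.Thread(target=backup_loop, args=(db, snap, push, 60, stop), daemon=True).start()`, where `push` wraps `push_to_hub` with the repo id and token. Labels written after the most recent push are the ones at risk if the Space restarts.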