Spaces:
Sleeping
Sleeping
metadata
title: GraphTestbed Scoring API
emoji: π
colorFrom: indigo
colorTo: green
sdk: docker
app_port: 7860
pinned: false
GraphTestbed Scoring API
Public scoring server for the GraphTestbed
benchmark. Anyone can gtb submit <task> --file preds.csv --agent <name> from
anywhere; the scored entry lands on a single shared leaderboard.
Endpoints
| method | path | purpose |
|---|---|---|
| POST | /submit |
multipart task=β¦&agent=β¦&file=preds.csv β JSON with primary metric, secondary metrics, leaderboard rank, quota_remaining |
| GET | /leaderboard/<task> |
best-per-agent JSON, sorted by primary desc |
| GET | /healthz |
tasks list + which have GT loaded + quota |
Full contract: PROTOCOL.md.
Trust model
Non-adversarial benchmark. The API enforces:
- 5 submissions / day / IP / task
- Schema check before scoring (malformed CSVs don't burn quota)
- Score bucketing (round to 3 dp)
- Audit trail in sqlite + per-submission CSV archive
Test labels live only in the companion private dataset repo
(lanczos/graphtestbed-gt) and never enter the Space's git history.
Configuration (Space secrets)
| name | required | default | notes |
|---|---|---|---|
HF_TOKEN |
yes | β | write scope on GT_DATASET_REPO |
GT_DATASET_REPO |
no | lanczos/graphtestbed-gt |
private dataset holding GT + leaderboard backups |
GT_BACKUP_INTERVAL |
no | 60 |
seconds between sqlite β dataset-repo pushes |
GT_QUOTA |
no | 5 |
submissions/day/IP/task |
GT_BYPASS_KEY |
no | β | shared secret; clients sending it as X-Bypass-Key header skip quota and may pass dry=1 to score without inserting |
Persistence
- On boot:
snapshot_downloadpullsgt/*.csv,leaderboard.db, and any archivedsubmissions/**/*.csvfrom the dataset repo into/data. - Every 60 s: if
SELECT COUNT(*) FROM submissionsgrew, a daemon thread usessqlite3.Connection.backup()to copy the DB atomically andupload_files it back. New submission CSVs in/data/submissions/are pushed viaupload_folder(content-hash diff β unchanged files skipped). - Worst-case loss on Space crash: 60 s of submissions.