---
title: GraphTestbed Scoring API
emoji: π
colorFrom: indigo
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# GraphTestbed Scoring API

Public scoring server for the [GraphTestbed](https://github.com/zhuconv/GraphTestbed)
benchmark. Anyone can run `gtb submit <task> --file preds.csv --agent <name>` from
anywhere; the scored entry lands on a single shared leaderboard.
## Endpoints

| method | path | purpose |
| --- | --- | --- |
| POST | `/submit` | multipart `task=…&agent=…&file=preds.csv` → JSON with the primary metric, secondary metrics, leaderboard rank, and `quota_remaining` |
| GET | `/leaderboard/<task>` | best entry per agent as JSON, sorted by primary metric descending |
| GET | `/healthz` | task list, which tasks have GT loaded, and remaining quota |

Full contract: [PROTOCOL.md](https://github.com/zhuconv/GraphTestbed/blob/main/PROTOCOL.md).
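The `gtb` CLI wraps the `/submit` call, but the endpoint can also be hit directly. A stdlib-only client sketch (the Space base URL is a placeholder and the multipart helper is illustrative; see PROTOCOL.md for the authoritative contract):

```python
import json
import urllib.request
import uuid


def encode_multipart(fields, file_field, filename, file_bytes):
    """Build a multipart/form-data body by hand (no third-party deps)."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [f"--{boundary}",
                  f'Content-Disposition: form-data; name="{name}"',
                  "", value]
    lines += [f"--{boundary}",
              f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"',
              "Content-Type: text/csv", ""]
    body = "\r\n".join(lines).encode() + b"\r\n" + file_bytes \
        + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"


def submit(base_url, task, agent, csv_bytes):
    """POST a predictions CSV to /submit and return the parsed JSON response."""
    body, ctype = encode_multipart({"task": task, "agent": agent},
                                   "file", "preds.csv", csv_bytes)
    req = urllib.request.Request(f"{base_url}/submit", data=body,
                                 headers={"Content-Type": ctype}, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Usage would be e.g. `submit("https://<your-space>.hf.space", "node-cls", "my-agent", open("preds.csv", "rb").read())`.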
## Trust model

This is a non-adversarial benchmark. The API still enforces:

- 5 submissions / day / IP / task
- Schema check before scoring (malformed CSVs don't burn quota)
- Score bucketing (scores rounded to 3 decimal places)
- Audit trail in sqlite plus a per-submission CSV archive

Test labels live only in the companion private dataset repo
(`lanczos/graphtestbed-gt`) and never enter the Space's git history.
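The two cheapest defenses above can be sketched as follows. This is illustrative only: the in-memory counter stands in for the server's sqlite-backed quota, and plain `round` is an assumption about how bucketing is done.

```python
from collections import defaultdict
from datetime import date

QUOTA = 5  # submissions / day / IP / task (the GT_QUOTA default)
_counts = defaultdict(int)  # illustrative; the real server persists counts in sqlite


def check_quota(ip, task, today=None):
    """Consume one quota unit for (ip, task, day); return False once exhausted."""
    key = (ip, task, (today or date.today()).isoformat())
    if _counts[key] >= QUOTA:
        return False
    _counts[key] += 1
    return True


def bucket(score, dp=3):
    """Round a raw metric to `dp` decimal places before it enters the
    leaderboard, so sub-millesimal score differences can't be used to
    probe the hidden test labels."""
    return round(score, dp)
```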
## Configuration (Space secrets)

| name | required | default | notes |
| --- | --- | --- | --- |
| `HF_TOKEN` | yes | (none) | write scope on `GT_DATASET_REPO` |
| `GT_DATASET_REPO` | no | `lanczos/graphtestbed-gt` | private dataset holding GT + leaderboard backups |
| `GT_BACKUP_INTERVAL` | no | `60` | seconds between sqlite → dataset-repo pushes |
| `GT_QUOTA` | no | `5` | submissions / day / IP / task |
| `GT_BYPASS_KEY` | no | (none) | shared secret; clients sending it as the `X-Bypass-Key` header skip the quota and may pass `dry=1` to score without inserting |
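The table above maps to environment reads with defaults; a sketch of the loading logic (the function and dict keys are illustrative, only the variable names and defaults come from the table):

```python
import os


def load_config(env=os.environ):
    """Read the documented Space secrets, applying the table's defaults."""
    return {
        "hf_token": env["HF_TOKEN"],  # required: KeyError if unset
        "gt_dataset_repo": env.get("GT_DATASET_REPO", "lanczos/graphtestbed-gt"),
        "backup_interval": int(env.get("GT_BACKUP_INTERVAL", "60")),
        "quota": int(env.get("GT_QUOTA", "5")),
        "bypass_key": env.get("GT_BYPASS_KEY"),  # None disables the bypass path
    }
```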
## Persistence

- On boot: `snapshot_download` pulls `gt/*.csv`, `leaderboard.db`, and any
  archived `submissions/**/*.csv` from the dataset repo into `/data`.
- Every `GT_BACKUP_INTERVAL` seconds (default 60): if `SELECT COUNT(*) FROM
  submissions` has grown, a daemon thread uses `sqlite3.Connection.backup()` to
  copy the DB atomically and `upload_file`s it back. New submission CSVs in
  `/data/submissions/` are pushed via `upload_folder` (content-hash diff, so
  unchanged files are skipped).
- Worst-case loss on a Space crash: the last backup interval's worth of
  submissions (60 s by default).
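The atomic-copy step uses the stdlib online backup API named above; a minimal sketch (paths and the wrapper function are illustrative):

```python
import sqlite3


def backup_db(src_path, dst_path):
    """Copy a live sqlite database with sqlite3.Connection.backup(), which
    snapshots consistently even while other connections are writing —
    unlike a plain file copy, which can capture a torn mid-transaction state."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    with dst:
        src.backup(dst)  # page-by-page online backup into dst
    dst.close()
    src.close()
```

The resulting `dst_path` file is what gets handed to `upload_file` for the dataset-repo push.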