---
title: GraphTestbed Scoring API
emoji: 📊
colorFrom: indigo
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# GraphTestbed Scoring API
Public scoring server for the [GraphTestbed](https://github.com/zhuconv/GraphTestbed)
benchmark. Anyone can run `gtb submit <task> --file preds.csv --agent <name>`;
the scored entry lands on a single shared leaderboard.
## Endpoints
| method | path | purpose |
| --- | --- | --- |
| POST | `/submit` | multipart form fields `task`, `agent`, `file` (the predictions CSV) → JSON with primary metric, secondary metrics, leaderboard rank, `quota_remaining` |
| GET | `/leaderboard/<task>` | best-per-agent JSON, sorted by primary desc |
| GET | `/healthz` | tasks list + which have GT loaded + quota |
Full contract: [PROTOCOL.md](https://github.com/zhuconv/GraphTestbed/blob/main/PROTOCOL.md).
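The `/leaderboard/<task>` response keeps each agent's best entry and sorts by primary metric descending. A minimal sketch of that reduction (field names here are illustrative, not the server's actual schema):

```python
def best_per_agent(submissions):
    """Keep each agent's highest primary score, then sort descending.

    `submissions` is a list of dicts; the keys `agent` and `primary`
    are assumptions for illustration.
    """
    best = {}
    for sub in submissions:
        agent = sub["agent"]
        if agent not in best or sub["primary"] > best[agent]["primary"]:
            best[agent] = sub
    return sorted(best.values(), key=lambda s: s["primary"], reverse=True)

rows = [
    {"agent": "baseline", "primary": 0.812},
    {"agent": "gnn-v2", "primary": 0.874},
    {"agent": "baseline", "primary": 0.845},
]
print([r["agent"] for r in best_per_agent(rows)])  # ['gnn-v2', 'baseline']
```

Note that duplicate submissions from one agent never crowd the board: only the best one survives the reduction.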
## Trust model
Non-adversarial benchmark. The API enforces:
- 5 submissions / day / IP / task
- Schema check before scoring (malformed CSVs don't burn quota)
- Score bucketing (round to 3 dp)
- Audit trail in sqlite + per-submission CSV archive
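The schema check and score bucketing could look roughly like this; the required column names and exact validation are assumptions for illustration, not the server's actual code:

```python
import csv
import io

REQUIRED_COLS = {"id", "prediction"}  # assumed schema, illustrative only

def validate_csv(raw: bytes) -> list:
    """Reject malformed CSVs up front so they don't burn quota."""
    reader = csv.DictReader(io.StringIO(raw.decode("utf-8")))
    if not REQUIRED_COLS.issubset(reader.fieldnames or []):
        raise ValueError(f"CSV must have columns {sorted(REQUIRED_COLS)}")
    return list(reader)

def bucket(score: float) -> float:
    """Round to 3 dp so fine-grained score probing reveals nothing extra."""
    return round(score, 3)
```

A malformed file raises before any scoring happens, which is what keeps bad uploads from counting against the daily quota.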
Test labels live only in the companion private dataset repo
(`lanczos/graphtestbed-gt`) and never enter the Space's git history.
## Configuration (Space secrets)
| name | required | default | notes |
| --- | --- | --- | --- |
| `HF_TOKEN` | yes | — | write scope on `GT_DATASET_REPO` |
| `GT_DATASET_REPO` | no | `lanczos/graphtestbed-gt` | private dataset holding GT + leaderboard backups |
| `GT_BACKUP_INTERVAL` | no | `60` | seconds between sqlite → dataset-repo pushes |
| `GT_QUOTA` | no | `5` | submissions/day/IP/task |
| `GT_BYPASS_KEY` | no | — | shared secret; clients sending it as `X-Bypass-Key` header skip quota and may pass `dry=1` to score without inserting |
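Loaded from the environment, the table above maps to something like the following sketch (the variable names come from the table; the loader itself is illustrative):

```python
import os

def load_config(env=None):
    """Read Space secrets, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        "hf_token": env["HF_TOKEN"],  # required: KeyError if unset
        "gt_repo": env.get("GT_DATASET_REPO", "lanczos/graphtestbed-gt"),
        "backup_interval": int(env.get("GT_BACKUP_INTERVAL", "60")),
        "quota": int(env.get("GT_QUOTA", "5")),
        "bypass_key": env.get("GT_BYPASS_KEY"),  # None disables bypass
    }
```

Leaving `GT_BYPASS_KEY` unset simply disables the bypass path; only `HF_TOKEN` hard-fails at startup.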
## Persistence
- On boot: `snapshot_download` pulls `gt/*.csv`, `leaderboard.db`, and any
archived `submissions/**/*.csv` from the dataset repo into `/data`.
- Every `GT_BACKUP_INTERVAL` seconds (default 60): if `SELECT COUNT(*) FROM
  submissions` grew, a daemon thread uses `sqlite3.Connection.backup()` to
  copy the DB atomically and `upload_file`s it back. New submission CSVs in
  `/data/submissions/` are pushed via `upload_folder` (content-hash diff;
  unchanged files are skipped).
- Worst-case loss on a Space crash: one backup interval (default 60 s) of
  submissions.
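The atomic copy relies on sqlite's online backup API, which snapshots a live database consistently even while other connections write to it. A self-contained demonstration with a throwaway database (paths and table are illustrative):

```python
import os
import sqlite3
import tempfile

def backup_db(src_path, dst_path):
    """Copy a live sqlite DB to dst_path as one consistent snapshot."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    with dst:
        src.backup(dst)  # sqlite3.Connection.backup(), Python 3.7+
    src.close()
    dst.close()

tmp = tempfile.mkdtemp()
live = os.path.join(tmp, "leaderboard.db")
copy = os.path.join(tmp, "backup.db")
con = sqlite3.connect(live)
con.execute("CREATE TABLE submissions (agent TEXT, score REAL)")
con.execute("INSERT INTO submissions VALUES ('baseline', 0.812)")
con.commit()
backup_db(live, copy)
n = sqlite3.connect(copy).execute("SELECT COUNT(*) FROM submissions").fetchone()[0]
print(n)  # 1
```

Unlike a plain file copy, `backup()` never captures a half-written page, which is why a crash mid-backup can only ever lose the interval's worth of rows, not corrupt the archive.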