Zhu Jiajun (jz28583)

Add agents/ harness integrations and HF Space scoring deployment

d094faf 23 days ago

Deploying the GraphTestbed scoring server to HF Spaces

All commands assume HF_TOKEN is exported and has write scope on the lanczos namespace.

1. Seed the GT dataset repo

HF_TOKEN=$HF_TOKEN python server/space/push_gt.py \
    --repo lanczos/graphtestbed-gt \
    --gt-dir ~/graphtestbed-gt

This creates the private dataset repo if it doesn't exist and uploads each <task>.csv to gt/<task>.csv. Verify at:

https://huggingface.co/datasets/lanczos/graphtestbed-gt
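The same check can be scripted by comparing the local CSVs against the repo's file listing. This is a minimal sketch, not part of `push_gt.py`: the helper names `expected_gt_paths`, `missing_from_repo`, and `list_dataset_files` are hypothetical, and the listing step assumes `huggingface_hub` is installed and `HF_TOKEN` is exported.

```python
import os
from pathlib import Path


def expected_gt_paths(gt_dir):
    """Map each local <task>.csv to the repo path gt/<task>.csv."""
    return sorted(f"gt/{p.name}" for p in Path(gt_dir).expanduser().glob("*.csv"))


def missing_from_repo(expected, repo_files):
    """Return the expected paths that are absent from the repo listing."""
    present = set(repo_files)
    return [path for path in expected if path not in present]


def list_dataset_files(repo_id="lanczos/graphtestbed-gt"):
    """List files in the private dataset repo (needs network and HF_TOKEN)."""
    # Imported lazily so the pure helpers above work without the dependency.
    from huggingface_hub import list_repo_files

    return list_repo_files(repo_id, repo_type="dataset", token=os.environ["HF_TOKEN"])
```

After a successful push, `missing_from_repo(expected_gt_paths("~/graphtestbed-gt"), list_dataset_files())` should return an empty list.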

2. Create the Space

huggingface-cli repo create graphtestbed --type space --space_sdk docker

Or in the web UI: New Space → name graphtestbed → SDK: Docker.

3. Set the Space secret

In Space Settings → Variables and secrets, add:

| name | value |
| --- | --- |
| `HF_TOKEN` | same token (write scope on `lanczos/graphtestbed-gt`) |

Optional overrides (set as variables, not secrets):

| name | default | when to override |
| --- | --- | --- |
| `GT_DATASET_REPO` | `lanczos/graphtestbed-gt` | running multiple Spaces against different GT |
| `GT_BACKUP_INTERVAL` | `60` | tighter durability vs. fewer commits |
| `GT_QUOTA` | `5` | bumping during a benchmark sprint |
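To make the override semantics concrete, here is a sketch of how a process could resolve these variables against the documented defaults. How the scoring server actually parses them is not shown in this document, so `SpaceSettings` and `load_settings` are hypothetical names, and treating the two numeric values as integers is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceSettings:
    gt_dataset_repo: str
    gt_backup_interval: int  # unit is not stated in this document
    gt_quota: int


def load_settings(env=None):
    """Read the optional overrides, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return SpaceSettings(
        gt_dataset_repo=env.get("GT_DATASET_REPO", "lanczos/graphtestbed-gt"),
        gt_backup_interval=int(env.get("GT_BACKUP_INTERVAL", "60")),
        gt_quota=int(env.get("GT_QUOTA", "5")),
    )
```

Passing a plain dict, for example `load_settings({"GT_QUOTA": "20"})`, lets you check an override locally before setting it on the Space.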

4. Push the code to the Space

# One-time
git remote add space https://huggingface.co/spaces/lanczos/graphtestbed

# Each deploy (HF prompts for credentials: user=lanczos, password=$HF_TOKEN)
./server/space/push_to_space.sh

The script overlays server/space/README.md at repo root on a temp branch and force-pushes to space/main (HF reads its frontmatter from root README). Your GitHub root README is untouched.
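Because the Space is configured from the root README's frontmatter, it can help to sanity-check `server/space/README.md` before pushing. The sketch below is an illustration, not part of `push_to_space.sh`: `read_frontmatter` and `is_docker_space_readme` are hypothetical helpers, and the parser handles only flat `key: value` lines rather than full YAML.

```python
def read_frontmatter(readme_text):
    """Return key/value pairs between the leading '---' fences, or {} if absent."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        # Skip comments and indented (nested) lines; keep top-level scalars only.
        if sep and key.strip() and not line.startswith((" ", "\t", "#")):
            fields[key.strip()] = value.strip()
    return {}  # no closing fence: treat the frontmatter as missing


def is_docker_space_readme(readme_text):
    """True if the frontmatter declares the Docker SDK."""
    return read_frontmatter(readme_text).get("sdk") == "docker"
```

For example, `is_docker_space_readme(open("server/space/README.md").read())` should be true for a README that declares `sdk: docker`.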

The first build takes about 3 min (pandas + sklearn wheels). Subsequent builds take about 30 s.

5. Smoke-test

curl -s https://lanczos-graphtestbed.hf.space/healthz | jq

Expect:

{
  "status": "ok",
  "tasks": ["arxiv-citation", "figraph", "ibm-aml", "ieee-fraud-detection"],
  "gt_present": ["figraph", "..."],
  "quota_per_day": 5,
  "uptime_unix": 1776633751
}

If gt_present is empty, the bootstrap step that runs at boot could not read from the dataset repo. Check the Space logs and verify that HF_TOKEN has at least read scope on GT_DATASET_REPO.
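The check can also be automated. This sketch uses only the standard library; `tasks_without_gt` and `fetch_health` are hypothetical helpers. The sample response above elides part of `gt_present`, so the idea that every entry in `tasks` should eventually appear there is an assumption.

```python
import json
from urllib.request import urlopen


def tasks_without_gt(health):
    """Tasks the server lists but for which no ground truth was loaded."""
    present = set(health.get("gt_present", []))
    return [task for task in health.get("tasks", []) if task not in present]


def fetch_health(base_url="https://lanczos-graphtestbed.hf.space", timeout=10):
    """GET /healthz and decode the JSON body (needs network access)."""
    with urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
        return json.load(resp)
```

Calling `tasks_without_gt(fetch_health())` lists the tasks worth investigating in the Space logs.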

6. Hand out the URL

export GRAPHTESTBED_API=https://lanczos-graphtestbed.hf.space
gtb submit figraph --file preds.csv --agent my-agent-v1

Reading the leaderboard back as a maintainer

huggingface-cli download lanczos/graphtestbed-gt \
    leaderboard.db \
    --repo-type dataset \
    --local-dir ./backup

sqlite3 backup/leaderboard.db \
    "SELECT task, agent, primary_metric, n_rows, submitted_at
     FROM submissions ORDER BY submitted_at DESC LIMIT 20"
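The same query can be run from Python with the standard `sqlite3` module. This is a sketch: `recent_submissions` and `open_backup` are hypothetical helpers, and only the table and column names that appear in the query above are taken from this document.

```python
import sqlite3

QUERY = """
    SELECT task, agent, primary_metric, n_rows, submitted_at
    FROM submissions
    ORDER BY submitted_at DESC
    LIMIT ?
"""


def recent_submissions(conn, limit=20):
    """Return the most recent submissions as a list of dicts."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(QUERY, (limit,))]


def open_backup(path="backup/leaderboard.db"):
    """Open the downloaded snapshot read-only so the backup is never modified."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)
```

For example, `recent_submissions(open_backup(), limit=5)` returns the five latest rows from the downloaded snapshot.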

The full per-submission CSV archive lives under submissions/<task>/<agent>-<run_id>.csv in the same dataset repo.