graphtestbed / server /space /DEPLOY.md

Zhu Jiajun (jz28583)

Add agents/ harness integrations and HF Space scoring deployment

d094faf 23 days ago


	# Deploying the GraphTestbed scoring server to HF Spaces

	All commands assume `HF_TOKEN` is exported and has write scope on the
	`lanczos` namespace.

	## 1. Seed the GT dataset repo

	```bash
	HF_TOKEN=$HF_TOKEN python server/space/push_gt.py \
	    --repo lanczos/graphtestbed-gt \
	    --gt-dir ~/graphtestbed-gt
	```

	This creates the private dataset repo if it doesn't exist and uploads
	each `<task>.csv` to `gt/<task>.csv`. Verify at:

	<https://huggingface.co/datasets/lanczos/graphtestbed-gt>
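
The seeding step can be sketched with the `huggingface_hub` Python API. This is a minimal illustration, not the contents of `push_gt.py`, which is not shown here; the function names `plan_uploads` and `push_gt` are hypothetical. It assumes the GT directory holds one flat `<task>.csv` per task.

```python
from pathlib import Path


def plan_uploads(gt_dir):
    """Map each local <task>.csv to its destination path gt/<task>.csv."""
    root = Path(gt_dir).expanduser()
    return [(p, f"gt/{p.name}") for p in sorted(root.glob("*.csv"))]


def push_gt(repo_id, gt_dir, token):
    """Create the private dataset repo if needed, then upload every CSV."""
    # Imported lazily so plan_uploads() works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    # exist_ok=True makes this a no-op when the repo already exists.
    api.create_repo(repo_id, repo_type="dataset", private=True, exist_ok=True)
    for local_path, path_in_repo in plan_uploads(gt_dir):
        api.upload_file(
            path_or_fileobj=str(local_path),
            path_in_repo=path_in_repo,
            repo_id=repo_id,
            repo_type="dataset",
        )
```

Calling `plan_uploads("~/graphtestbed-gt")` first gives a dry-run view of what would be uploaded before any network call is made.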

	## 2. Create the Space

	```bash
	huggingface-cli repo create graphtestbed --type space --space_sdk docker
	```

	Or in the web UI: New Space → name `graphtestbed` → SDK: Docker.

	## 3. Set the Space secret

	In Space Settings → Variables and secrets, add:

	| name | value |
	| --- | --- |
	| `HF_TOKEN` | same token (write scope on `lanczos/graphtestbed-gt`) |

	Optional overrides (set as variables, not secrets):

	| name | default | when to override |
	| --- | --- | --- |
	| `GT_DATASET_REPO` | `lanczos/graphtestbed-gt` | running multiple Spaces against different GT |
	| `GT_BACKUP_INTERVAL` | `60` | tighter durability vs. fewer commits |
	| `GT_QUOTA` | `5` | bumping during a benchmark sprint |
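
As a rough illustration of how these overrides might be consumed, the sketch below reads the three variables with the defaults from the table. The `Settings` class and `load_settings` function are hypothetical; the server's actual configuration code is not shown here, and the unit of `GT_BACKUP_INTERVAL` is not stated in this document.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    gt_dataset_repo: str
    gt_backup_interval: int  # unit not specified in this document
    gt_quota: int


def load_settings(env=None):
    """Read the optional overrides, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        gt_dataset_repo=env.get("GT_DATASET_REPO", "lanczos/graphtestbed-gt"),
        gt_backup_interval=int(env.get("GT_BACKUP_INTERVAL", "60")),
        gt_quota=int(env.get("GT_QUOTA", "5")),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check the effect of an override locally, for example `load_settings({"GT_QUOTA": "20"})`.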

	## 4. Push the code to the Space

	```bash
	# One-time
	git remote add space https://huggingface.co/spaces/lanczos/graphtestbed

	# Each deploy (HF prompts for credentials: user=lanczos, password=$HF_TOKEN)
	./server/space/push_to_space.sh
	```

	The script copies `server/space/README.md` over the root README on a
	temporary branch and force-pushes that branch to `space/main`, because HF
	reads the Space's frontmatter from the root README. Your GitHub root
	README is untouched.
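
The overlay-and-force-push flow can be pictured as the sequence of commands below. This is a guess at the general shape, written as a dry-run planner; the real `push_to_space.sh` is not shown here and may differ. The branch name `space-deploy`, the commit message, and the function names are hypothetical.

```python
import shlex


def overlay_push_plan(readme="server/space/README.md", remote="space",
                      temp_branch="space-deploy"):
    """Return the steps as argv lists, in the order they would run."""
    return [
        ["git", "checkout", "-B", temp_branch],    # throwaway branch at current HEAD
        ["cp", readme, "README.md"],               # overlay the Space README at repo root
        ["git", "add", "README.md"],
        ["git", "commit", "-m", "Overlay Space README"],
        ["git", "push", "--force", remote, f"{temp_branch}:main"],
        ["git", "checkout", "-"],                  # return to the original branch
        ["git", "branch", "-D", temp_branch],      # discard the temporary branch
    ]


def format_plan(plan):
    """Render the plan as shell-quoted lines for review; nothing is executed."""
    return "\n".join(shlex.join(argv) for argv in plan)
```

Because the overlay commit lives only on the temporary branch, the working branch and its root README are left as they were once the branch is deleted.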

	The first build takes about 3 minutes (pandas and sklearn wheels);
	subsequent builds take about 30 seconds.

	## 5. Smoke-test

	```bash
	curl -s https://lanczos-graphtestbed.hf.space/healthz | jq
	```

	Expect:
	```json
	{
	  "status": "ok",
	  "tasks": ["arxiv-citation", "figraph", "ibm-aml", "ieee-fraud-detection"],
	  "gt_present": ["figraph", "..."],
	  "quota_per_day": 5,
	  "uptime_unix": 1776633751
	}
	```

	If `gt_present` is empty, the boot-time bootstrap couldn't read from the
	dataset repo. Check the Space logs and verify that `HF_TOKEN` has read
	scope on `GT_DATASET_REPO`.
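
The same check can be scripted. The sketch below is a hypothetical helper, not part of the repo, that inspects a `/healthz` payload of the shape shown above. Treating tasks that lack GT as a problem is an assumption; the server may legitimately list tasks whose GT has not been uploaded yet.

```python
import json
from urllib.request import urlopen


def health_problems(payload):
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = []
    if payload.get("status") != "ok":
        problems.append(f"status is {payload.get('status')!r}, expected 'ok'")
    present = set(payload.get("gt_present", []))
    if not present:
        problems.append("gt_present is empty: GT bootstrap probably failed")
    # Assumption: every listed task should eventually have GT available.
    missing = sorted(set(payload.get("tasks", [])) - present)
    if missing and present:
        problems.append("tasks without GT: " + ", ".join(missing))
    return problems


def fetch_health(base_url, timeout=10):
    """GET <base_url>/healthz and decode the JSON body."""
    with urlopen(f"{base_url.rstrip('/')}/healthz", timeout=timeout) as resp:
        return json.load(resp)
```

A deploy script could call `health_problems(fetch_health("https://lanczos-graphtestbed.hf.space"))` and fail if the returned list is non-empty.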

	## 6. Hand out the URL

	```bash
	export GRAPHTESTBED_API=https://lanczos-graphtestbed.hf.space
	gtb submit figraph --file preds.csv --agent my-agent-v1
	```

	## Reading the leaderboard back as a maintainer

	```bash
	huggingface-cli download lanczos/graphtestbed-gt \
	    leaderboard.db \
	    --repo-type dataset \
	    --local-dir ./backup

	sqlite3 backup/leaderboard.db \
	    "SELECT task, agent, primary_metric, n_rows, submitted_at
	     FROM submissions ORDER BY submitted_at DESC LIMIT 20"
	```
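
For further analysis, the downloaded database can also be read from Python. The sketch below runs the same query through the standard `sqlite3` module; it assumes only the `submissions` table and the five columns named in the query above, and the function name `latest_submissions` is hypothetical.

```python
import sqlite3


def latest_submissions(conn, limit=20):
    """Return the most recent submissions as a list of tuples."""
    cursor = conn.execute(
        "SELECT task, agent, primary_metric, n_rows, submitted_at "
        "FROM submissions ORDER BY submitted_at DESC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


def open_backup(path="backup/leaderboard.db"):
    """Open the downloaded leaderboard copy."""
    return sqlite3.connect(path)
```

For example, `latest_submissions(open_backup(), limit=5)` returns the five newest rows in the same column order as the `sqlite3` command.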

	The full per-submission CSV archive lives under `submissions/<task>/<agent>-<run_id>.csv`
	in the same dataset repo.