	# SimReady Validator — HuggingFace Space (Phase-3 Spike)

	This directory scaffolds an HF Space that runs the bundled
	[`simready-report`](../validation/plugins/simready-report/) validator
	against any HF dataset, then opens a verdict PR back on the dataset.

	It is the phase-3 prove-it step described in [PRD §3](../../PRD.md):
	move validation execution to where the dataset already lives, so we stop
	paying to copy 20 GiB of customer assets onto NVIDIA-controlled
	infrastructure on every run.

	| | DGXC runner today | HF Space (this dir) |
	|---|---|---|
	| Asset transfer | 10–20 GiB per submission onto a 49 GiB PVC | None: `huggingface_hub.snapshot_download` reads from HF storage directly |
	| Cost model | NVIDIA pays for the runner | Customer pays for their Space's hardware hours |
	| Concurrency | Single runner, jobs serialized | One Space per dataset → scales linearly |
	| Where verdicts land | `dashboard/data/status.json` in this repo | `validation/results.json` in the dataset, via PR |
	| Trigger | GitHub Actions `workflow_dispatch` | Gradio UI (spike) → HF Hub webhook (next) |

	The Space is scoped to the internal pilot: the `HF_TOKEN` that opens the
	verdict PR is the Space's own secret, not the requester's. A
	customer-facing end state would either (a) deploy one Space per partner
	under the partner's org, or (b) keep a single multi-tenant Space and have
	customers pass their own token explicitly.

	---

	## What's here

	| File | Purpose |
	|---|---|
	| `Dockerfile` | Docker SDK image: pip-installs the validator runtime, clones `NVIDIA/simready-foundation` for `SIMREADY_FOUNDATIONS_PATH`, bakes in the in-repo `tools/validation/` skill |
	| `requirements.txt` | Python deps. Pinned to match `tools/runner/install-simready-sdk.sh` so verdicts are byte-for-byte reproducible across environments |
	| `app.py` | Gradio UI. Single form: dataset name + profile + version + open-PR checkbox |
	| `runner.py` | Orchestration. `run(dataset, profile, version, open_pr) → RunResult`; also the entry point a future webhook handler will call |

	The validator engine itself is unchanged — the Space invokes the same
	`tools/validation/plugins/simready-report/skills/simready-report/validate.py`
	that Windows users run locally and that the DGXC runner runs in CI.
	That's the whole point of phase 3: the verdict logic is portable; only
	the trigger surface changes.
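To make the `run(dataset, profile, version, open_pr) → RunResult` contract concrete, here is a minimal sketch of how such an orchestrator could be laid out. It is not the code in `runner.py`: the `RunResult` fields and the three keyword-only callables (`download`, `validate`, `publish`) are hypothetical, introduced only so that the download, validate, and optional-PR steps are visible and can be swapped for fakes.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    """Hypothetical result shape; the real RunResult fields may differ."""
    dataset: str
    profile: str
    version: str
    passed: bool
    log: str
    pr_url: Optional[str] = None


def run(
    dataset: str,
    profile: str,
    version: str,
    open_pr: bool,
    *,
    download: Callable[[str], str],
    validate: Callable[[str, str, str], Tuple[bool, str]],
    publish: Callable[[str, str], str],
) -> RunResult:
    """Download the dataset, validate it, and optionally open a verdict PR.

    The keyword-only callables are illustrative injection points and are
    not part of the signature described in this README.
    """
    local_dir = download(dataset)  # e.g. a wrapper around snapshot_download
    passed, log = validate(local_dir, profile, version)  # e.g. validate.py in a subprocess
    pr_url = publish(dataset, local_dir) if open_pr else None
    return RunResult(dataset, profile, version, passed, log, pr_url)
```

Keeping the three steps behind plain callables means that a Gradio click and a future webhook handler could share one code path, and that the orchestration can be exercised without touching the Hub.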

	---

	## Hardware tier choice

	The validator's heavy work is USD parsing plus composition-arc traversal,
	which is CPU-bound. **A GPU is required only if a profile re-execs under
	Kit (Isaac Sim)**, which the Space explicitly disables by passing
	`--no-use-kit`. The validator's P2 patch drops the
	`physxschema_unavailable` / `omnipbr_unresolved` issues that the
	no-Kit path produces, so PhysX-bearing profiles still report a clean
	verdict for everything that can be checked without Kit.
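The filtering idea behind the P2 patch can be illustrated with a few lines of Python. This is a sketch of the concept only, not the patch itself: the two issue codes come from the paragraph above, while the `{"code": ..., "message": ...}` issue shape and the function name are assumptions.

```python
from typing import Dict, Iterable, List

# Issue codes this README names as artifacts of the no-Kit path.
ENV_BLOCKED_CODES = frozenset({"physxschema_unavailable", "omnipbr_unresolved"})


def drop_env_blocked(issues: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the issues whose code is not an environment-blocked artifact.

    The dict-based issue shape is hypothetical; the real validator may use
    a different structure.
    """
    return [issue for issue in issues if issue.get("code") not in ENV_BLOCKED_CODES]
```

Issues with any other code pass through unchanged, so a genuine failure is never hidden by the filter.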

	| Tier | $/hr | Assessment |
	|---|---|---|
	| `cpu-basic` (2 vCPU, 16 GB) | $0.03 | Marginal: small datasets are OK, but bundles of 50+ assets will time out |
	| `cpu-upgrade` (8 vCPU, 32 GB) | $0.05 | Recommended for the spike. Comfortable headroom; the validator's parallel worker pool actually scales here |
	| `t4-small` (1 × T4, 4 vCPU, 15 GB) | $0.40 | Only needed once we add Kit; overkill for `--no-use-kit` |
	| `a10g-small` (1 × A10G, 4 vCPU, 15 GB) | $1.05 | Future state: enables Kit-rooted PhysX/MDL rules (currently filtered out by the P2 patch) |

	Set the tier in the Space's Settings → Hardware page (or in
	`README.md` frontmatter when the Space repo is created — see Deploy).
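For a rough sense of what a run costs on each tier, the hourly rates in the table above can be turned into a per-run figure. The sketch below assumes cost is simply proportional to the time the Space is awake; the 30-minute duration in the usage note is an illustrative number, not a measurement.

```python
# Hourly rates copied from the tier table in this README.
TIER_RATES_USD_PER_HOUR = {
    "cpu-basic": 0.03,
    "cpu-upgrade": 0.05,
    "t4-small": 0.40,
    "a10g-small": 1.05,
}


def run_cost_usd(tier: str, minutes: float) -> float:
    """Estimate the cost of keeping the Space awake for `minutes` on `tier`."""
    return TIER_RATES_USD_PER_HOUR[tier] * minutes / 60.0
```

Under these assumptions a 30-minute run costs about $0.025 on `cpu-upgrade` and about $0.525 on `a10g-small`, which is why the GPU tiers are only worth it once Kit is in the picture.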

	---

	## Deploy

	This procedure is also captured as a Claude Code skill at
	[`skills/deploy-hf-space/SKILL.md`](./skills/deploy-hf-space/SKILL.md).
	Future operators can run `/deploy-hf-space [<slug>]` instead of
	following this README by hand. The steps below are the human-readable
	mirror of that skill.

	The Space is currently live at
	[`nvidia/simready-validator`](https://huggingface.co/spaces/nvidia/simready-validator).
	To re-stand it up from scratch:

	### 1. Create the Space `[BROWSER]`

	1. Sign in at https://huggingface.co with an account that has write
	access to wherever the Space will live (an NVIDIA org for the
	internal pilot).
	2. New → Space.
	3. Name: `simready-validator` (or any name).
	4. SDK: Docker.
	5. Hardware: CPU upgrade (~$0.05/hr) for the spike.
	6. Visibility: Private for the duration of the internal pilot.
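The browser steps above could also be scripted. The sketch below assumes `api` is a `huggingface_hub.HfApi` instance (for example `HfApi()` after `hf auth login`) and uses its `create_repo` method; the wrapper function name and its defaults are hypothetical and simply mirror the choices listed above.

```python
from typing import Any


def create_validator_space(api: Any, repo_id: str, hardware: str = "cpu-upgrade") -> Any:
    """Create a private Docker Space on the requested hardware tier.

    `api` is expected to be a huggingface_hub.HfApi instance. It is passed
    in rather than constructed here so the function can be exercised with
    a stand-in object.
    """
    return api.create_repo(
        repo_id=repo_id,         # e.g. "<org>/simready-validator"
        repo_type="space",
        space_sdk="docker",      # step 4: SDK
        space_hardware=hardware, # step 5: hardware tier
        private=True,            # step 6: visibility
        exist_ok=True,           # do not fail if the Space already exists
    )
```

Note that with `exist_ok=True` an existing Space is left as it is, so its hardware tier would still need to be changed on the Settings page.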

	### 2. Set the HF_TOKEN secret `[BROWSER]`

	The Space needs a write-scoped token to open the verdict PR on customer
	datasets.

	1. https://huggingface.co/settings/tokens → New token → Write scope.
	2. Space → Settings → Variables and secrets → New secret.
	3. Name: `HF_TOKEN`. Value: paste the token.

	The token is not exposed in the build log. `runner.py` reads it
	via `os.environ["HF_TOKEN"]` (with `HUGGING_FACE_HUB_TOKEN` as a
	fallback, for compatibility with the HF SDK's standard env name).
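A minimal sketch of that lookup order follows. The function name, the optional `env` parameter, and the error message are assumptions made for illustration; `runner.py` may structure this differently.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return HF_TOKEN, falling back to HUGGING_FACE_HUB_TOKEN.

    `env` defaults to os.environ; accepting a mapping keeps the lookup
    easy to check without mutating the process environment.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN") or env.get("HUGGING_FACE_HUB_TOKEN")
    if not token:
        raise RuntimeError("Set HF_TOKEN (or HUGGING_FACE_HUB_TOKEN) as a Space secret.")
    return token
```

Failing early with a clear message is a design choice of this sketch: a missing secret then surfaces at startup rather than when the verdict PR is attempted.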

	### 3. Push the code `[LOCAL]`

	The Space is a git repo of its own. From this checkout:

	```bash
	# Replace <space-name> with the Space you created (e.g. nvidia/simready-validator)
	hf auth login # one-time
	git clone https://huggingface.co/spaces/<space-name> /tmp/space
	cp -r tools/hf_space/* /tmp/space/
	# Important: the Dockerfile COPYs tools/validation/ from the repo root,
	# so we have to vendor that subtree into the Space repo too.
	mkdir -p /tmp/space/tools
	cp -r tools/validation /tmp/space/tools/
	cd /tmp/space
	git add .
	git commit -m "Initial Space scaffold from simready-oem-library-pm@main"
	git push
	```

	The Space will start building automatically. First build takes ~5 min
	(usd-core + omniverse-asset-validator wheels + the foundation clone).
	Subsequent builds reuse Docker layer cache and finish in ~1 min if only
	`app.py` or `runner.py` changed.

	### 4. Smoke-test `[BROWSER]`

	Open the Space's URL. Enter a known-good dataset (the foundation
	clone's bundled examples work well; start with something small, such as
	a single-asset dataset), pick the `Robot-Body-Runnable` profile, leave
	**Open PR** unchecked, and click **Validate**. Watch the log stream.

	Expected output ends with:

	```
	PASS: 1/1 assets passed
	```

	…and a downloadable `index.html` report.
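If the smoke test is ever automated, the final log line can be checked mechanically. The sketch below relies only on the `PASS: <n>/<m> assets passed` format shown above; the function name is hypothetical, and anything that does not match that line is treated as "not a clean pass" rather than guessed at.

```python
import re

# Matches the summary line shown in this README, e.g. "PASS: 1/1 assets passed".
_PASS_LINE = re.compile(r"PASS: (\d+)/(\d+) assets passed")


def all_assets_passed(log: str) -> bool:
    """Return True if the last non-empty log line reports n/n assets passed."""
    lines = [line.strip() for line in log.splitlines() if line.strip()]
    if not lines:
        return False
    match = _PASS_LINE.fullmatch(lines[-1])
    return bool(match) and match.group(1) == match.group(2)
```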

	If the verdict makes sense, re-run with Open PR checked against a
	dataset you have write access to. A new PR appears on the dataset under
	`https://huggingface.co/datasets/<dataset>/discussions` with the
	verdict body + the `validation/` subtree.
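One way to open such a PR from Python is `HfApi.upload_folder` with `create_pr=True`, which uploads a folder and opens a pull request in a single call. The sketch below is an assumption about how that could look, not a copy of `runner.py`: the function name and commit message are hypothetical, and `api` is expected to be a `huggingface_hub.HfApi` instance authenticated with the Space's token.

```python
from typing import Any, Optional


def open_verdict_pr(api: Any, dataset: str, validation_dir: str, summary: str) -> Optional[str]:
    """Upload the local validation/ subtree to the dataset as a PR.

    Returns the PR URL reported by the Hub, or None if the returned
    commit object does not carry one.
    """
    commit = api.upload_folder(
        repo_id=dataset,
        repo_type="dataset",
        folder_path=validation_dir,      # local directory holding results.json etc.
        path_in_repo="validation",       # lands as validation/ in the dataset
        commit_message="SimReady validation verdict",  # hypothetical wording
        commit_description=summary,      # becomes the PR description (verdict body)
        create_pr=True,                  # open a PR instead of committing to main
    )
    return getattr(commit, "pr_url", None)
```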

	---

	## What this spike intentionally does NOT do

	To stay scoped to "prove the engine works on HF":

	- No HF Hub webhook. Triggering is Gradio-only. Phase 3.2 wires
	`https://<space>/api/run` to `dataset.commit.push` events.
	- No status callback into this repo. The DGXC dashboard's
	`hf-watch.yml` already polls dataset commits — once the Space lands
	verdicts as `validation/results.json` on the dataset, the existing
	watcher picks them up. No new integration needed.
	- No Kit re-exec. Profiles requiring PhysX/MDL rules currently
	report partial verdicts (the P2 patch drops env-blocked rules). A
	future iteration with an `a10g-small` tier + Isaac Sim wheels in the
	Dockerfile unlocks these.
	- No multi-tenant token isolation. The Space's own `HF_TOKEN` opens
	every PR. Fine for internal pilot; needs rework before exposing the
	Space outside NVIDIA.

	These are tracked in [PRD §7 — roadmap](../../PRD.md).
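As a preview of the phase 3.2 webhook item, the filtering half of a handler might look like the sketch below. It assumes the payload shape used by HF Hub webhooks (an `event` object with `action` and `scope`, and a `repo` object with `type` and `name`); mapping the README's `dataset.commit.push` trigger onto `scope == "repo.content"` with `action == "update"` is an interpretation, and the function name is hypothetical.

```python
from typing import Any, Mapping, Optional


def dataset_to_validate(payload: Mapping[str, Any]) -> Optional[str]:
    """Return the dataset name if the payload is a dataset content update.

    Any other event (model repos, discussions, config changes) returns
    None so the caller can acknowledge the webhook without running.
    """
    event = payload.get("event", {})
    repo = payload.get("repo", {})
    if repo.get("type") != "dataset":
        return None
    if event.get("scope") != "repo.content" or event.get("action") != "update":
        return None
    return repo.get("name")
```

A handler built on this would pass the returned name to `run(...)`; verifying the webhook secret is a separate step not shown here.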

	---

	## Cutover criteria (when do we retire the DGXC runner?)

	Stop using `hf-validate.yml` once all three are true:

	1. The Space verdict matches the DGXC verdict on the same dataset for
	the top three onboarded clients (imagineio kitchens, plus two TBD).
	Byte-for-byte equality on `results.json` is the bar.
	2. The HF Hub webhook handler (phase 3.2) is in place and a customer
	can open a dataset PR and see a verdict comment without anyone at
	NVIDIA pressing a button.
	3. The PRD §7 roadmap items 3.3 (auto-merge on pass) and 3.4 (block-on-fail
	gate at the GitHub side) are wired through, so the GitHub coordinator
	only deals with state — never validation.

	Until then both paths coexist; the DGXC runner is the source of truth.
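The byte-for-byte bar in criterion 1 can be checked by hashing the two `results.json` files. This is a minimal sketch; the function names are hypothetical, and it assumes both files have already been fetched to local paths.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verdicts_match(space_results: Path, dgxc_results: Path) -> bool:
    """True only if the two results.json files are byte-identical."""
    return sha256_of(space_results) == sha256_of(dgxc_results)
```

Comparing raw bytes rather than parsed JSON is deliberate: differences in key order or whitespace count as a mismatch, which is what a byte-for-byte bar implies.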