
SimReady Validator: HuggingFace Space (Phase-3 Spike)

This directory scaffolds an HF Space that runs the bundled simready-report validator against any HF dataset, then opens a verdict PR back on the dataset.

It is the phase-3 prove-it step described in PRD §3: move validation execution to where the dataset already lives, so we stop paying to copy 20 GiB of customer assets onto NVIDIA-controlled infrastructure on every run.

| | DGXC runner today | HF Space (this dir) |
| --- | --- | --- |
| Asset transfer | 10–20 GiB per submission onto a 49 GiB PVC | None: huggingface_hub.snapshot_download reads from HF storage directly |
| Cost model | NVIDIA pays for the runner | Customer pays for their Space's hardware hours |
| Concurrency | Single runner, jobs serialized | One Space per dataset → scales linearly |
| Where verdicts land | dashboard/data/status.json in this repo | validation/results.json in the dataset, via PR |
| Trigger | GitHub Actions workflow_dispatch | Gradio UI (spike) → HF Hub webhook (next) |
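As a rough illustration of the "Asset transfer" row, the sketch below shows how a runner might pull a dataset snapshot with huggingface_hub.snapshot_download. The function name fetch_dataset and the injectable download parameter are assumptions made for this example; they are not taken from runner.py.

```python
from pathlib import Path
from typing import Callable, Optional


def fetch_dataset(
    dataset: str,
    token: Optional[str] = None,
    download: Optional[Callable[..., str]] = None,
) -> Path:
    """Download a dataset snapshot from the Hub and return its local path.

    `download` is a hypothetical injection point so the function can be
    exercised without network access; by default it is snapshot_download.
    """
    if download is None:
        # Imported lazily so the sketch can be read and tested without the package.
        from huggingface_hub import snapshot_download as download
    local_dir = download(
        repo_id=dataset,
        repo_type="dataset",  # the library default is "model", so set it explicitly
        token=token,
    )
    return Path(local_dir)
```

Inside the Space, a call such as fetch_dataset("<org>/<dataset>", token=...) would land the files on the Space's own disk, which is the point of the comparison above: nothing is staged on the DGXC PVC.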

The Space is internal pilot scope: the HF_TOKEN that opens the verdict PR is the Space's own secret, not the requester's. A customer-facing end-state would either (a) deploy one Space per partner under their org, or (b) keep a single multi-tenant Space and have customers pass their own token explicitly.


What's here

| File | Purpose |
| --- | --- |
| Dockerfile | Docker SDK image: pip-installs the validator runtime, clones NVIDIA/simready-foundation for SIMREADY_FOUNDATIONS_PATH, bakes in the in-repo tools/validation/ skill |
| requirements.txt | Python deps. Pinned to match tools/runner/install-simready-sdk.sh so verdicts are byte-for-byte reproducible across environments |
| app.py | Gradio UI. Single form: dataset name + profile + version + open-PR checkbox |
| runner.py | Orchestration. run(dataset, profile, version, open_pr) → RunResult. Also the entry point a future webhook handler will call |
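To make the run(dataset, profile, version, open_pr) → RunResult contract concrete, here is a minimal sketch of what such an entry point could look like. The RunResult fields and the keyword-only validate and publish callables are hypothetical; the real shapes live in runner.py and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    """Hypothetical result shape; the real fields live in runner.py."""
    dataset: str
    profile: str
    passed: int
    total: int
    pr_url: Optional[str] = None

    @property
    def ok(self) -> bool:
        # A run is clean only if at least one asset was checked and all passed.
        return self.total > 0 and self.passed == self.total


def run(
    dataset: str,
    profile: str,
    version: str,
    open_pr: bool,
    *,
    validate: Callable[[str, str, str], Tuple[int, int]],  # assumed: returns (passed, total)
    publish: Callable[[str, int, int], str],  # assumed: opens the PR, returns its URL
) -> RunResult:
    """Validate a dataset and optionally publish the verdict as a PR."""
    passed, total = validate(dataset, profile, version)
    pr_url = publish(dataset, passed, total) if open_pr else None
    return RunResult(dataset, profile, passed, total, pr_url)
```

Keeping the trigger out of the function is what would let the Gradio form and a later webhook handler share one code path.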

The validator engine itself is unchanged: the Space invokes the same tools/validation/plugins/simready-report/skills/simready-report/validate.py that Windows users run locally and that the DGXC runner runs in CI. That's the whole point of phase 3: the verdict logic is portable; only the trigger surface changes.


Hardware tier choice

The validator's heavy work is USD parsing + composition-arc traversal, which is CPU-bound. GPU is only required if a profile re-execs under Kit (Isaac Sim), which the Space's --no-use-kit flag explicitly disables. The validator's P2 patch drops the physxschema_unavailable / omnipbr_unresolved issues that the no-Kit path produces, so PhysX-bearing profiles still report a clean verdict for everything that can be checked without Kit.
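The effect of the P2 patch can be pictured as a filter over the issue list. The sketch below is illustrative only: the two issue codes come from the paragraph above, but the dict shape with a "code" key and the function name drop_env_blocked are assumptions, not the validator's actual data model.

```python
from typing import Dict, Iterable, List

# Issue codes that only appear because Kit is unavailable (from the text above).
ENV_BLOCKED = frozenset({"physxschema_unavailable", "omnipbr_unresolved"})


def drop_env_blocked(issues: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the issues that remain once env-blocked codes are removed.

    Assumes each issue is a dict with a "code" key (hypothetical shape).
    """
    return [issue for issue in issues if issue.get("code") not in ENV_BLOCKED]
```

With this kind of filter, an asset whose only findings are the two Kit-related codes ends up with an empty issue list and therefore a clean verdict.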

| Tier | $/hr | Verdict |
| --- | --- | --- |
| cpu-basic (2 vCPU, 16 GB) | $0.03 | Marginal: small datasets OK, 50+ asset bundles will time out |
| cpu-upgrade (8 vCPU, 32 GB) | $0.05 | Recommended for the spike. Comfortable headroom; the validator's parallel worker pool actually scales here |
| t4-small (1 × T4, 4 vCPU, 15 GB) | $0.40 | Only needed once we add Kit; overkill for --no-use-kit |
| a10g-small (1 × A10G, 4 vCPU, 15 GB) | $1.05 | Future state: enables Kit-rooted PhysX/MDL rules (currently filtered out by the P2 patch) |
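For a quick feel of what a run costs on each tier, the arithmetic is just the hourly rate times the run duration. The sketch below copies the rates from the table above (check current Hugging Face pricing before relying on them); the run duration is an input you supply, not a measured figure.

```python
# Hourly rates in USD, copied from the tier table above.
HOURLY_RATE_USD = {
    "cpu-basic": 0.03,
    "cpu-upgrade": 0.05,
    "t4-small": 0.40,
    "a10g-small": 1.05,
}


def run_cost(tier: str, minutes: float) -> float:
    """Estimate the hardware cost in USD of one run lasting `minutes`."""
    return HOURLY_RATE_USD[tier] * minutes / 60
```

For example, run_cost("cpu-upgrade", 30) prices a hypothetical half-hour validation on the recommended tier.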

Set the tier in the Space's Settings → Hardware page (or in README.md frontmatter when the Space repo is created; see Deploy).


Deploy

This dance is captured as a Claude Code skill at skills/deploy-hf-space/SKILL.md. Future operators can run /deploy-hf-space [<slug>] instead of following this README by hand. The README below is the human-readable mirror.

The Space is currently live at nvidia/simready-validator. To re-stand it up from scratch:

1. Create the Space [BROWSER]

  1. Sign in at https://huggingface.co with an account that has write access to wherever the Space will live (an NVIDIA org for the internal pilot).
  2. New → Space.
  3. Name: simready-validator (or any name).
  4. SDK: Docker.
  5. Hardware: CPU upgrade (~$0.05/hr) for the spike.
  6. Visibility: Private while internal-pilot.

2. Set the HF_TOKEN secret [BROWSER]

The Space needs a write-scoped token to open the verdict PR on customer datasets.

  1. https://huggingface.co/settings/tokens → New token → Write scope.
  2. Space → Settings → Variables and secrets → New secret.
  3. Name: HF_TOKEN. Value: paste the token.

Tokens are not exposed in the build log. The runner.py code reads it via os.environ["HF_TOKEN"] (with HUGGING_FACE_HUB_TOKEN as a fallback for compatibility with the HF SDK's standard env name).
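The lookup order described above can be sketched as follows. This is an illustration of the fallback, not the code in runner.py; the function name resolve_token and the optional env parameter are assumptions made so the example is easy to test.

```python
import os
from typing import Mapping, Optional


def resolve_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Hub token, preferring HF_TOKEN over HUGGING_FACE_HUB_TOKEN."""
    env = os.environ if env is None else env
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        value = env.get(name)
        if value:  # skip variables that are unset or empty
            return value
    raise RuntimeError("Set HF_TOKEN (or HUGGING_FACE_HUB_TOKEN) as a Space secret")
```

Failing loudly when neither variable is set keeps a misconfigured Space from running a full validation and only discovering the missing token at PR time.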

3. Push the code [LOCAL]

The Space is a git repo of its own. From this checkout:

```shell
# Replace <space-name> with the Space you created (e.g. nvidia/simready-validator)
hf auth login   # one-time
git clone https://huggingface.co/spaces/<space-name> /tmp/space
cp -r tools/hf_space/* /tmp/space/
# Important: the Dockerfile COPYs tools/validation/ from the repo root,
# so we have to vendor that subtree into the Space repo too.
mkdir -p /tmp/space/tools
cp -r tools/validation /tmp/space/tools/
cd /tmp/space
git add .
git commit -m "Initial Space scaffold from simready-oem-library-pm@main"
git push
```

The Space will start building automatically. First build takes ~5 min (usd-core + omniverse-asset-validator wheels + the foundation clone). Subsequent builds reuse Docker layer cache and finish in ~1 min if only app.py or runner.py changed.

4. Smoke-test [BROWSER]

Open the Space's URL. Enter a known-good dataset (the foundation clone's bundled examples work well; point at something small like a single-asset dataset first), pick Robot-Body-Runnable, leave Open PR unchecked, click Validate. Watch the log stream.

Expected output ends with:

  PASS: 1/1 assets passed

…and a downloadable index.html report.
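If you want to script the smoke test rather than eyeball the log, the summary line can be parsed with a small regular expression. This sketch only assumes the "N/M assets passed" wording shown above; the function name parse_summary is hypothetical, and the exact wording of a failing summary is not documented here, so the pattern deliberately ignores the leading status word.

```python
import re
from typing import Optional, Tuple

# Matches the counts in a line such as "PASS: 1/1 assets passed".
_SUMMARY = re.compile(r"(\d+)/(\d+) assets passed")


def parse_summary(log: str) -> Optional[Tuple[int, int]]:
    """Return (passed, total) from the last summary line in the log, or None."""
    matches = _SUMMARY.findall(log)
    if not matches:
        return None
    passed, total = matches[-1]
    return int(passed), int(total)
```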

If the verdict makes sense, re-run with Open PR checked against a dataset you have write access to. A new PR appears on the dataset under https://huggingface.co/datasets/<dataset>/discussions with the verdict body + the validation/ subtree.
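One way to open such a PR is HfApi.upload_folder with create_pr=True, which uploads a local folder and returns commit info that includes the PR URL. The sketch below is an assumption about how this could be done, not a copy of runner.py; the function name open_verdict_pr, the commit message, and the local results_dir layout are all hypothetical.

```python
from typing import Any, Optional


def open_verdict_pr(api: Any, dataset: str, results_dir: str, summary: str) -> Optional[str]:
    """Upload the local validation output as a PR on the dataset.

    `api` is expected to be a huggingface_hub.HfApi instance created with a
    write-scoped token. Returns the URL of the PR that was opened.
    """
    info = api.upload_folder(
        repo_id=dataset,
        repo_type="dataset",
        folder_path=results_dir,       # local folder holding results.json etc.
        path_in_repo="validation",     # lands as validation/ in the dataset
        commit_message="Add SimReady validation verdict",  # hypothetical wording
        commit_description=summary,    # becomes the PR body
        create_pr=True,                # open a PR instead of pushing to main
    )
    return info.pr_url
```

In the Space this would be called with something like HfApi(token=...), where the token is the HF_TOKEN secret set during deploy.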


What this spike intentionally does NOT do

To stay scoped to "prove the engine works on HF":

  • No HF Hub webhook. Triggering is Gradio-only. Phase 3.2 wires https://<space>/api/run to dataset.commit.push events.
  • No status callback into this repo. The DGXC dashboard's hf-watch.yml already polls dataset commits, so once the Space lands verdicts as validation/results.json on the dataset, the existing watcher picks them up. No new integration needed.
  • No Kit re-exec. Profiles requiring PhysX/MDL rules currently report partial verdicts (the P2 patch drops env-blocked rules). A future iteration with an a10g-small tier + Isaac Sim wheels in the Dockerfile unlocks these.
  • No multi-tenant token isolation. The Space's own HF_TOKEN opens every PR. Fine for internal pilot; needs rework before exposing the Space outside NVIDIA.

These are tracked in the PRD §7 roadmap.


Cutover criteria (when do we retire the DGXC runner?)

Stop using hf-validate.yml once all three are true:

  1. The Space verdict matches the DGXC verdict on the same dataset for the top three onboarded clients (imagineio kitchens, plus two TBD). Byte-for-byte equality on results.json is the bar.
  2. The HF Hub webhook handler (phase 3.2) is in place and a customer can open a dataset PR and see a verdict comment without anyone at NVIDIA pressing a button.
  3. The PRD §7 roadmap items 3.3 (auto-merge on pass) and 3.4 (block-on-fail gate at the GitHub side) are wired through, so the GitHub coordinator only deals with state, never validation.
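
The byte-for-byte bar in criterion 1 can be checked by hashing both results.json files and comparing digests. This is a minimal sketch; the function names and the idea of having both files on local disk are assumptions for illustration.

```python
import hashlib


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verdicts_match(space_results: str, dgxc_results: str) -> bool:
    """True only if the two results.json files are byte-for-byte identical."""
    return sha256_of(space_results) == sha256_of(dgxc_results)
```

Note that a byte-level comparison is stricter than a semantic one: differences in key order, whitespace, or embedded timestamps would count as a mismatch even if the verdicts agree.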

Until then both paths coexist; the DGXC runner is the source of truth.