# SimReady Validator: HuggingFace Space (Phase-3 Spike)
This directory scaffolds an HF Space that runs the bundled
[`simready-report`](../validation/plugins/simready-report/) validator
against any HF dataset, then opens a verdict PR back on the dataset.
It is the **phase-3 prove-it step** described in [PRD §3](../../PRD.md):
move validation execution to where the dataset already lives, so we stop
paying to copy up to 20 GiB of customer assets onto NVIDIA-controlled
infrastructure on every run.
| | DGXC runner today | HF Space (this dir) |
|---|---|---|
| Asset transfer | 10–20 GiB per submission onto a 49 GiB PVC | None: `huggingface_hub.snapshot_download` reads from HF storage directly |
| Cost model | NVIDIA pays for the runner | Customer pays for their Space's hardware hours |
| Concurrency | Single runner, jobs serialized | One Space per dataset → scales linearly |
| Where verdicts land | `dashboard/data/status.json` in this repo | `validation/results.json` in the dataset, via PR |
| Trigger | GitHub Actions `workflow_dispatch` | Gradio UI (spike) → HF Hub webhook (next) |
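As a rough illustration of the "Asset transfer" row, the sketch below pulls a dataset snapshot with `huggingface_hub.snapshot_download`. The helper names `snapshot_kwargs` and `fetch_dataset` and the `/tmp/dataset` default are assumptions made for this example; they are not taken from `runner.py`.

```python
from pathlib import Path
from typing import Optional


def snapshot_kwargs(dataset: str, token: Optional[str] = None,
                    local_dir: str = "/tmp/dataset") -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    return {
        "repo_id": dataset,
        "repo_type": "dataset",  # the library defaults to "model" repos
        "local_dir": local_dir,  # hypothetical location on the Space's disk
        "token": token,
    }


def fetch_dataset(dataset: str, token: Optional[str] = None) -> Path:
    """Download the dataset files and return the local directory."""
    # Imported lazily so snapshot_kwargs stays usable without the package.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(**snapshot_kwargs(dataset, token)))
```

Keeping the argument construction separate from the network call makes the download parameters easy to unit-test without touching the Hub.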
The Space is **internal pilot scope**: the HF_TOKEN that opens the verdict
PR is the Space's own secret, not the requester's. A customer-facing
end-state would either (a) deploy one Space per partner under their org,
or (b) keep a single multi-tenant Space and have customers pass their own
token explicitly.
---
## What's here
| File | Purpose |
|---|---|
| `Dockerfile` | Docker SDK image: pip-installs the validator runtime, clones `NVIDIA/simready-foundation` for `SIMREADY_FOUNDATIONS_PATH`, bakes in the in-repo `tools/validation/` skill |
| `requirements.txt` | Python deps. Pinned to match `tools/runner/install-simready-sdk.sh` so verdicts are byte-for-byte reproducible across environments |
| `app.py` | Gradio UI. Single form: dataset name + profile + version + open-PR checkbox |
| `runner.py` | Orchestration. `run(dataset, profile, version, open_pr) → RunResult`; also the entry point a future webhook handler will call |
The validator engine itself is **unchanged**: the Space invokes the same
`tools/validation/plugins/simready-report/skills/simready-report/validate.py`
that Windows users run locally and that the DGXC runner runs in CI.
That's the whole point of phase 3: the verdict logic is portable; only
the trigger surface changes.
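One way to picture "only the trigger surface changes" is a `run` function whose signature matches the one listed for `runner.py`, with the validator and the PR step injected. This is a sketch under assumptions: the `RunResult` fields, the `make_run` factory, and the `validate` / `publish` callables are hypothetical and may not match the real module.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    """Hypothetical result shape; the real fields in runner.py may differ."""
    dataset: str
    profile: str
    version: str
    passed: int
    total: int
    pr_url: Optional[str] = None

    @property
    def ok(self) -> bool:
        # A run passes only if at least one asset was checked and all passed.
        return self.total > 0 and self.passed == self.total


def make_run(validate: Callable[[str, str, str], Tuple[int, int]],
             publish: Callable[[str], str]) -> Callable[[str, str, str, bool], RunResult]:
    """Bind a validator and a PR opener; return run(dataset, profile, version, open_pr)."""
    def run(dataset: str, profile: str, version: str, open_pr: bool) -> RunResult:
        passed, total = validate(dataset, profile, version)
        result = RunResult(dataset, profile, version, passed, total)
        if open_pr:
            result.pr_url = publish(dataset)
        return result
    return run
```

With this shape, the Gradio form and a later webhook handler would both call the same `run`, differing only in how they collect its four arguments.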
---
## Hardware tier choice
The validator's heavy work is USD parsing + composition-arc traversal,
which is CPU-bound. **GPU is only required if a profile re-execs under
Kit (Isaac Sim)**, which the Space's `--no-use-kit` flag explicitly
disables. The validator's P2 patch drops the
`physxschema_unavailable` / `omnipbr_unresolved` issues that the
no-Kit path produces, so PhysX-bearing profiles still report a clean
verdict for everything that can be checked without Kit.
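The filtering described above could look roughly like the following. The two issue codes come from this README; the issue record shape (a dict with a `code` key) and the function name `drop_env_blocked` are assumptions for illustration, not the actual P2 patch.

```python
from typing import Dict, Iterable, List

# Issue codes that the no-Kit path produces, per the paragraph above.
ENV_BLOCKED_CODES = frozenset({"physxschema_unavailable", "omnipbr_unresolved"})


def drop_env_blocked(issues: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return issues whose code is not caused by the missing Kit runtime."""
    return [issue for issue in issues if issue.get("code") not in ENV_BLOCKED_CODES]
```

Issues with any other code pass through untouched, so a genuine failure is still reported.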
| Tier | $/hr | Verdict |
|---|---|---|
| `cpu-basic` (2 vCPU, 16 GB) | $0.03 | Marginal: small datasets OK, 50+ asset bundles will time out |
| **`cpu-upgrade`** (8 vCPU, 32 GB) | **$0.05** | **Recommended for the spike.** Comfortable headroom; the validator's parallel worker pool actually scales here |
| `t4-small` (1 × T4, 4 vCPU, 15 GB) | $0.40 | Only needed once we add Kit; overkill for `--no-use-kit` |
| `a10g-small` (1 × A10G, 4 vCPU, 15 GB) | $1.05 | Future state: enables Kit-rooted PhysX/MDL rules (currently filtered out by the P2 patch) |
Set the tier in the Space's **Settings → Hardware** page (or in
`README.md` frontmatter when the Space repo is created; see Deploy).
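For a quick feel of what a run costs on each tier, the hourly rates in the table can be turned into a per-run estimate. The rates are copied from the table above; the run durations are placeholders, and the sketch ignores billing granularity and idle time.

```python
# Hourly rates copied from the tier table above (USD per hour).
HOURLY_RATE_USD = {
    "cpu-basic": 0.03,
    "cpu-upgrade": 0.05,
    "t4-small": 0.40,
    "a10g-small": 1.05,
}


def run_cost_usd(tier: str, minutes: float) -> float:
    """Estimate the hardware cost of one validation run of the given length."""
    return HOURLY_RATE_USD[tier] * minutes / 60.0


if __name__ == "__main__":
    # 30 minutes is an assumed duration, not a measured one.
    for tier in HOURLY_RATE_USD:
        print(f"{tier}: ${run_cost_usd(tier, 30):.3f} per 30-minute run")
```

Under that assumption, a 30-minute run on `cpu-upgrade` comes to roughly $0.025, against about $0.20 on `t4-small`.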
---
## Deploy
This dance is captured as a Claude Code skill at
[`skills/deploy-hf-space/SKILL.md`](./skills/deploy-hf-space/SKILL.md).
Future operators can run `/deploy-hf-space [<slug>]` instead of
following this README by hand. The README below is the human-readable
mirror.
The Space is currently live at
[`nvidia/simready-validator`](https://huggingface.co/spaces/nvidia/simready-validator).
To re-stand it up from scratch:
### 1. Create the Space `[BROWSER]`
1. Sign in at https://huggingface.co with an account that has write
access to wherever the Space will live (an NVIDIA org for the
internal pilot).
2. New → Space.
3. Name: `simready-validator` (or any name).
4. SDK: **Docker**.
5. Hardware: **CPU upgrade** (~$0.05/hr) for the spike.
6. Visibility: **Private** while internal-pilot.
### 2. Set the HF_TOKEN secret `[BROWSER]`
The Space needs a write-scoped token to open the verdict PR on customer
datasets.
1. https://huggingface.co/settings/tokens → New token → **Write** scope.
2. Space → Settings → Variables and secrets → New secret.
3. Name: `HF_TOKEN`. Value: paste the token.
Tokens are not exposed in the build log. The `runner.py` code reads it
via `os.environ["HF_TOKEN"]` (with `HUGGING_FACE_HUB_TOKEN` as a
fallback for compatibility with the HF SDK's standard env name).
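The lookup order described above can be sketched as follows. The function name `resolve_token` and the error message are assumptions; only the two environment variable names come from this README.

```python
import os
from typing import Mapping


def resolve_token(env: Mapping[str, str] = os.environ) -> str:
    """Return HF_TOKEN, falling back to HUGGING_FACE_HUB_TOKEN."""
    token = env.get("HF_TOKEN") or env.get("HUGGING_FACE_HUB_TOKEN")
    if not token:
        # Fail early with a clear message instead of a bare KeyError later.
        raise RuntimeError("HF_TOKEN is not set; add it as a Space secret")
    return token
```

Passing the environment as a parameter keeps the function testable without mutating `os.environ`.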
### 3. Push the code `[LOCAL]`
The Space is a git repo of its own. From this checkout:
```bash
# Replace <space-name> with the Space you created (e.g. nvidia/simready-validator)
hf auth login # one-time
git clone https://huggingface.co/spaces/<space-name> /tmp/space
cp -r tools/hf_space/* /tmp/space/
# Important: the Dockerfile COPYs tools/validation/ from the repo root,
# so we have to vendor that subtree into the Space repo too.
mkdir -p /tmp/space/tools
cp -r tools/validation /tmp/space/tools/
cd /tmp/space
git add .
git commit -m "Initial Space scaffold from simready-oem-library-pm@main"
git push
```
The Space will start building automatically. First build takes ~5 min
(usd-core + omniverse-asset-validator wheels + the foundation clone).
Subsequent builds reuse Docker layer cache and finish in ~1 min if only
`app.py` or `runner.py` changed.
### 4. Smoke-test `[BROWSER]`
Open the Space's URL. Enter a known-good dataset (the foundation
clone's bundled examples work well; point at something small like a
single-asset dataset first), pick **Robot-Body-Runnable**, leave
**Open PR** unchecked, click **Validate**. Watch the log stream.
Expected output ends with:
```
PASS: 1/1 assets passed
```
…and a downloadable `index.html` report.
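If the smoke test is ever scripted, the final log line can be checked mechanically. The sketch below parses the summary format shown above; only the `PASS` form is documented here, so accepting any uppercase status word is an assumption, as is the function name `parse_summary`.

```python
import re
from typing import Optional, Tuple

# Matches lines such as "PASS: 1/1 assets passed".
SUMMARY_RE = re.compile(
    r"^(?P<status>[A-Z]+): (?P<passed>\d+)/(?P<total>\d+) assets passed$"
)


def parse_summary(log: str) -> Optional[Tuple[str, int, int]]:
    """Return (status, passed, total) from the last non-empty log line, if present."""
    lines = [line.strip() for line in log.splitlines() if line.strip()]
    if not lines:
        return None
    match = SUMMARY_RE.match(lines[-1])
    if match is None:
        return None
    return match.group("status"), int(match.group("passed")), int(match.group("total"))
```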
If the verdict makes sense, re-run with **Open PR** checked against a
dataset you have write access to. A new PR appears on the dataset under
`https://huggingface.co/datasets/<dataset>/discussions` with the
verdict body + the `validation/` subtree.
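A verdict PR of this kind can be opened with `HfApi.upload_folder(..., create_pr=True)` from `huggingface_hub`. The sketch below is an assumption about how that call might be wired, not the code in `runner.py`: the helper names and the commit message are hypothetical, and `pr_url` is read from the returned commit info as provided by recent `huggingface_hub` releases.

```python
from typing import Optional


def pr_upload_kwargs(dataset: str, results_dir: str, token: str, verdict: str) -> dict:
    """Build keyword arguments for HfApi.upload_folder that open a PR."""
    return {
        "repo_id": dataset,
        "repo_type": "dataset",
        "folder_path": results_dir,      # local directory holding results.json etc.
        "path_in_repo": "validation",    # lands as validation/ in the dataset
        "commit_message": "Add SimReady validation verdict",  # hypothetical wording
        "commit_description": verdict,   # intended to carry the verdict body
        "create_pr": True,               # open a PR instead of pushing to main
        "token": token,
    }


def open_verdict_pr(dataset: str, results_dir: str, token: str, verdict: str) -> Optional[str]:
    """Upload the validation subtree as a PR and return the PR URL if available."""
    # Imported lazily so pr_upload_kwargs stays usable without the package.
    from huggingface_hub import HfApi

    info = HfApi().upload_folder(**pr_upload_kwargs(dataset, results_dir, token, verdict))
    return getattr(info, "pr_url", None)
```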
---
## What this spike intentionally does NOT do
To stay scoped to "prove the engine works on HF":
- **No HF Hub webhook.** Triggering is Gradio-only. Phase 3.2 wires
`https://<space>/api/run` to `dataset.commit.push` events.
- **No status callback into this repo.** The DGXC dashboard's
`hf-watch.yml` already polls dataset commits, so once the Space lands
verdicts as `validation/results.json` on the dataset, the existing
watcher picks them up. No new integration needed.
- **No Kit re-exec.** Profiles requiring PhysX/MDL rules currently
report partial verdicts (the P2 patch drops env-blocked rules). A
future iteration with an `a10g-small` tier + Isaac Sim wheels in the
Dockerfile unlocks these.
- **No multi-tenant token isolation.** The Space's own `HF_TOKEN` opens
every PR. Fine for internal pilot; needs rework before exposing the
Space outside NVIDIA.
These are tracked in the [PRD §7 roadmap](../../PRD.md).
---
## Cutover criteria (when do we retire the DGXC runner?)
Stop using `hf-validate.yml` once **all three** are true:
1. The Space verdict matches the DGXC verdict on the same dataset for
the top three onboarded clients (imagineio kitchens, plus two TBD).
Byte-for-byte equality on `results.json` is the bar.
2. The HF Hub webhook handler (phase 3.2) is in place and a customer
can open a dataset PR and see a verdict comment without anyone at
NVIDIA pressing a button.
3. The PRD §7 roadmap items 3.3 (auto-merge on pass) and 3.4 (block-on-fail
gate at the GitHub side) are wired through, so the GitHub coordinator
only deals with state, never validation.
Until then both paths coexist; the DGXC runner is the source of truth.
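The byte-for-byte bar in criterion 1 can be checked by hashing both files. This is a minimal sketch; the function names and the idea of having both `results.json` files available locally are assumptions.

```python
import hashlib
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def sha256_of(path: PathLike) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def verdicts_match(space_results: PathLike, dgxc_results: PathLike) -> bool:
    """True only if the two results.json files are identical byte for byte."""
    return sha256_of(space_results) == sha256_of(dgxc_results)
```

Comparing raw bytes rather than parsed JSON is deliberate: differences in key order or whitespace count as a mismatch, which is what byte-for-byte equality demands.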