loginowskid committed on
Commit
7c81ec6
·
verified ·
1 Parent(s): 6b5095a

Restore HF Space README (YAML frontmatter)

Files changed (1)
  1. README.md +40 -185
README.md CHANGED
@@ -1,185 +1,40 @@
- # SimReady Validator — HuggingFace Space (Phase-3 Spike)
-
- This directory scaffolds an HF Space that runs the bundled
- [`simready-report`](../validation/plugins/simready-report/) validator
- against any HF dataset, then opens a verdict PR back on the dataset.
-
- It is the **phase-3 prove-it step** described in [PRD §3](../../PRD.md):
- move validation execution to where the dataset already lives, so we stop
- paying to copy 20 GiB of customer assets onto NVIDIA-controlled
- infrastructure on every run.
-
- | | DGXC runner today | HF Space (this dir) |
- |---|---|---|
- | Asset transfer | 10–20 GiB per submission onto a 49 GiB PVC | None — `huggingface_hub.snapshot_download` reads from HF storage directly |
- | Cost model | NVIDIA pays for the runner | Customer pays for their Space's hardware hours |
- | Concurrency | Single runner, jobs serialized | One Space per dataset → scales linearly |
- | Where verdicts land | `docs/dashboard/data/status.json` in this repo | `validation/results.json` in the dataset, via PR |
- | Trigger | GitHub Actions `workflow_dispatch` | Gradio UI (spike) → HF Hub webhook (next) |
-
- The Space is **internal pilot scope**: the HF_TOKEN that opens the verdict
- PR is the Space's own secret, not the requester's. A customer-facing
- end-state would either (a) deploy one Space per partner under their org,
- or (b) keep a single multi-tenant Space and have customers pass their own
- token explicitly.
-
- ---
-
- ## What's here
-
- | File | Purpose |
- |---|---|
- | `Dockerfile` | Docker SDK image: pip-installs the validator runtime, clones `NVIDIA/simready-foundation` for `SIMREADY_FOUNDATIONS_PATH`, bakes in the in-repo `tools/validation/` skill |
- | `requirements.txt` | Python deps. Pinned to match `tools/runner/install-simready-sdk.sh` so verdicts are byte-for-byte reproducible across environments |
- | `app.py` | Gradio UI. Single form: dataset name + profile + version + open-PR checkbox |
- | `runner.py` | Orchestration. `run(dataset, profile, version, open_pr) → RunResult` — also the entry point a future webhook handler will call |
-
- The validator engine itself is **unchanged** — the Space invokes the same
- `tools/validation/plugins/simready-report/skills/simready-report/validate.py`
- that Windows users run locally and that the DGXC runner runs in CI.
- That's the whole point of phase 3: the verdict logic is portable; only
- the trigger surface changes.
-
- ---
-
- ## Hardware tier choice
-
- The validator's heavy work is USD parsing + composition-arc traversal,
- which is CPU-bound. **GPU is only required if a profile re-execs under
- Kit (Isaac Sim)**, which the Space's `--no-use-kit` flag explicitly
- disables. The validator's P2 patch drops the
- `physxschema_unavailable` / `omnipbr_unresolved` issues that the
- no-Kit path produces, so PhysX-bearing profiles still report a clean
- verdict for everything that can be checked without Kit.
-
- | Tier | $/hr | Verdict |
- |---|---|---|
- | `cpu-basic` (2 vCPU, 16 GB) | $0.03 | Marginal — small datasets OK, 50+ asset bundles will time out |
- | **`cpu-upgrade`** (8 vCPU, 32 GB) | **$0.05** | **Recommended for the spike.** Comfortable headroom; the validator's parallel worker pool actually scales here |
- | `t4-small` (1 × T4, 4 vCPU, 15 GB) | $0.40 | Only needed once we add Kit; overkill for `--no-use-kit` |
- | `a10g-small` (1 × A10G, 4 vCPU, 15 GB) | $1.05 | Future state: enables Kit-rooted PhysX/MDL rules (currently filtered out by the P2 patch) |
-
- Set the tier in the Space's **Settings → Hardware** page (or in
- `README.md` frontmatter when the Space repo is created — see Deploy).
-
- ---
-
- ## Deploy
-
- This dance is captured as a Claude Code skill at
- [`skills/deploy-hf-space/SKILL.md`](./skills/deploy-hf-space/SKILL.md).
- Future operators can run `/deploy-hf-space [<slug>]` instead of
- following this README by hand. The README below is the human-readable
- mirror.
-
- The Space is currently live at
- [`nvidia/simready-validator`](https://huggingface.co/spaces/nvidia/simready-validator).
- To re-stand it up from scratch:
-
- ### 1. Create the Space `[BROWSER]`
-
- 1. Sign in at https://huggingface.co with an account that has write
- access to wherever the Space will live (an NVIDIA org for the
- internal pilot).
- 2. New → Space.
- 3. Name: `simready-validator` (or any name).
- 4. SDK: **Docker**.
- 5. Hardware: **CPU upgrade** (~$0.05/hr) for the spike.
- 6. Visibility: **Private** while internal-pilot.
-
- ### 2. Set the HF_TOKEN secret `[BROWSER]`
-
- The Space needs a write-scoped token to open the verdict PR on customer
- datasets.
-
- 1. https://huggingface.co/settings/tokens → New token → **Write** scope.
- 2. Space → Settings → Variables and secrets → New secret.
- 3. Name: `HF_TOKEN`. Value: paste the token.
-
- Tokens are not exposed in the build log. The `runner.py` code reads it
- via `os.environ["HF_TOKEN"]` (with `HUGGING_FACE_HUB_TOKEN` as a
- fallback for compatibility with the HF SDK's standard env name).
-
- ### 3. Push the code `[LOCAL]`
-
- The Space is a git repo of its own. From this checkout:
-
- ```bash
- # Replace <space-name> with the Space you created (e.g. nvidia/simready-validator)
- hf auth login # one-time
- git clone https://huggingface.co/spaces/<space-name> /tmp/space
- cp -r tools/hf_space/* /tmp/space/
- # Important: the Dockerfile COPYs tools/validation/ from the repo root,
- # so we have to vendor that subtree into the Space repo too.
- mkdir -p /tmp/space/tools
- cp -r tools/validation /tmp/space/tools/
- cd /tmp/space
- git add .
- git commit -m "Initial Space scaffold from simready-oem-library-pm@main"
- git push
- ```
-
- The Space will start building automatically. First build takes ~5 min
- (usd-core + omniverse-asset-validator wheels + the foundation clone).
- Subsequent builds reuse Docker layer cache and finish in ~1 min if only
- `app.py` or `runner.py` changed.
-
- ### 4. Smoke-test `[BROWSER]`
-
- Open the Space's URL. Enter a known-good dataset (the foundation
- clone's bundled examples work well — point at something small like a
- single-asset dataset first), pick **Robot-Body-Runnable**, leave
- **Open PR** unchecked, click **Validate**. Watch the log stream.
-
- Expected output ends with:
-
- ```
- PASS: 1/1 assets passed
- ```
-
- …and a downloadable `index.html` report.
-
- If the verdict makes sense, re-run with **Open PR** checked against a
- dataset you have write access to. A new PR appears on the dataset under
- `https://huggingface.co/datasets/<dataset>/discussions` with the
- verdict body + the `validation/` subtree.
-
- ---
-
- ## What this spike intentionally does NOT do
-
- To stay scoped to "prove the engine works on HF":
-
- - **No HF Hub webhook.** Triggering is Gradio-only. Phase 3.2 wires
- `https://<space>/api/run` to `dataset.commit.push` events.
- - **No status callback into this repo.** The DGXC dashboard's
- `hf-watch.yml` already polls dataset commits — once the Space lands
- verdicts as `validation/results.json` on the dataset, the existing
- watcher picks them up. No new integration needed.
- - **No Kit re-exec.** Profiles requiring PhysX/MDL rules currently
- report partial verdicts (the P2 patch drops env-blocked rules). A
- future iteration with an `a10g-small` tier + Isaac Sim wheels in the
- Dockerfile unlocks these.
- - **No multi-tenant token isolation.** The Space's own `HF_TOKEN` opens
- every PR. Fine for internal pilot; needs rework before exposing the
- Space outside NVIDIA.
-
- These are tracked in [PRD §7 — roadmap](../../PRD.md).
-
- ---
-
- ## Cutover criteria (when do we retire the DGXC runner?)
-
- Stop using `hf-validate.yml` once **all three** are true:
-
- 1. The Space verdict matches the DGXC verdict on the same dataset for
- the top three onboarded clients (imagineio kitchens, plus two TBD).
- Byte-for-byte equality on `results.json` is the bar.
- 2. The HF Hub webhook handler (phase 3.2) is in place and a customer
- can open a dataset PR and see a verdict comment without anyone at
- NVIDIA pressing a button.
- 3. The PRD §7 roadmap items 3.3 (auto-merge on pass) and 3.4 (block-on-fail
- gate at the GitHub side) are wired through, so the GitHub coordinator
- only deals with state — never validation.
-
- Until then both paths coexist; the DGXC runner is the source of truth.
 
+ ---
+ title: Simready Validator
+ emoji: 🏢
+ colorFrom: indigo
+ colorTo: gray
+ sdk: docker
+ pinned: false
+ ---
+
+ # SimReady Validator
+
+ Validates a HuggingFace dataset against a [SimReady](https://github.com/NVIDIA/simready-foundation) profile.
+
+ Reads the dataset directly from HF storage (no copy onto external
+ infrastructure), runs the bundled
+ [`simready-report`](./tools/validation/plugins/simready-report/) validator
+ with `--no-use-kit`, and when the **Open PR** checkbox is enabled
+ uploads the verdict to the dataset as a `validation/` pull request.
+
+ ## Source
+
+ This Space is generated from the
+ [`NVIDIA-dev/simready-oem-library-pm`](https://github.com/NVIDIA-dev/simready-oem-library-pm)
+ internal repo (see `tools/hf_space/`). Bug reports and feature requests go
+ there; this Space is the deployable artifact, not the source of truth.
+
+ ## Hardware
+
+ `cpu-basic` is enough for a smoke test (no Kit re-exec on this Space).
+ For routine validation of partner datasets at scale, upgrade to
+ `cpu-upgrade` (8 vCPU, 32 GB) — the validator's `ProcessPoolExecutor`
+ actually scales there. Kit-rooted PhysX/MDL rule coverage needs a GPU
+ tier (`a10g-small`) and Isaac Sim baked into the image, neither of
+ which is in the current image.
+
+ ## Secrets
+
+ | Name | Purpose |
+ |---|---|
+ | `HF_TOKEN` | Write-scoped Hugging Face token used to open verdict PRs on customer datasets. Set under Settings → Variables and secrets. |
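
Note on the `HF_TOKEN` secret: the removed README states that `runner.py` reads the token from `HF_TOKEN`, with `HUGGING_FACE_HUB_TOKEN` as a fallback, and that verdicts land in the dataset's `validation/` subtree via a PR. The sketch below illustrates how that could look; it is not the actual `runner.py`. The function names `resolve_hf_token` and `open_verdict_pr`, the error message, and the commit message are hypothetical. The upload is assumed to go through `huggingface_hub.HfApi.upload_folder` with `create_pr=True`, which may differ from what the Space really does.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the write token, preferring HF_TOKEN over HUGGING_FACE_HUB_TOKEN."""
    env = os.environ if env is None else env
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        token = env.get(name, "").strip()
        if token:
            return token
    # Hypothetical message; fail early rather than at PR-creation time.
    raise RuntimeError("No Hugging Face token found; set the HF_TOKEN secret.")


def open_verdict_pr(dataset: str, results_dir: str, token: str):
    """Upload results_dir to the dataset's validation/ folder as a pull request."""
    # Lazy import so resolve_hf_token works without huggingface_hub installed.
    from huggingface_hub import HfApi

    return HfApi(token=token).upload_folder(
        folder_path=results_dir,
        path_in_repo="validation",
        repo_id=dataset,
        repo_type="dataset",
        create_pr=True,  # open a PR instead of committing to the main branch
        commit_message="SimReady validation verdict",  # hypothetical wording
    )
```

Calling `resolve_hf_token()` once at startup would surface a missing secret before any validation work is done, which seems preferable to discovering it only when the **Open PR** path runs.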