Spaces: CreativeEngineer/fusion-design-lab
CreativeEngineer committed on Mar 8

Commit 567ff67

1 Parent(s): 1c1f314

docs: align hybrid validation gates

Files changed (2)

README.md CHANGED

@@ -52,7 +52,7 @@ Implementation status:
 - [x] Label low-fi `run` truth vs high-fi `submit` truth in observations and task docs
 - [x] Separate high-fidelity submit scoring/reporting from low-fidelity rollout score state
 - [x] Add tracked `P1` fixtures under `server/data/p1/`
-- [ ] Run manual playtesting and record the first reward pathology
+- [ ] Run a tiny low-fi PPO smoke run, then record at least one submit-side manual trace and the first real reward pathology
 - [ ] Refresh the heuristic baseline for the real verifier path
 - [ ] Deploy the real environment to HF Space

docs/FUSION_DESIGN_LAB_PLAN_V2.md CHANGED

@@ -151,15 +151,15 @@ Gate 1: measured sweep exists
 - repaired-family ranges, deltas, and reset seeds are justified by recorded evidence
-Gate 2: fixture checks pass
-- good, boundary, and bad references behave as expected
-Gate 3: tiny PPO smoke is sane
+Gate 2: tiny PPO smoke is sane
 - a small low-fidelity policy can improve or at least reveal a concrete failure mode quickly
 - trajectories are readable enough to debug
+Gate 3: fixture checks pass
+- good, boundary, and bad references behave as expected
 Gate 4: manual playtest passes
 - a human can read the observation
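
The reordering in this hunk only matters if the gates are evaluated in sequence, so the sketch below shows one way that could look. It is an illustration under stated assumptions, not code from the repository: `first_failing_gate` and `GATES_AFTER_COMMIT` are hypothetical names, the `lambda: True` checks are placeholders for real checks, and stopping at the first failure is an assumed policy that the plan document does not confirm.

```python
from typing import Callable, List, Optional, Tuple

# A gate is a (name, check) pair; the check returns True when the gate passes.
Gate = Tuple[str, Callable[[], bool]]


def first_failing_gate(gates: List[Gate]) -> Optional[str]:
    """Run gates in order and return the name of the first failure, or None."""
    for name, check in gates:
        if not check():
            # Assumed policy: stop here so later gates are not evaluated.
            return name
    return None


# Gate names and order as they read after this commit.
# The checks are placeholders (hypothetical), not the project's real checks.
GATES_AFTER_COMMIT: List[Gate] = [
    ("Gate 1: measured sweep exists", lambda: True),
    ("Gate 2: tiny PPO smoke is sane", lambda: True),
    ("Gate 3: fixture checks pass", lambda: True),
    ("Gate 4: manual playtest passes", lambda: True),
]
```

With this ordering, a failing PPO smoke run would be reported before the fixture checks are attempted, which is the practical effect of swapping Gates 2 and 3.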