# SENTINEL GPU Cluster Rollout

This is the local build plan for the GPU-cluster version of SENTINEL. The goal
is to evolve the current trust-calibration backend into a richer OpenEnv
environment where multiple agents keep a simulated AI training cluster alive
under resource scarcity, long-horizon drift, reward hacking, and adversarial
pressure.

## Phase 1 - Cluster Foundation

Build independent, well-tested primitives:

- `gpu_pool.py`: 16-GPU state machine, allocation, overcommit, failure,
  recovery, false visible reports.
- `job_queue.py`: job generation, deadlines, progress, hidden priority,
  reported vs actual progress.
- `audit_ledger.py`: action log, reward claims, anomaly scores,
  investigation windows.
- `adversary.py`: scripted self-play attack FSM with five escalating attack
  levels.

This phase does not replace `SentinelEnv`. It creates the substrate that the
next environment phase will use.
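
As a rough illustration of the kind of primitive Phase 1 describes, the sketch below models a 16-GPU pool with allocation, failure, recovery, and a reported state that can differ from the true one. The class names, the three states, and the `reported_state` field are assumptions made for this example, not the contents of `gpu_pool.py`. Overcommit is left out: this sketch simply rejects requests that exceed the idle count.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class GpuState(Enum):
    # Hypothetical state set; the real state machine may have more states.
    IDLE = "idle"
    ALLOCATED = "allocated"
    FAILED = "failed"


@dataclass
class Gpu:
    state: GpuState = GpuState.IDLE
    job_id: Optional[str] = None
    # Hypothetical field: the state shown to agents, which may be false.
    reported_state: Optional[GpuState] = None

    def visible_state(self) -> GpuState:
        """Return the reported state if one is set, else the true state."""
        return self.state if self.reported_state is None else self.reported_state


class GpuPool:
    def __init__(self, size: int = 16) -> None:
        self.gpus: List[Gpu] = [Gpu() for _ in range(size)]

    def count(self, state: GpuState) -> int:
        """Count GPUs whose true state matches `state`."""
        return sum(1 for gpu in self.gpus if gpu.state is state)

    def allocate(self, job_id: str, n: int) -> List[int]:
        """Assign the first `n` idle GPUs to `job_id` and return their indices."""
        idle = [i for i, gpu in enumerate(self.gpus) if gpu.state is GpuState.IDLE]
        if n > len(idle):
            raise ValueError(f"requested {n} GPUs, only {len(idle)} idle")
        for i in idle[:n]:
            self.gpus[i].state = GpuState.ALLOCATED
            self.gpus[i].job_id = job_id
        return idle[:n]

    def fail(self, index: int) -> None:
        """Mark a GPU as failed and drop whatever job it was running."""
        gpu = self.gpus[index]
        gpu.state = GpuState.FAILED
        gpu.job_id = None

    def recover(self, index: int) -> None:
        """Return a failed GPU to the idle set."""
        gpu = self.gpus[index]
        if gpu.state is not GpuState.FAILED:
            raise ValueError("only failed GPUs can be recovered")
        gpu.state = GpuState.IDLE
```

Keeping the true state and the reported state as separate fields makes it straightforward to unit-test honest and dishonest reporting independently of allocation logic.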

## Phase 2 - Environment Integration

Add a cluster episode mode behind the existing OpenEnv API:

- `reset(task_type)` creates the GPU pool, job queue, audit ledger, and
  adversary.
- `step(action)` advances allocations, jobs, attacks, audit events, and
  cluster health.
- Observations become role-specific while the API remains OpenEnv-compatible.
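
The `reset`/`step` flow with role-specific observations could look roughly like the toy loop below. It does not use the OpenEnv package or `SentinelEnv`. The `ClusterEpisode` class, the action dictionary keys, and the observation fields are all hypothetical, chosen only to show how one shared state can be projected into a different view per role.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

Observation = Dict[str, Dict[str, Any]]


@dataclass
class ClusterState:
    # Hypothetical, heavily simplified stand-in for pool, queue, and ledger.
    task_type: str
    step_count: int = 0
    free_gpus: int = 16
    audit_log: List[Dict[str, Any]] = field(default_factory=list)


class ClusterEpisode:
    def __init__(self) -> None:
        self.state: Optional[ClusterState] = None

    def reset(self, task_type: str) -> Observation:
        """Create fresh episode state and return the initial observations."""
        self.state = ClusterState(task_type=task_type)
        return self._observe()

    def step(self, action: Dict[str, Any]) -> Observation:
        """Apply one action, record it in the audit log, return new observations."""
        if self.state is None:
            raise RuntimeError("call reset() before step()")
        s = self.state
        s.step_count += 1
        requested = int(action.get("allocate", 0))
        granted = min(requested, s.free_gpus)
        s.free_gpus -= granted
        s.audit_log.append(
            {"step": s.step_count, "role": action.get("role"),
             "requested": requested, "granted": granted}
        )
        return self._observe()

    def _observe(self) -> Observation:
        """Project the shared state into one view per role."""
        s = self.state
        shared = {"task_type": s.task_type, "step": s.step_count}
        return {
            "orchestrator": {**shared, "free_gpus": s.free_gpus},
            "resource_manager": {**shared, "free_gpus": s.free_gpus},
            "auditor": {**shared, "audit_log": list(s.audit_log)},
            "worker": dict(shared),
        }
```

In this sketch only the auditor sees the audit log and the worker sees no capacity information, which is one possible way to make observations role-specific without changing the `reset`/`step` call shape.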

## Phase 3 - Reward Engine V3

Move from specialist-only reward signals to cluster rewards:

- Orchestrator: goal completion, plan coherence, recovery speed.
- Resource manager: utilization, deadline hit rate, waste penalty.
- Auditor: reward-hack detection, false positives, calibration.
- Worker: completion accuracy, report honesty.
- Global: per-agent weighted score multiplied by cluster health.
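
One possible reading of the global term is a weighted average of per-agent scores, scaled by cluster health. The sketch below assumes that scores and health lie in [0, 1] and that weights are normalized over the roles present. Neither assumption is stated in the plan, and `global_reward` is a hypothetical name.

```python
from typing import Mapping


def global_reward(
    agent_scores: Mapping[str, float],
    weights: Mapping[str, float],
    cluster_health: float,
) -> float:
    """Weighted mean of per-agent scores, multiplied by cluster health.

    Assumes scores and cluster_health are in [0, 1] (an assumption of this
    sketch, not a documented contract).
    """
    if not 0.0 <= cluster_health <= 1.0:
        raise ValueError("cluster_health must be in [0, 1]")
    total_weight = sum(weights[role] for role in agent_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[role] * score for role, score in agent_scores.items())
    return (weighted / total_weight) * cluster_health
```

Because health multiplies the whole sum, a collapsed cluster drives the global reward toward zero no matter how well individual agents score, which discourages locally optimal but globally harmful behavior.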

## Phase 4 - Evidence Pack

Update evaluation to produce judge-facing proof:

- Random vs heuristic vs oracle-lite cluster health curves.
- Reward-hack detection rate.
- Cascade-failure survival rate.
- Profile-shuffle generalization.
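
Two of these metrics can be computed from per-episode records, as sketched below. The `EpisodeRecord` fields are hypothetical, and the definitions used here (flagged hacks over injected hacks, survived cascades over triggered cascades) are assumptions about how the rates would be measured.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class EpisodeRecord:
    # Hypothetical per-episode log entry.
    hacks_injected: int      # reward-hack attempts present in the episode
    hacks_flagged: int       # attempts the auditor correctly flagged
    cascade_triggered: bool  # whether a cascade failure was triggered
    survived: bool           # whether the cluster was alive at episode end


def summarize(records: Iterable[EpisodeRecord]) -> Dict[str, Optional[float]]:
    """Aggregate detection and survival rates; None when undefined."""
    records = list(records)
    injected = sum(r.hacks_injected for r in records)
    flagged = sum(r.hacks_flagged for r in records)
    cascades = [r for r in records if r.cascade_triggered]
    survived = sum(1 for r in cascades if r.survived)
    return {
        "reward_hack_detection_rate": flagged / injected if injected else None,
        "cascade_survival_rate": survived / len(cascades) if cascades else None,
    }
```

Returning `None` when there are no injected hacks or no cascades keeps an empty evaluation run from being reported as a perfect or zero score.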

## Phase 5 - Visual System Pack

Build MiroFish-style assets:

- Architecture diagram.
- GPU state-machine diagram.
- Before/after cascade failure diagram.
- Reward engine diagram.
- Live trust/cluster-health dashboard screenshots.