Spaces: Aldrimore/RLScheduling
Aldrimore committed on Apr 10

Commit f302e97
1 parent: 7f61e7c

fix: add graders to openenv.yaml and clamp scores to (0, 1) exclusive

Phase 2 validator requires each task to declare a grader in openenv.yaml
and each grader to produce scores strictly between 0.0 and 1.0.

- openenv.yaml: add grader: factory_env.grader:score_episode to all 3 tasks
- grader.py: clamp compute_score output to [0.001, 0.999]; the easy task was
  returning exactly 1.0 (perfect baseline), which the validator rejected
- README: update easy baseline score from 1.000 to ~0.999
- Remove stale root server.py (entry point is server/app.py per pyproject.toml)
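
The range rule described in this message can be sketched as a strict inequality check. `is_valid_score` is a hypothetical helper written for illustration; it is not part of OpenEnv, and the actual Phase 2 validator may apply the rule differently.

```python
def is_valid_score(score: float) -> bool:
    """Hypothetical check: a score must lie strictly inside (0.0, 1.0)."""
    return 0.0 < score < 1.0


# The old extremes (0.0 and 1.0) fail the strict check;
# the new clamp bounds (0.001 and 0.999) pass it.
results = {s: is_valid_score(s) for s in (0.0, 0.001, 0.5, 0.999, 1.0)}
print(results)
```

Under this reading, a perfect baseline that returns exactly 1.0 is rejected, which matches the failure the commit message reports for the easy task.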

Files changed (4)

README.md +2 -2
factory_env/grader.py +6 -2
openenv.yaml +3 -0
server.py +0 -31

README.md CHANGED

@@ -51,7 +51,7 @@ An [OpenEnv](https://github.com/openenv/openenv)-compliant RL environment simula
 | Task | Machines | Jobs | Failure Rate | Max Steps | Baseline Score |
 |------|----------|------|-------------|-----------|----------------|
-| easy | 2 | 3 | 0% | 20 | 1.000 |
+| easy | 2 | 3 | 0% | 20 | ~0.999 |
 | medium | 4 | 7 | 8% | 30 | ~0.557 |
 | hard | 6 | 12 | 15% | 40 | ~0.457 |
@@ -97,7 +97,7 @@ docker run -e OPENAI_API_KEY=<key> -e FACTORY_TASK=easy -p 7860:7860 factory-env
 | Task | Score | Steps |
 |------|-------|-------|
-| easy | 1.000 | 4 |
+| easy | ~0.999 | 4 |
 | medium | ~0.529 | 12 |
 | hard | ~0.533 | 34 |

factory_env/grader.py CHANGED

@@ -1,11 +1,15 @@
+_SCORE_MIN = 0.001
+_SCORE_MAX = 0.999
 def compute_score(completed, on_time, total_jobs, late_jobs, task="easy"):
     if total_jobs == 0:
-        return 0.0
+        return _SCORE_MIN
     completion_rate = completed / total_jobs
     on_time_rate = on_time / max(completed, 1)
     utilization_bonus = max(0.0, 1.0 - late_jobs / max(completed, 1))
     score = 0.5 * completion_rate + 0.3 * on_time_rate + 0.2 * utilization_bonus
-    return round(max(0.0, min(1.0, score)), 4)
+    return round(max(_SCORE_MIN, min(_SCORE_MAX, score)), 4)
 def score_episode(env) -> float:
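
As a worked check of the post-commit formula, the sketch below copies `compute_score` as it appears on the new side of this diff and evaluates it for a few inputs. The input values are illustrative only and are not taken from the environment's tasks.

```python
_SCORE_MIN = 0.001
_SCORE_MAX = 0.999


def compute_score(completed, on_time, total_jobs, late_jobs, task="easy"):
    # Copied from the new side of the diff; `task` is unused in the visible lines.
    if total_jobs == 0:
        return _SCORE_MIN
    completion_rate = completed / total_jobs
    on_time_rate = on_time / max(completed, 1)
    utilization_bonus = max(0.0, 1.0 - late_jobs / max(completed, 1))
    score = 0.5 * completion_rate + 0.3 * on_time_rate + 0.2 * utilization_bonus
    return round(max(_SCORE_MIN, min(_SCORE_MAX, score)), 4)


# Perfect episode: raw score 1.0 is clamped down to the upper bound.
perfect = compute_score(completed=3, on_time=3, total_jobs=3, late_jobs=0)
# No jobs at all: the early return yields the lower bound instead of 0.0.
empty = compute_score(completed=0, on_time=0, total_jobs=0, late_jobs=0)
# Half the jobs done, half of those on time, one late: 0.25 + 0.15 + 0.10.
partial = compute_score(completed=2, on_time=1, total_jobs=4, late_jobs=1)

print(perfect, empty, partial)  # 0.999 0.001 0.5
```

With these bounds, rounding to four decimals cannot push a result back to 0.0 or 1.0, so every returned value stays strictly inside (0, 1).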

openenv.yaml CHANGED

@@ -7,10 +7,13 @@ entry_point: factory_env.env:FactoryEnv
 tasks:
   - name: easy
     description: 2 machines, 3 jobs, no failures, 20 steps
+    grader: factory_env.grader:score_episode
   - name: medium
     description: 4 machines, 7 jobs, 8% failure rate, 30 steps
+    grader: factory_env.grader:score_episode
   - name: hard
     description: 6 machines, 12 jobs, 15% failure rate, 40 steps
+    grader: factory_env.grader:score_episode
 action_space:
   type: text
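
The `grader:` values use a `module.path:attribute` string, the same shape as `entry_point`. How OpenEnv itself loads these strings is not shown in this commit; the sketch below is one common way to resolve such a reference with the standard library. `resolve_entry` is a hypothetical name, and `math:sqrt` stands in for `factory_env.grader:score_episode` so the example runs without the package installed.

```python
import importlib


def resolve_entry(spec: str):
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attr = spec.partition(":")
    if not module_name or not attr:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Stand-in target from the standard library; with the package installed,
# the same call would take "factory_env.grader:score_episode".
sqrt = resolve_entry("math:sqrt")
print(sqrt(9.0))  # 3.0
```

A misspelled module or attribute surfaces as `ModuleNotFoundError` or `AttributeError` at resolution time, which is a cheap way to catch a wrong `grader:` path before submitting to a validator.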

server.py DELETED

@@ -1,31 +0,0 @@
-"""
-Smart Factory Scheduling — OpenEnv Server
-==========================================
-Routes (HTTP + WebSocket):
-  GET  /          →  Gradio UI (set ENABLE_WEB_INTERFACE=1) or redirect to /web
-  GET  /health    →  {"status": "healthy"}
-  POST /reset     →  reset episode, returns observation
-  POST /step      →  execute action, returns observation + reward + done
-  GET  /state     →  current environment state
-  GET  /schema    →  action / observation JSON schemas
-  WS   /ws        →  persistent WebSocket session (used by FactoryEnvClient)
-Set ENABLE_WEB_INTERFACE=1 to enable the built-in Gradio UI at /web.
-"""
-import os
-from openenv.core import create_app
-from factory_env.env import FactoryEnv
-from factory_env.models import FactoryAction, FactoryObservation
-TASK = os.getenv("FACTORY_TASK", "easy")
-app = create_app(
-    env=lambda: FactoryEnv(task=TASK, seed=42),
-    action_cls=FactoryAction,
-    observation_cls=FactoryObservation,
-    env_name="factory_env",
-)
-if __name__ == "__main__":
-    import uvicorn
-    uvicorn.run(app, host="0.0.0.0", port=int(os.getenv("PORT", 7860)))