Vittal-M committed on
Commit e3d838a · verified · 1 Parent(s): 730b455

Upload README.md with huggingface_hub

  1. README.md +354 -10
README.md CHANGED
@@ -1,10 +1,354 @@
1
- ---
2
- title: Openenv Hackathon
3
- emoji: 📈
4
- colorFrom: green
5
- colorTo: green
6
- sdk: docker
7
- pinned: false
8
- ---
9
-
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
+ ---
2
+ title: SchedulingOptEnv
3
+ emoji: 🗓️
4
+ colorFrom: blue
5
+ colorTo: green
6
+ sdk: docker
7
+ app_port: 7860
8
+ tags:
9
+ - openenv
10
+ - reinforcement-learning
11
+ - scheduling
12
+ - agent
13
+ license: mit
14
+ ---
15
+
16
+ <h1 align="center">SchedulingOptEnv</h1>
17
+ <h3 align="center">A Markov Decision Environment for Training Autonomous<br>Scheduling Optimisation Agents</h3>
18
+
19
+ <p align="center"><em>Meta × Scaler - OpenEnv Hackathon Submission</em></p>
20
+
21
+ <p align="center">
22
+ <img src="https://img.shields.io/badge/python-3.11+-blue" alt="Python 3.11+">
23
+ <img src="https://img.shields.io/badge/framework-FastAPI-009688" alt="FastAPI">
24
+ <img src="https://img.shields.io/badge/models-Pydantic%20v2-e92063" alt="Pydantic v2">
25
+ <img src="https://img.shields.io/badge/deploy-Docker%20%7C%20HF%20Spaces-yellow" alt="Docker | HF Spaces">
26
+ <img src="https://img.shields.io/badge/license-MIT-green" alt="MIT License">
27
+ </p>
28
+
29
+ ---
30
+
31
+ ## Abstract
32
+
33
+ We present **SchedulingOptEnv**, a real-world training environment for autonomous AI agents built on the OpenEnv framework. The environment formalises combinatorial scheduling optimisation as a sequential decision problem, exposing agents to three progressively harder sub-tasks: binary feasibility determination, multi-class constraint-violation classification, and full schedule repair. Each task is paired with a structured, graded reward function that provides partial-credit signals rather than sparse binary outcomes. A 12-instance scheduling corpus covering five distinct constraint-violation classes, a FastAPI inference server, and a GPT-4o-mini baseline are included. The environment is deployable as a Docker container on Hugging Face Spaces with a single command.
34
+
35
+ ---
36
+
37
+ ## 1. Introduction
38
+
39
+ Combinatorial scheduling (the assignment of jobs to machines subject to resource, temporal, and precedence constraints) is a foundational problem in operations research, manufacturing, cloud computing, and logistics. Despite its industrial importance, existing benchmarks for evaluating AI agents on scheduling tasks are either purely offline (single-pass solution quality) or narrowly scoped to continuous optimisation rather than the constraint-satisfaction and repair workflow practised by human planners.
40
+
41
+ OpenEnv [1] provides an abstraction layer for building *interactive* environments where agents act, receive graded feedback, and improve across episodes. SchedulingOptEnv fills a gap by framing schedule analysis and repair as a Markov Decision Process (MDP) with:
42
+
43
+ - A well-defined **observation space** (JSON-encoded scheduling instance, task context, step counter)
44
+ - A structured **action space** (categorical labels or JSON repair schedules)
45
+ - A **multi-component reward function** that awards partial credit for structurally valid but suboptimal repairs
46
+ - Three **difficulty tiers** mirroring the cognitive complexity gradient faced by human schedulers
47
+
48
+ ---
49
+
50
+ ## 2. Environment Design
51
+
52
+ ### 2.1 MDP Formulation
53
+
54
+ | Component | Definition |
55
+ |-----------|-----------|
56
+ | State *S* | Current scheduling instance, task type, step count, episode history |
57
+ | Observation *O* | `{schedule_instance: str (JSON), task_id, context, step_number}` |
58
+ | Action *A* | `{response: str, task_id: str}` |
59
+ | Reward *R* | Float ∈ [0.0, 1.0] from task-specific grader |
60
+ | Horizon *T* | Task-dependent: 3 / 5 / 8 steps |
61
+ | Terminal | *done* = True when *T* reached or *R* ≥ 0.95 |
62
+
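The table above can be mirrored in a few lines of Python. This is a minimal sketch, not the repository's code: the README says the project uses Pydantic v2 models, while this version uses standard-library dataclasses to stay dependency-free. The names `HORIZONS`, `SUCCESS_THRESHOLD`, and `is_done` are hypothetical, and counting steps from 1 after each action is an assumption.

```python
from dataclasses import dataclass

# Horizons per task, taken from the task descriptions: 3 / 5 / 8 steps.
HORIZONS = {
    "feasibility_check": 3,
    "conflict_classification": 5,
    "schedule_repair": 8,
}
SUCCESS_THRESHOLD = 0.95  # the episode ends early once R >= 0.95


@dataclass(frozen=True)
class Observation:
    schedule_instance: str  # JSON-encoded scheduling instance
    task_id: str
    context: str
    step_number: int


@dataclass(frozen=True)
class Action:
    response: str
    task_id: str


def is_done(task_id: str, step_number: int, reward: float) -> bool:
    """Return True when the horizon T is reached or the reward clears the threshold.

    Assumes step_number counts completed steps, starting at 1.
    """
    return step_number >= HORIZONS[task_id] or reward >= SUCCESS_THRESHOLD
```

Under these assumptions, a correct answer on the first step ends the episode immediately, while a run of low-scoring answers ends only when the horizon is exhausted.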
63
+ ### 2.2 Scheduling Instance Corpus
64
+
65
+ The environment ships with **12 curated scheduling instances** spanning five constraint-violation classes plus two fully feasible baselines. Instances are drawn from a task-aware pool: feasibility-check episodes see all 12, while classification and repair episodes see only the 10 infeasible instances.
66
+
67
+ | # | Feasible | Violation Class | Description |
68
+ |---|----------|----------------|-------------|
69
+ | 0 | No | `resource_overload` | J1 and J2 overlap on single-capacity machine M1 |
70
+ | 1 | No | `deadline_violation` | J1 starts late and finishes after hard deadline |
71
+ | 2 | No | `precedence_violation` | J2 starts before its predecessor J1 finishes |
72
+ | 3 | No | `availability_conflict` | J1 scheduled outside machine operating hours |
73
+ | 4 | No | `capacity_exceeded` | 3 concurrent jobs on capacity-2 machine |
74
+ | 5 | No | `resource_overload` | Pairwise overlap of J1 and J2 on capacity-1 machine |
75
+ | 6 | No | `deadline_violation` | Precedence chain forces J3 past hard deadline |
76
+ | 7 | No | `precedence_violation` | J3 starts before both predecessors complete |
77
+ | 8 | No | `availability_conflict` | J1 extends into machine maintenance window |
78
+ | 9 | No | `capacity_exceeded` | 4 concurrent jobs on capacity-3 machine |
79
+ | 10 | Yes | n/a | Fully feasible 3-job, 2-machine schedule |
80
+ | 11 | Yes | n/a | Fully feasible 5-job, 3-machine schedule with precedence |
81
+
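The task-aware pool described above can be sketched as follows. The feasibility flags and violation classes are copied from the table; the function names and the uniform sampling scheme are assumptions, since the README does not say how an instance is drawn from its pool.

```python
import random

# (feasible, violation_class) per instance index, copied from the corpus table.
CORPUS = [
    (False, "resource_overload"),
    (False, "deadline_violation"),
    (False, "precedence_violation"),
    (False, "availability_conflict"),
    (False, "capacity_exceeded"),
    (False, "resource_overload"),
    (False, "deadline_violation"),
    (False, "precedence_violation"),
    (False, "availability_conflict"),
    (False, "capacity_exceeded"),
    (True, None),
    (True, None),
]


def instance_pool(task_id: str) -> list[int]:
    """Indices eligible for a task: all 12 for feasibility, infeasible only otherwise."""
    if task_id == "feasibility_check":
        return list(range(len(CORPUS)))
    return [i for i, (feasible, _) in enumerate(CORPUS) if not feasible]


def sample_instance(task_id: str, rng: random.Random) -> int:
    """Draw one eligible index uniformly at random (sampling scheme is assumed)."""
    return rng.choice(instance_pool(task_id))
```

Passing an explicit `random.Random(seed)` keeps episode selection reproducible across runs.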
82
+ ---
83
+
84
+ ## 3. Tasks
85
+
86
+ ### Task 1: Feasibility Check *(Easy)*
87
+
88
+ **Objective:** Given a JSON-encoded scheduling instance (jobs, machines, proposed assignments), determine whether the schedule satisfies all constraints.
89
+
90
+ **Action space:** `{"feasible", "infeasible"}`
91
+
92
+ **Grading function:**
93
+
94
+ ```
95
+ R(a, g) = 1.0   if normalise(a) == g
96
+           0.1   if a is non-empty but incorrect
97
+           0.0   if a is empty
98
+ ```
99
+
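The grading rule above translates directly into Python. This is a sketch under stated assumptions: `normalise` here only trims whitespace and lowercases, whereas the project structure describes the real grader as synonym-aware, and treating a whitespace-only answer as empty is also an assumption.

```python
def normalise(answer: str) -> str:
    """Lowercase and trim; an assumed stand-in for the repository's normaliser."""
    return answer.strip().lower()


def grade_feasibility(answer: str, ground_truth: str) -> float:
    """Score an answer against a ground truth of 'feasible' or 'infeasible'."""
    cleaned = normalise(answer)
    if not cleaned:
        return 0.0  # empty response
    if cleaned == ground_truth:
        return 1.0  # correct label
    return 0.1  # non-empty but incorrect
```

The 0.1 floor for a wrong but non-empty answer separates an agent that responds in the expected form from one that returns nothing.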
100
+ **Episode horizon:** 3 steps. **Target agent accuracy:** ~90%.
101
+
102
+ ---
103
+
104
+ ### Task 2: Conflict Classification *(Medium)*
105
+
106
+ **Objective:** Identify the constraint violation present in an infeasible schedule from the closed vocabulary:
107
+ `{resource_overload, deadline_violation, precedence_violation, availability_conflict, capacity_exceeded}`
108
+
109
+ **Grading function:**
110
+
111
+ ```
112
+ R(a, g) = 1.0   if a == g                                   (exact)
113
+           0.5   if a ∈ related_group(g)                     (partial)
114
+           0.1   if a ∈ valid_categories \ related_group(g)  (wrong family)
115
+           0.0   if a ∉ valid_categories                     (unparseable)
116
+ ```
117
+
118
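+ where `related_groups = [{resource_overload, capacity_exceeded}, {deadline_violation, precedence_violation}]`. `availability_conflict` belongs to neither group, so an incorrect answer involving it scores at most 0.1.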
119
+
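A minimal Python sketch of this family-aware rule follows. The category names and groups are taken from the text above; the function names and the trim-and-lowercase normalisation are assumptions. A label outside both groups is treated as its own singleton family.

```python
VALID_CATEGORIES = frozenset({
    "resource_overload",
    "deadline_violation",
    "precedence_violation",
    "availability_conflict",
    "capacity_exceeded",
})
RELATED_GROUPS = [
    frozenset({"resource_overload", "capacity_exceeded"}),
    frozenset({"deadline_violation", "precedence_violation"}),
]


def related_group(label: str) -> frozenset:
    """Return the family containing label, or a singleton if it has none."""
    for group in RELATED_GROUPS:
        if label in group:
            return group
    return frozenset({label})


def grade_classification(answer: str, ground_truth: str) -> float:
    """Score a predicted violation class with partial credit for the same family."""
    label = answer.strip().lower()  # normalisation is an assumption
    if label not in VALID_CATEGORIES:
        return 0.0  # unparseable
    if label == ground_truth:
        return 1.0  # exact match
    if label in related_group(ground_truth):
        return 0.5  # same family
    return 0.1  # valid category, wrong family
```

For example, answering `capacity_exceeded` when the truth is `resource_overload` earns 0.5, because both concern too much concurrent load on one machine.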
120
+ **Episode horizon:** 5 steps. **Target agent accuracy:** ~60%.
121
+
122
+ ---
123
+
124
+ ### Task 3: Schedule Repair *(Hard)*
125
+
126
+ **Objective:** Return a corrected schedule as a JSON object that resolves all constraint violations and minimises total makespan.
127
+
128
+ **Required JSON format:**
129
+ ```json
130
+ {
131
+ "assignments": [
132
+ {"job_id": "J1", "machine_id": "M1", "start_time": 0},
133
+ {"job_id": "J2", "machine_id": "M1", "start_time": 4}
134
+ ]
135
+ }
136
+ ```
137
+
138
+ **Grading function (additive, max 1.0):**
139
+
140
+ ```
141
+ R(a, g) = 0.2 × parseable_json(a)
142
+         + 0.2 × valid_schema(a, g)
143
+         + 0.4 × constraint_satisfaction_ratio(a, g)
144
+         + 0.2 × optimality_score(makespan(a), makespan*(g))
145
+ ```
146
+
147
+ where:
148
+ - `parseable_json(a)`: 1 if the response parses as valid JSON, else 0
149
+ - `valid_schema(a, g)`: 1 if all required fields are present and all jobs are assigned, else 0
150
+ - `constraint_satisfaction_ratio(a, g)`: fraction of four constraint categories satisfied:
151
+ capacity, deadlines, precedence, availability (each contributes 0.25 to the ratio)
152
+ - `optimality_score(m, m*)`: 1.0 if `m ≤ 1.30·m*`; 0.5 if `m ≤ 1.60·m*`; 0 otherwise
153
+
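To make the arithmetic concrete, here is a hedged sketch of how the four components could be combined. It takes the component results as inputs rather than computing them from a schedule, because the instance schema is not shown here. The function names are hypothetical, and returning early when parsing or schema validation fails is an assumption: the formula is purely additive, but the later terms cannot be evaluated without a parsed, well-formed schedule.

```python
CONSTRAINT_CATEGORIES = ("capacity", "deadlines", "precedence", "availability")


def optimality_score(makespan: float, best_makespan: float) -> float:
    """Tiered score: within 30% of the best known makespan, within 60%, or worse."""
    if makespan <= 1.30 * best_makespan:
        return 1.0
    if makespan <= 1.60 * best_makespan:
        return 0.5
    return 0.0


def repair_reward(
    parseable: bool,
    schema_ok: bool,
    satisfied: set[str],
    makespan: float,
    best_makespan: float,
) -> float:
    """Combine the four components with weights 0.2 / 0.2 / 0.4 / 0.2."""
    if not parseable:
        return 0.0  # assumed gating: nothing else can be checked
    if not schema_ok:
        return 0.2  # assumed gating: credit for parseable JSON only
    ratio = sum(c in satisfied for c in CONSTRAINT_CATEGORIES) / len(CONSTRAINT_CATEGORIES)
    total = 0.2 + 0.2 + 0.4 * ratio + 0.2 * optimality_score(makespan, best_makespan)
    return round(total, 6)
```

As a worked case, a well-formed repair that satisfies three of the four constraint categories with a makespan of 14 against a best of 10 scores 0.2 + 0.2 + 0.4 × 0.75 + 0.2 × 0.5 = 0.8, which is below the 0.95 early-termination threshold.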
154
+ **Episode horizon:** 8 steps. **Target agent accuracy:** ~30%.
155
+
156
+ ---
157
+
158
+ ## 4. Server API
159
+
160
+ The environment is exposed over HTTP via a FastAPI server on port **7860** (Hugging Face Spaces default).
161
+
162
+ | Method | Endpoint | Description |
163
+ |--------|----------|-------------|
164
+ | `GET` | `/health` | Liveness probe; returns `{"status": "ok"}` |
165
+ | `POST` | `/reset` | Begin new episode: `{"task_id": "feasibility_check"}` |
166
+ | `POST` | `/step` | Submit action: `{"response": "infeasible", "task_id": "feasibility_check"}` |
167
+ | `GET` | `/state` | Full internal state snapshot |
168
+ | `GET` | `/tasks` | Task catalogue with action schemas |
169
+ | `POST` | `/grader` | Direct grader invocation for offline evaluation |
170
+ | `GET` | `/baseline` | Trigger baseline inference; returns per-task scores |
171
+
172
+ ---
173
+
174
+ ## 5. Baseline
175
+
176
+ A standalone inference script (`baseline.py`) evaluates GPT-4o-mini on all three tasks. When `OPENAI_API_KEY` is not set, the script falls back to oracle mock responses, enabling offline verification of the grading pipeline without API access.
177
+
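The fallback behaviour can be illustrated with a small sketch. The function name `select_backend` and the labels `"openai"` and `"mock"` are hypothetical, and treating an empty `OPENAI_API_KEY` as unset is an assumption; `baseline.py` may organise this differently.

```python
import os
from typing import Mapping, Optional


def select_backend(environ: Optional[Mapping[str, str]] = None) -> str:
    """Pick the inference backend based on whether an API key is available."""
    env = os.environ if environ is None else environ
    # An empty string counts as "not set" here (assumption).
    return "openai" if env.get("OPENAI_API_KEY") else "mock"
```

Accepting the environment as a parameter makes the choice easy to exercise in tests without touching the real process environment.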
178
+ ### 5.1 Baseline Scores (Mock / Oracle)
179
+
180
+ | Task | Instances | Average Score |
181
+ |------|-----------|--------------|
182
+ | Feasibility Check | 12 | 1.000 |
183
+ | Conflict Classification | 10 | 1.000 |
184
+ | Schedule Repair | 10 | 1.000 |
185
+ | **Overall** | | **1.000** |
186
+
187
+ ---
188
+
189
+ ## 6. Setup and Deployment
190
+
191
+ ### 6.1 Prerequisites
192
+
193
+ | Requirement | Version |
194
+ |-------------|---------|
195
+ | Python | ≥ 3.11 |
196
+ | pip | ≥ 22.0 |
197
+ | Docker *(optional)* | ≥ 20.10 |
198
+ | Git | ≥ 2.30 |
199
+
200
+ ### 6.2 Local Installation
201
+
202
+ ```bash
203
+ # 1. Clone the repository
204
+ git clone https://github.com/Vittal-Mukunda/OpenEnv-Hackathon-Meta-x-Scaler.git
205
+ cd OpenEnv-Hackathon-Meta-x-Scaler
206
+
207
+ # 2. Create and activate a virtual environment (recommended)
208
+ python -m venv .venv
209
+ source .venv/bin/activate # Linux / macOS
210
+ # .venv\Scripts\activate # Windows
211
+
212
+ # 3. Install dependencies
213
+ pip install -r requirements.txt
214
+
215
+ # 4. Launch the server
216
+ uvicorn server:app --host 0.0.0.0 --port 7860
217
+
218
+ # 5. Verify the server is running
219
+ curl http://localhost:7860/health
220
+ # Expected: {"status":"ok"}
221
+ ```
222
+
223
+ ### 6.3 Docker Deployment
224
+
225
+ ```bash
226
+ # Build the image
227
+ docker build -t scheduling-opt-env .
228
+
229
+ # Run the container
230
+ docker run -p 7860:7860 scheduling-opt-env
231
+
232
+ # Verify
233
+ curl http://localhost:7860/health
234
+ ```
235
+
236
+ ### 6.4 Hugging Face Spaces
237
+
238
+ Push this repository to a Hugging Face Space configured with the **Docker** SDK. The server listens on port 7860, which Spaces exposes automatically. No additional configuration is required.
239
+
240
+ ### 6.5 Running the Baseline
241
+
242
+ ```bash
243
+ # Without API key (uses oracle mock responses; scores 1.0 on all tasks)
244
+ python baseline.py
245
+
246
+ # With OpenAI API key (evaluates GPT-4o-mini)
247
+ export OPENAI_API_KEY=sk-...
248
+ python baseline.py
249
+ ```
250
+
251
+ ---
252
+
253
+ ## 7. Example Interaction
254
+
255
+ ```bash
256
+ # 1. Health check
257
+ curl http://localhost:7860/health
258
+
259
+ # 2. Start a feasibility-check episode
260
+ curl -X POST http://localhost:7860/reset \
261
+ -H "Content-Type: application/json" \
262
+ -d '{"task_id": "feasibility_check"}'
263
+
264
+ # 3. Submit a feasibility answer
265
+ curl -X POST http://localhost:7860/step \
266
+ -H "Content-Type: application/json" \
267
+ -d '{"response": "infeasible", "task_id": "feasibility_check"}'
268
+
269
+ # 4. Start a conflict-classification episode
270
+ curl -X POST http://localhost:7860/reset \
271
+ -H "Content-Type: application/json" \
272
+ -d '{"task_id": "conflict_classification"}'
273
+
274
+ # 5. Classify the violation
275
+ curl -X POST http://localhost:7860/step \
276
+ -H "Content-Type: application/json" \
277
+ -d '{"response": "resource_overload", "task_id": "conflict_classification"}'
278
+
279
+ # 6. Start a schedule-repair episode
280
+ curl -X POST http://localhost:7860/reset \
281
+ -H "Content-Type: application/json" \
282
+ -d '{"task_id": "schedule_repair"}'
283
+
284
+ # 7. Submit a repaired schedule
285
+ curl -X POST http://localhost:7860/step \
286
+ -H "Content-Type: application/json" \
287
+ -d '{
288
+ "response": "{\"assignments\": [{\"job_id\": \"J1\", \"machine_id\": \"M1\", \"start_time\": 0}]}",
289
+ "task_id": "schedule_repair"
290
+ }'
291
+
292
+ # 8. Inspect environment state
293
+ curl http://localhost:7860/state
294
+
295
+ # 9. Invoke a grader directly
296
+ curl -X POST http://localhost:7860/grader \
297
+ -H "Content-Type: application/json" \
298
+ -d '{
299
+ "action": {"response": "deadline_violation", "task_id": "conflict_classification"},
300
+ "ground_truth": {"violation_type": "deadline_violation"}
301
+ }'
302
+ ```
303
+
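The same reset-then-step flow can be driven from Python using only the standard library. This is a sketch under assumptions: the request bodies match the curl calls above, but the response field names `reward`, `done`, and `observation` are guesses at the server's reply shape, with `reward` assumed to be a plain number, and should be checked against the actual `/step` output. `run_episode` and `post_json` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:7860"


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON payload to the server and decode the JSON reply."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_episode(
    task_id: str,
    policy: Callable[[dict], str],
    max_steps: int,
    post: Callable[[str, dict], dict] = post_json,
) -> list[float]:
    """Reset the environment, then step until done or max_steps is reached."""
    observation = post("/reset", {"task_id": task_id})
    rewards = []
    for _ in range(max_steps):
        action = {"response": policy(observation), "task_id": task_id}
        result = post("/step", action)
        rewards.append(float(result.get("reward", 0.0)))  # assumed field name
        if result.get("done", False):  # assumed field name
            break
        observation = result.get("observation", observation)  # assumed field name
    return rewards
```

With the server running locally, `run_episode("feasibility_check", lambda obs: "infeasible", max_steps=3)` would play one episode with a constant policy. The `post` parameter lets a stub stand in for the network in tests.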
304
+ ---
305
+
306
+ ## 8. Project Structure
307
+
308
+ ```
309
+ .
310
+ ├── openenv.yaml # OpenEnv metadata manifest
311
+ ├── models.py # Pydantic v2 data models (Observation, Action, Reward)
312
+ ├── environment.py # SchedulingOptEnv core (reset / step / state + instance bank)
313
+ ├── server.py # FastAPI HTTP server (7 endpoints)
314
+ ├── baseline.py # GPT-4o-mini baseline with oracle fallback
315
+ ├── Dockerfile # Container definition (python:3.11-slim, port 7860)
316
+ ├── requirements.txt # Python dependencies
317
+ ├── tasks/
318
+ │   ├── __init__.py # Task module exports
319
+ │   ├── task1_easy.py # Feasibility check: episode runner + instance accessor
320
+ │   ├── task2_medium.py # Conflict classification: episode runner + instance accessor
321
+ │   └── task3_hard.py # Schedule repair: episode runner + instance accessor
322
+ └── graders/
323
+     ├── __init__.py # Grader exports (FeasibilityGrader, ConflictGrader, RepairGrader)
324
+     ├── grader_detection.py # Grader: feasibility (binary, synonym-aware)
325
+     ├── grader_classification.py # Grader: conflict classification (family-aware partial credit)
326
+     └── grader_fix.py # Grader: schedule repair (4-component additive reward)
327
+ ```
328
+
329
+ ---
330
+
331
+ ## 9. Dependencies
332
+
333
+ | Package | Version | Purpose |
334
+ |---------|---------|---------|
335
+ | `fastapi` | ≥ 0.104 | HTTP server framework |
336
+ | `uvicorn` | ≥ 0.24 | ASGI server |
337
+ | `pydantic` | ≥ 2.5 | Data validation and serialisation |
338
+ | `openai` | ≥ 1.6 | LLM baseline inference |
339
+ | `pyyaml` | ≥ 6.0 | YAML manifest parsing |
340
+ | `httpx` | ≥ 0.25 | Async HTTP client |
341
+
342
+ ---
343
+
344
+ ## 10. References
345
+
346
+ [1] OpenEnv Framework. *Building Real-World AI Agent Training Environments*. Meta × Scaler Hackathon, 2026.
347
+
348
+ [2] Pinedo, M. L. *Scheduling: Theory, Algorithms, and Systems* (5th ed.). Springer, 2016.
349
+
350
+ [3] Garey, M. R., & Johnson, D. S. *Computers and Intractability: A Guide to the Theory of NP-Completeness*. W. H. Freeman, 1979.
351
+
352
+ [4] Zhang, C. et al. *Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning*. NeurIPS 2020.
353
+
354
+ [5] Kwon, Y.-D. et al. *POMO: Policy Optimization with Multiple Optima for Reinforcement Learning*. NeurIPS 2020.