---
title: Smart Factory Scheduling Environment
emoji: 🏭
colorFrom: blue
colorTo: green
sdk: docker
tags:
  - openenv
  - reinforcement-learning
  - scheduling
pinned: false
---

# Smart Factory Scheduling Environment

An [OpenEnv](https://github.com/openenv/openenv)-compliant RL environment simulating real-world industrial scheduling: assign jobs to machines, handle breakdowns, and maximise throughput within deadlines.

## Observation Space

| Field | Type | Description |
|-------|------|-------------|
| `machines` | List[Machine] | id, status (idle/busy/broken), current_job, failure_rate |
| `pending_jobs` | List[Job] | id, remaining_time, deadline, priority (1-3), assigned_machine |
| `completed_jobs` | List[Job] | Jobs finished this episode |
| `time` | int | Current time step |
| `max_steps` | int | Episode length |
| `done` | bool | Episode terminated |
| `reward` | float | Reward from last action |
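
The fields above can be mirrored as plain dataclasses when writing an agent or a test. This is only a sketch: the class names, the integer id types, and the two helper functions are assumptions for illustration, and the real models in `factory_env/models.py` may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Machine:
    id: int                            # assumed to be an integer id
    status: str                        # "idle", "busy", or "broken"
    current_job: Optional[int] = None
    failure_rate: float = 0.0


@dataclass
class Job:
    id: int                            # assumed to be an integer id
    remaining_time: int
    deadline: int
    priority: int                      # 1-3
    assigned_machine: Optional[int] = None


@dataclass
class Observation:
    machines: List[Machine]
    pending_jobs: List[Job]
    completed_jobs: List[Job] = field(default_factory=list)
    time: int = 0
    max_steps: int = 20
    done: bool = False
    reward: float = 0.0


def idle_machines(obs: Observation) -> List[Machine]:
    """Hypothetical helper: machines that can accept a job right now."""
    return [m for m in obs.machines if m.status == "idle"]


def unassigned_jobs(obs: Observation) -> List[Job]:
    """Hypothetical helper: pending jobs not yet placed on a machine."""
    return [j for j in obs.pending_jobs if j.assigned_machine is None]
```

A simple policy could pair the first entry of `unassigned_jobs(obs)` with the first entry of `idle_machines(obs)` and fall back to `wait` when either list is empty.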

## Action Space

| Action | Effect |
|--------|--------|
| `assign_job <job_id> <machine_id>` | Assign pending job to idle machine |
| `repair <machine_id>` | Restore broken machine to idle |
| `wait` | Advance time with no change |
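
The three action strings in the table can be validated before they are sent to the environment. The parser below is a minimal sketch, not the environment's own parsing code: `ParsedAction` and `parse_action` are hypothetical names, and ids are kept as strings because the id format is not specified here.

```python
from typing import NamedTuple, Optional


class ParsedAction(NamedTuple):
    kind: str                          # "assign_job", "repair", or "wait"
    job_id: Optional[str] = None
    machine_id: Optional[str] = None


def parse_action(text: str) -> ParsedAction:
    """Split an action string into its parts; raise ValueError if malformed."""
    parts = text.strip().split()
    if parts == ["wait"]:
        return ParsedAction("wait")
    if len(parts) == 2 and parts[0] == "repair":
        return ParsedAction("repair", machine_id=parts[1])
    if len(parts) == 3 and parts[0] == "assign_job":
        return ParsedAction("assign_job", job_id=parts[1], machine_id=parts[2])
    raise ValueError(f"unrecognised action: {text!r}")
```

Rejecting malformed strings on the client side avoids spending a step on an invalid action, which the reward table penalises.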

## Reward Function

| Event | Reward |
|-------|--------|
| Job completed on time | +1.00 + 0.20 × priority |
| Job completed late | +0.30 |
| Valid assignment | +0.10 |
| Invalid action | −0.10 |
| Idle machine (pending jobs exist) | −0.05 per machine |
| Job past deadline | −0.10 per step |
| Repair broken machine | +0.05 |
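
To make the arithmetic concrete, the completion and per-step terms of the table can be sketched as two small functions. The function names are hypothetical, and two readings are assumed: "on time" means the job finishes at or before its deadline, and the past-deadline penalty is charged once per overdue job per step. The actual logic in `factory_env/env.py` may differ.

```python
def completion_reward(finish_time: int, deadline: int, priority: int) -> float:
    """Reward granted when a job completes (assumes on time means finish <= deadline)."""
    if finish_time <= deadline:
        return 1.00 + 0.20 * priority
    return 0.30


def step_penalty(idle_machines: int, pending_jobs: int, overdue_jobs: int) -> float:
    """Per-step penalties (assumes the overdue penalty applies per overdue job)."""
    penalty = 0.0
    if pending_jobs > 0:
        # Idle machines are only penalised while work is waiting.
        penalty -= 0.05 * idle_machines
    penalty -= 0.10 * overdue_jobs
    return penalty
```

Under these assumptions, a priority-3 job finished on time is worth 1.00 + 0.20 × 3 = 1.60, while the same job finished late is worth 0.30 regardless of priority.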

## Tasks

| Task | Machines | Jobs | Failure Rate | Max Steps | Baseline Score |
|------|----------|------|-------------|-----------|----------------|
| easy | 2 | 3 | 0% | 20 | 1.000 |
| medium | 4 | 7 | 8% | 30 | ~0.557 |
| hard | 6 | 12 | 15% | 40 | ~0.457 |

**Score formula:** `0.5 × completion_rate + 0.3 × on_time_rate + 0.2 × utilization_bonus`
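
The weighted sum can be written out directly. This sketch assumes each of the three inputs is already normalised to the range 0 to 1; how `grader.py` derives them from an episode is not shown here, and the function name is hypothetical.

```python
def episode_score(completion_rate: float, on_time_rate: float,
                  utilization_bonus: float) -> float:
    """Weighted score from the formula above (inputs assumed to lie in [0, 1])."""
    for name, value in (("completion_rate", completion_rate),
                        ("on_time_rate", on_time_rate),
                        ("utilization_bonus", utilization_bonus)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return 0.5 * completion_rate + 0.3 * on_time_rate + 0.2 * utilization_bonus
```

Because the weights sum to 1, an episode that maximises all three components scores 1.0. Completing every job but with only half on time and a utilization bonus of 0.5 would give 0.5 + 0.15 + 0.10 = 0.75.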

## Setup

```bash
pip install -r requirements.txt
```

### Run HTTP Server (HF Space)
```bash
python server.py
# Routes: GET /health  POST /reset  POST /step  GET /state  GET /schema
```
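
A client for these routes can be sketched with the standard library alone. The helper below only builds the request objects and does not contact the server. The base URL, the helper name, and the `{"action": ...}` body for `/step` are assumptions; the actual request schema should be read from `GET /schema`.

```python
import json
from typing import Optional
from urllib.request import Request


def build_request(base_url: str, method: str, route: str,
                  payload: Optional[dict] = None) -> Request:
    """Build (but do not send) a request for one of the server routes."""
    url = base_url.rstrip("/") + route
    if method == "GET":
        return Request(url, method="GET")
    # POST routes get a JSON body; an empty object is sent when no payload is given.
    body = json.dumps(payload or {}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


# Placeholder address: use the host and port that server.py actually binds to.
BASE_URL = "http://localhost:7860"
health = build_request(BASE_URL, "GET", "/health")
reset = build_request(BASE_URL, "POST", "/reset")
# The body shape below is an assumption, not the documented schema.
step = build_request(BASE_URL, "POST", "/step", {"action": "wait"})
```

Each request can then be sent with `urllib.request.urlopen(req)` once the server is running.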

### Run Inference (LLM agent)
```bash
export OPENAI_API_KEY=<your-key>
export FACTORY_TASK=easy   # easy | medium | hard
python inference.py
```

### Run RL Training
```bash
python train.py --task easy --episodes 10 --provider openai
python train.py --task medium --episodes 10 --provider claude
```

### Interactive Demo
```bash
python app.py   # opens at http://localhost:7860
```

### Docker
```bash
docker build -t factory-env .
docker run -e OPENAI_API_KEY=<key> -e FACTORY_TASK=easy -p 7860:7860 factory-env
```

## Baseline Scores

| Task | Score | Steps |
|------|-------|-------|
| easy | 1.000 | 4 |
| medium | ~0.529 | 12 |
| hard | ~0.533 | 34 |

## Project Structure

```
├── factory_env/
│   ├── env.py       # FactoryEnv (openenv.core.Environment)
│   ├── models.py    # FactoryAction, FactoryObservation, FactoryState
│   ├── tasks.py     # Task configurations
│   └── grader.py    # Score computation
├── inference.py     # LLM baseline agent
├── train.py         # Multi-episode RL training loop
├── server.py        # FastAPI HTTP server for HF Space
├── app.py           # Gradio interactive demo
├── openenv.yaml     # OpenEnv metadata
└── Dockerfile
```