---
title: Cognitive Load Manager
emoji: 🧠
colorFrom: yellow
colorTo: red
sdk: docker
app_file: server/app.py
pinned: false
tags: [openenv, rl, scheduling, agent-eval, productivity, multi-agent, grpo, reinforcement-learning]
---

# 🧠 Cognitive Load Manager

> **An AI that schedules work like a *good manager*: one that actually cares if you're tired.**

[![OpenEnv](https://img.shields.io/badge/Built_on-OpenEnv-brightgreen?style=for-the-badge)](#)
[![Hackathon](https://img.shields.io/badge/OpenEnv_Hackathon-April_2026-yellow?style=for-the-badge)](#)
[![Result](https://img.shields.io/badge/Reward_Lift-+163%25-orange?style=for-the-badge)](#)


## 🎥 See It In 2 Minutes

| | |
|---|---|
| 🎬 **Project walkthrough** | 👉 [Watch on Loom](https://www.loom.com/share/7c7293efa0ba459ba2de243b0b5aacb2) |
| 📊 **Live dashboard demo** | 👉 [Watch the demo](https://drive.google.com/file/d/149dz_1rIlXv-eR1fwYaxRJ-cV0mQNevJ/view?usp=sharing) |


## 🤔 The Problem

Most productivity tools tell you **what** to do.
None of them care **how you're feeling** while doing it.

- Running on 4 hours of sleep? Doesn't matter.
- Just finished three back-to-back meetings? Doesn't matter.
- Operating at 40% because the last task drained you? Doesn't matter.

Real performance isn't a straight line. Fatigue piles up. Stress carries over. Switching between tasks costs you more than you think.

**We built an AI that learns to notice all of that, and schedule around it.**


## ✨ What Makes It Special

This is the moment that made the whole project worth it:

> **The AI started giving workers breaks *before* they burned out, not after.**
>
> Nobody told it to do that. It figured it out on its own.

That's the difference between a scheduler that optimizes hours and a manager that actually understands people.


## 🛠️ How It Works (In Plain English)

Imagine a simulated office with:

- 👥 **Three workers**: each with their own energy, stress, and fatigue
- 🧑‍💼 **One manager (the AI)**: deciding who does what, and when to call a break
- 📋 **A pile of tasks**: emails, code reviews, reports, meetings, with real deadlines

The AI plays the manager role. Push too hard, workers burn out and quality crashes. Push too soft, deadlines slip. The AI has to find the sweet spot, and keep finding it as the day changes.

And the day **does** change. Mid-shift, a "Production server down!" alert can fire and suddenly every code review is critical. The AI has to adapt on the fly.


## 🗺️ How The Pieces Fit Together

```mermaid
flowchart TB
    AI["🧠 <b>AI Manager</b><br/><i>Qwen 1.5B</i><br/>decides who does what"]

    subgraph SIM["🏢 Simulated Workday"]
        direction LR
        W1["👤 <b>Worker 1</b><br/>energy · stress · fatigue"]
        W2["👤 <b>Worker 2</b><br/>energy · stress · fatigue"]
        W3["👤 <b>Worker 3</b><br/>energy · stress · fatigue"]
        TP["📋 <b>Task Pool</b><br/>emails · reviews<br/>reports · meetings"]
        EV["⚡ <b>Live Events</b><br/>deadline shifts<br/>urgent interrupts"]
    end

    DASH["📊 <b>Live Dashboard</b><br/>watch it think<br/>in real time"]

    TR["🎯 <b>GRPO Training</b><br/><i>Hugging Face TRL</i><br/>1000 steps · +163% lift"]

    AI -- "assigns · focuses<br/>breaks · delays" --> SIM
    SIM -- "observation +<br/>reward signal" --> AI
    SIM -- "live state" --> DASH
    AI -. "rollouts" .-> TR
    TR -. "smarter weights" .-> AI

    classDef ai fill:#9b87f5,stroke:#5b3fc4,stroke-width:3px,color:#fff
    classDef worker fill:#dbeafe,stroke:#3b82f6,stroke-width:2px,color:#1e3a8a
    classDef task fill:#fce7f3,stroke:#ec4899,stroke-width:2px,color:#831843
    classDef event fill:#fee2e2,stroke:#ef4444,stroke-width:2px,color:#7f1d1d
    classDef train fill:#d1fae5,stroke:#10b981,stroke-width:2px,color:#064e3b
    classDef dash fill:#e0e7ff,stroke:#6366f1,stroke-width:2px,color:#312e81
    classDef sim fill:#fef9c3,stroke:#eab308,stroke-width:2px,color:#713f12

    class AI ai
    class W1,W2,W3 worker
    class TP task
    class EV event
    class TR train
    class DASH dash
    class SIM sim
```

**The loop in plain English:**

1. 🧠 **The AI looks** at the workday: who's tired, what's due, what just blew up.
2. 🎯 **It makes a call**: assign, focus, break, switch, or wait.
3. 🏢 **The simulated office reacts**: workers gain progress or burn out, deadlines pass.
4. ↩️ **A reward comes back**: high if the call was wise, low if it wasn't.
5. 🔁 **GRPO uses those rewards** to nudge the AI toward better decisions next time.

After 1000 training steps, the AI scores roughly **5× higher than random guessing**.
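
For readers who prefer code, steps 1 to 4 of the loop can be sketched with a toy environment. Everything here is a hypothetical stand-in: `ToyWorkday`, its single worker, the reward numbers, and both policies are illustrative assumptions, not the project's actual environment or API. The GRPO update in step 5 is left out.

```python
import random
from dataclasses import dataclass

ACTIONS = ["work", "focus", "break", "switch", "delay"]


@dataclass
class ToyWorkday:
    """Hypothetical stand-in for the real environment: one worker, one task."""
    energy: float = 1.0
    progress: float = 0.0
    steps_left: int = 20

    def observe(self) -> dict:
        # Step 1: what the manager gets to look at.
        return {"energy": self.energy, "progress": self.progress,
                "steps_left": self.steps_left}

    def step(self, action: str) -> float:
        """Step 3 and 4: apply the action and return a reward (made-up numbers)."""
        self.steps_left -= 1
        before = self.progress
        if action == "break":
            self.energy = min(1.0, self.energy + 0.3)
        elif action in ("work", "focus"):
            # Tired workers make less progress per step.
            self.progress = min(1.0, self.progress + 0.1 * self.energy)
            self.energy = max(0.0, self.energy - 0.2)
        # "switch" and "delay" change nothing in this toy version.
        return self.progress - before


def run_episode(policy, seed: int = 0) -> float:
    """Run one simulated day and return the total reward."""
    rng = random.Random(seed)
    env = ToyWorkday()
    total = 0.0
    while env.steps_left > 0:
        action = policy(env.observe(), rng)  # Step 2: make a call.
        total += env.step(action)
    return total


def random_policy(obs: dict, rng: random.Random) -> str:
    return rng.choice(ACTIONS)


def rest_when_tired(obs: dict, rng: random.Random) -> str:
    return "break" if obs["energy"] < 0.4 else "work"


if __name__ == "__main__":
    print("random:", run_episode(random_policy))
    print("rest when tired:", run_episode(rest_when_tired))
```

In the real project the policy is the language model and the rewards feed a GRPO trainer; the hand-written `rest_when_tired` rule only shows the shape of the interaction.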


## 📈 The Results

After training the AI for 1000 steps:

| | Score | What it means |
|---|---|---|
| 🎲 Random guessing | ~0.05 | Total chaos |
| 🤖 Untrained AI | 0.101 | Mediocre |
| ✅ **Our trained AI** | **0.265** | **5× better than random, +163% lift** |
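
The two headline numbers can be checked from the table with a few lines of arithmetic. This sketch assumes the lift is measured against the untrained AI (0.101) and the "5×" against random guessing (~0.05); the table does not say so explicitly. Because the table values are rounded, the computed lift lands close to the reported +163% rather than exactly on it.

```python
# Scores copied from the results table (the random baseline is approximate).
random_score = 0.05
untrained_score = 0.101
trained_score = 0.265


def relative_lift(new: float, old: float) -> float:
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100.0


# Assumption: the reported lift compares trained vs. untrained.
lift_vs_untrained = relative_lift(trained_score, untrained_score)
# Assumption: the "5x" claim compares trained vs. random guessing.
ratio_vs_random = trained_score / random_score

print(f"lift over untrained: {lift_vs_untrained:+.1f}%")
print(f"ratio vs random:     {ratio_vs_random:.1f}x")
```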

What it learned without being told:

- ⏸️ Insert breaks *before* burnout, not after
- 🎯 Protect deep-focus time: don't yank workers off mid-task
- 🚨 Adapt instantly when priorities flip mid-day

👉 [Watch the full dashboard demo](https://drive.google.com/file/d/149dz_1rIlXv-eR1fwYaxRJ-cV0mQNevJ/view?usp=sharing)


## 🔭 Why This Matters

Today, AI tools schedule meetings and triage tickets, but they treat people like robots. Cognitive Load Manager (CLM) is a step toward AI that schedules **for humans, not over them**.

The same idea plugs into:

- 📅 **Work tools**: tools like Slack, Linear, and Notion that understand worker capacity
- 🎓 **Education**: tutors that notice when a student is overloaded, not just behind
- 🏥 **Healthcare**: staff schedulers that catch fatigue before it becomes errors


## 🚀 Try It

| | |
|---|---|
| 📓 **Re-run our training in your browser** | 👉 [Open in Colab](https://colab.research.google.com/drive/1_OoW4iH1acCni0H9POCcX2pp-6bOorzo?usp=sharing) |
| 🤗 **Live environment** | This Hugging Face Space |
| 📝 **The full build story** | [`blog.md`](./blog.md) |


<details>
<summary><strong>🛠️ For Developers: Technical Details</strong></summary>

### Stack

- **Environment:** OpenEnv-compatible RL environment (FastAPI backend, Docker)
- **Training:** Hugging Face TRL with GRPO on **Qwen 1.5B**
- **Frontend:** React live dashboard
- **Difficulty levels:** easy, medium, hard, expert (with deadlines, dependency chains, mid-episode interruptions)

### Actions

| Action | Description |
|---|---|
| `work` | Work on a task at normal pace |
| `focus` | Deep-work mode: 2× progress, 2× energy cost |
| `break` | Rest: +energy, −stress |
| `switch` | Change active task (small penalty) |
| `delay` | Wait one step |
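
As a rough illustration of the table, the five actions could update a worker's state like this. It is a minimal sketch, not the project's environment code: `WorkerState`, `apply_action`, and every numeric magnitude are hypothetical, and only the 2× ratios for `focus` come from the table. Charging the switch penalty to energy is also an assumption.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class WorkerState:
    energy: float = 1.0        # 0.0 = exhausted, 1.0 = fresh
    stress: float = 0.0        # 0.0 = calm, 1.0 = maxed out
    progress: float = 0.0      # progress on the active task, 0.0 to 1.0
    active_task: str = "email"


# Hypothetical magnitudes. Only the 2x ratios for `focus` come from the table.
BASE_PROGRESS = 0.10
BASE_ENERGY_COST = 0.05
BREAK_ENERGY_GAIN = 0.20
BREAK_STRESS_DROP = 0.15
SWITCH_PENALTY = 0.02


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def apply_action(state: WorkerState, action: str,
                 new_task: Optional[str] = None) -> WorkerState:
    """Return the worker state after one step of the given action."""
    if action in ("work", "focus"):
        scale = 2.0 if action == "focus" else 1.0  # focus: 2x progress, 2x cost
        return replace(
            state,
            progress=clamp01(state.progress + scale * BASE_PROGRESS),
            energy=clamp01(state.energy - scale * BASE_ENERGY_COST),
        )
    if action == "break":
        return replace(
            state,
            energy=clamp01(state.energy + BREAK_ENERGY_GAIN),
            stress=clamp01(state.stress - BREAK_STRESS_DROP),
        )
    if action == "switch":
        # Assumption: the small penalty is charged to energy.
        # Per-task progress tracking is omitted to keep the sketch short.
        return replace(
            state,
            active_task=new_task or state.active_task,
            energy=clamp01(state.energy - SWITCH_PENALTY),
        )
    if action == "delay":
        return state  # wait one step, nothing changes
    raise ValueError(f"unknown action: {action!r}")
```

Keeping the state immutable makes it easy to compare "what if" outcomes for the same worker, for example `apply_action(s, "work")` against `apply_action(s, "focus")`.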

### Scoring Formula

```
score = completion×0.60 + deadline×0.22 + energy×0.10 + dependency×0.05 + interruption×0.03
```

Score is always in (0.01, 0.99).
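
A minimal sketch of the formula in Python follows. The weights are the ones listed above. The function name `clm_score`, the assumption that each component is a value between 0 and 1, and the use of clamping to keep the result within 0.01 to 0.99 are illustrative guesses; the actual grader may enforce the range differently.

```python
from typing import Mapping

# Weights from the scoring formula; they sum to 1.00.
WEIGHTS = {
    "completion": 0.60,
    "deadline": 0.22,
    "energy": 0.10,
    "dependency": 0.05,
    "interruption": 0.03,
}


def clm_score(components: Mapping[str, float]) -> float:
    """Weighted sum of the five components, kept within 0.01 and 0.99.

    Assumption: each component is already normalized to the range 0 to 1,
    and the bounds are enforced by clamping.
    """
    raw = sum(weight * components[name] for name, weight in WEIGHTS.items())
    return min(0.99, max(0.01, raw))


if __name__ == "__main__":
    example = {
        "completion": 0.8,
        "deadline": 0.5,
        "energy": 0.6,
        "dependency": 1.0,
        "interruption": 0.0,
    }
    print(round(clm_score(example), 3))
```

Because completion carries 60% of the weight, an agent cannot score well by resting alone; the energy term only rewards finishing the day with something left in the tank.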

### Quick Setup

```bash
# Docker
docker build -t clm-env . && docker run -p 7860:7860 clm-env

# Local
pip install -r requirements.txt
uvicorn server.app:app --port 7860 --reload

# React dashboard
cd frontend && npm install && npm run dev
```

### Environment Variables

| Variable | Description |
|---|---|
| `API_BASE_URL` | LLM API endpoint |
| `MODEL_NAME` | Model identifier |
| `HF_TOKEN` | Hugging Face API token |
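
One way a script might read these variables is sketched below. The helper names `load_settings` and `missing_settings` are hypothetical and not part of the repository, and which variables are strictly required is not stated here, so the sketch only reports what is missing and leaves the decision to the caller.

```python
import os
from typing import Dict, List, Mapping, Optional

# The three variables documented in the table above.
VARIABLES = ("API_BASE_URL", "MODEL_NAME", "HF_TOKEN")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Optional[str]]:
    """Return the documented variables, with None for any that are unset or empty."""
    return {name: env.get(name) or None for name in VARIABLES}


def missing_settings(settings: Mapping[str, Optional[str]]) -> List[str]:
    """List the variables that have no value."""
    return [name for name, value in settings.items() if value is None]


if __name__ == "__main__":
    settings = load_settings()
    missing = missing_settings(settings)
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All variables are set.")
```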

### Project Structure

```
cognitive-load-manager/
├── models.py          ← Core environment
├── inference.py       ← Baseline LLM agent
├── openenv.yaml       ← OpenEnv spec
├── backend/main.py    ← FastAPI server
├── grader/            ← Difficulty graders
└── frontend/          ← React dashboard
```

For the full technical write-up (observation space, reward shaping table, training loop, and the v1→v2→v3 reward-tuning story), see [`blog.md`](./blog.md).

</details>


<p align="center">
  <em>Built for the OpenEnv Hackathon, April 2026.</em><br/>
  <strong>🧠 Scheduling that respects the humans doing the work.</strong>
</p>