| --- |
| title: Cognitive Load Manager |
| emoji: 🧠 |
| colorFrom: yellow |
| colorTo: red |
| sdk: docker |
| app_file: server/app.py |
| pinned: false |
| tags: [openenv, rl, scheduling, agent-eval, productivity, multi-agent, grpo, reinforcement-learning] |
| --- |
| |
| # 🧠 Cognitive Load Manager |
|
|
| > **An AI that schedules work like a *good manager*, one that actually cares if you're tired.** |
|
|
|
|
|
|
| ## 🎥 See It In 2 Minutes |
|
|
| | | | |
| |---|---| |
| 🎬 **Project walkthrough** | 👉 [Watch on Loom](https://www.loom.com/share/7c7293efa0ba459ba2de243b0b5aacb2) |
| 📊 **Live dashboard demo** | 👉 [Watch the demo](https://drive.google.com/file/d/149dz_1rIlXv-eR1fwYaxRJ-cV0mQNevJ/view?usp=sharing) |
|
|
|
|
| ## 🤔 The Problem |
|
|
| Most productivity tools tell you **what** to do. |
| None of them care **how you're feeling** while doing it. |
|
|
| - Running on 4 hours of sleep? Doesn't matter. |
| - Just finished three back-to-back meetings? Doesn't matter. |
| - Operating at 40% because the last task drained you? Doesn't matter. |
|
|
| Real performance isn't a straight line. Fatigue piles up. Stress carries over. Switching between tasks costs you more than you think. |
|
|
| **We built an AI that learns to notice all of that and schedule around it.** |
|
|
|
|
| ## ✨ What Makes It Special |
|
|
| This is the moment that made the whole project worth it: |
|
|
| > **The AI started giving workers breaks *before* they burned out, not after.** |
| > |
| > Nobody told it to do that. It figured it out on its own. |
|
|
| That's the difference between a scheduler that optimizes hours and a manager that actually understands people. |
|
|
|
|
| ## 🛠️ How It Works (In Plain English) |
|
|
| Imagine a simulated office with: |
|
|
| - 👥 **Three workers**: each with their own energy, stress, and fatigue |
| - 🧑‍💼 **One manager (the AI)**: deciding who does what, and when to call a break |
| - 📋 **A pile of tasks**: emails, code reviews, reports, meetings, with real deadlines |
|
|
| The AI plays the manager role. Push too hard, workers burn out and quality crashes. Push too soft, deadlines slip. The AI has to find the sweet spot and keep finding it as the day changes. |
|
|
| And the day **does** change. Mid-shift, a "Production server down!" alert can fire and suddenly every code review is critical. The AI has to adapt on the fly. |
|
|
|
|
| ## 🗺️ How The Pieces Fit Together |
|
|
| ```mermaid |
| flowchart TB |
| AI["🧠 <b>AI Manager</b><br/><i>Qwen 1.5B</i><br/>decides who does what"] |
| |
| subgraph SIM["🏢 Simulated Workday"] |
| direction LR |
| W1["👤 <b>Worker 1</b><br/>energy · stress · fatigue"] |
| W2["👤 <b>Worker 2</b><br/>energy · stress · fatigue"] |
| W3["👤 <b>Worker 3</b><br/>energy · stress · fatigue"] |
| TP["📋 <b>Task Pool</b><br/>emails · reviews<br/>reports · meetings"] |
| EV["⚡ <b>Live Events</b><br/>deadline shifts<br/>urgent interrupts"] |
| end |
| |
| DASH["📊 <b>Live Dashboard</b><br/>watch it think<br/>in real time"] |
| |
| TR["🎯 <b>GRPO Training</b><br/><i>Hugging Face TRL</i><br/>1000 steps · +163% lift"] |
| |
| AI -- "assigns · focuses<br/>breaks · delays" --> SIM |
| SIM -- "observation +<br/>reward signal" --> AI |
| SIM -- "live state" --> DASH |
| AI -. "rollouts" .-> TR |
| TR -. "smarter weights" .-> AI |
| |
| classDef ai fill:#9b87f5,stroke:#5b3fc4,stroke-width:3px,color:#fff |
| classDef worker fill:#dbeafe,stroke:#3b82f6,stroke-width:2px,color:#1e3a8a |
| classDef task fill:#fce7f3,stroke:#ec4899,stroke-width:2px,color:#831843 |
| classDef event fill:#fee2e2,stroke:#ef4444,stroke-width:2px,color:#7f1d1d |
| classDef train fill:#d1fae5,stroke:#10b981,stroke-width:2px,color:#064e3b |
| classDef dash fill:#e0e7ff,stroke:#6366f1,stroke-width:2px,color:#312e81 |
| classDef sim fill:#fef9c3,stroke:#eab308,stroke-width:2px,color:#713f12 |
| |
| class AI ai |
| class W1,W2,W3 worker |
| class TP task |
| class EV event |
| class TR train |
| class DASH dash |
| class SIM sim |
| ``` |
|
|
| **The loop in plain English:** |
|
|
| 1. 🧠 **The AI looks** at the workday: who's tired, what's due, what just blew up. |
| 2. 🎯 **It makes a call**: assign, focus, break, switch, or wait. |
| 3. 🏢 **The simulated office reacts**: workers make progress or burn out, deadlines pass. |
| 4. ↩️ **A reward comes back**: high if the call was wise, low if it wasn't. |
| 5. 🔁 **GRPO uses those rewards** to nudge the AI toward better decisions next time. |
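
The loop above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the project's environment: `ToyWorkday`, its single-worker state, and every number in it are hypothetical, and `group_advantages` only illustrates the group-relative scoring idea behind GRPO (the actual training runs through Hugging Face TRL).

```python
import random
import statistics

ACTIONS = ["work", "focus", "break", "switch", "delay"]


class ToyWorkday:
    """Hypothetical one-worker stand-in for the simulated office."""

    def __init__(self) -> None:
        self.energy = 1.0
        self.progress = 0.0

    def observe(self) -> dict:
        return {"energy": self.energy, "progress": self.progress}

    def step(self, action: str) -> float:
        """Apply one action and return a reward (illustrative numbers only)."""
        if action == "break":
            self.energy = min(1.0, self.energy + 0.3)
            return 0.0
        if action in ("work", "focus"):
            rate = 2.0 if action == "focus" else 1.0
            cost = 0.1 * rate
            if self.energy < cost:
                return -1.0  # pushed a drained worker: burnout penalty
            self.energy -= cost
            self.progress += 0.1 * rate
            return 0.1 * rate
        return 0.0  # "switch" and "delay" make no progress in this toy


def rollout(policy, steps: int = 10) -> float:
    """Steps 1-4: observe, act, let the office react, collect the reward."""
    env = ToyWorkday()
    total = 0.0
    for _ in range(steps):
        action = policy(env.observe())
        total += env.step(action)
    return total


def group_advantages(rewards: list[float]) -> list[float]:
    """Step 5: score each rollout relative to its group's mean and spread."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + 1e-8) for r in rewards]


def random_policy(obs: dict) -> str:
    return random.choice(ACTIONS)


random.seed(0)
rewards = [rollout(random_policy) for _ in range(4)]
advantages = group_advantages(rewards)
```

Rollouts with above-average reward get a positive advantage and are reinforced; below-average ones are discouraged. Even in this toy, a policy that rests before energy runs out outscores one that always picks `focus`.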
|
|
| After 1000 training steps, the AI scores about **5× better than random guessing**. |
|
|
|
|
| ## 📊 The Results |
|
|
| After training the AI for 1000 steps: |
|
|
| | | Score | What it means | |
| |---|---|---| |
| 🎲 Random guessing | ~0.05 | Total chaos |
| 🤖 Untrained AI | 0.101 | Mediocre |
| ✅ **Our trained AI** | **0.265** | **5× better than random, +163% lift** |
|
|
| What it learned without being told: |
|
|
| - ⏸️ Insert breaks *before* burnout, not after |
| - 🎯 Protect deep-focus time: don't yank workers off mid-task |
| - 🚨 Adapt instantly when priorities flip mid-day |
|
|
| 👉 [Watch the full dashboard demo](https://drive.google.com/file/d/149dz_1rIlXv-eR1fwYaxRJ-cV0mQNevJ/view?usp=sharing) |
|
|
|
|
| ## 🌍 Why This Matters |
|
|
| Today, AI tools schedule meetings and triage tickets, but they treat people like robots. CLM is a step toward AI that schedules **for humans, not over them**. |
|
|
| The same idea plugs into: |
|
|
| - 📅 **Work tools**: tools like Slack, Linear, and Notion that understand worker capacity |
| - 🎓 **Education**: tutors that notice when a student is overloaded, not just behind |
| - 🏥 **Healthcare**: staff schedulers that catch fatigue before it becomes errors |
|
|
|
|
| ## 🚀 Try It |
|
|
| | | | |
| |---|---| |
| 📓 **Re-run our training in your browser** | 👉 [Open in Colab](https://colab.research.google.com/drive/1_OoW4iH1acCni0H9POCcX2pp-6bOorzo?usp=sharing) |
| 🤗 **Live environment** | This Hugging Face Space |
| 📖 **The full build story** | [`blog.md`](./blog.md) |
|
|
|
|
| <details> |
| <summary><strong>🛠️ For Developers: Technical Details</strong></summary> |
|
|
| ### Stack |
|
|
| - **Environment:** OpenEnv-compatible RL environment (FastAPI backend, Docker) |
| - **Training:** Hugging Face TRL with GRPO on **Qwen 1.5B** |
| - **Frontend:** React live dashboard |
| - **Difficulty levels:** easy, medium, hard, expert (with deadlines, dependency chains, mid-episode interruptions) |
|
|
| ### Actions |
|
|
| | Action | Description | |
| |---|---| |
| | `work` | Work on a task at normal pace | |
| | `focus` | Deep-work mode: 2× progress, 2× energy cost |
| | `break` | Rest: +energy, −stress |
| | `switch` | Change active task (small penalty) | |
| | `delay` | Wait one step | |
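
One way to keep an LLM agent inside this action set is to parse its raw text into an enum. A minimal sketch: the `Action` values mirror the table, while `parse_action`, the `MULTIPLIERS` layout, and the fallback to `delay` on unrecognized output are assumptions for illustration, not the project's code.

```python
from enum import Enum


class Action(str, Enum):
    """The five actions from the table above."""

    WORK = "work"
    FOCUS = "focus"
    BREAK = "break"
    SWITCH = "switch"
    DELAY = "delay"


# (progress multiplier, energy-cost multiplier) relative to `work`,
# taken from the table: focus doubles both.
MULTIPLIERS = {Action.WORK: (1.0, 1.0), Action.FOCUS: (2.0, 2.0)}


def parse_action(text: str) -> Action:
    """Map raw model output to an Action.

    Unknown text falls back to DELAY (an assumption, chosen because
    waiting one step is the least destructive default).
    """
    try:
        return Action(text.strip().lower())
    except ValueError:
        return Action.DELAY
```

For example, `parse_action(" Focus ")` yields `Action.FOCUS`, while an off-menu reply such as `"nap"` yields `Action.DELAY`.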
|
|
| ### Scoring Formula |
|
|
| ``` |
| score = completion×0.60 + deadline×0.22 + energy×0.10 + dependency×0.05 + interruption×0.03 |
| ``` |
|
|
| The weights sum to 1.00, and the score is always in (0.01, 0.99). |
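
The formula translates directly into a weighted sum. The sketch below assumes each component is already normalized to the range 0 to 1 and that the score is kept in range by clamping; both are assumptions, and the grader may bound the score differently. The names `WEIGHTS` and `score` are illustrative.

```python
# Weights copied from the scoring formula above.
WEIGHTS = {
    "completion": 0.60,
    "deadline": 0.22,
    "energy": 0.10,
    "dependency": 0.05,
    "interruption": 0.03,
}


def score(components: dict[str, float]) -> float:
    """Weighted sum of the five components, clamped to [0.01, 0.99].

    Assumes every component value lies in [0, 1]. Clamping is an
    assumption about how the score is kept inside its stated range.
    """
    raw = sum(weight * components[name] for name, weight in WEIGHTS.items())
    return min(0.99, max(0.01, raw))


example = score(
    {"completion": 0.5, "deadline": 0.0, "energy": 0.0,
     "dependency": 0.0, "interruption": 0.0}
)
```

Because completion carries 60% of the weight, finishing half the work with nothing else going right already lands near 0.30 in this sketch.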
|
|
| ### Quick Setup |
|
|
| ```bash |
| # Docker |
| docker build -t clm-env . && docker run -p 7860:7860 clm-env |
| |
| # Local |
| pip install -r requirements.txt |
| uvicorn server.app:app --port 7860 --reload |
| |
| # React dashboard |
| cd frontend && npm install && npm run dev |
| ``` |
|
|
| ### Environment Variables |
|
|
| | Variable | Description | |
| |---|---| |
| | `API_BASE_URL` | LLM API endpoint | |
| | `MODEL_NAME` | Model identifier | |
| | `HF_TOKEN` | Hugging Face API token | |
|
|
| ### Project Structure |
|
|
| ``` |
| cognitive-load-manager/ |
| ├── models.py        ← Core environment |
| ├── inference.py     ← Baseline LLM agent |
| ├── openenv.yaml     ← OpenEnv spec |
| ├── backend/main.py  ← FastAPI server |
| ├── grader/          ← Difficulty graders |
| └── frontend/        ← React dashboard |
| ``` |
|
|
| For the full technical write-up (observation space, reward shaping table, training loop, and the v1→v2→v3 reward-tuning story), see [`blog.md`](./blog.md). |
|
|
| </details> |
|
|
|
|
| <p align="center"> |
| <em>Built for the OpenEnv Hackathon, April 2026.</em><br/> |
| <strong>🧠 Scheduling that respects the humans doing the work.</strong> |
| </p> |
|
|