DevanshuDon committed on
Commit
3ca1a90
·
verified ·
1 Parent(s): 9b75fb2

Upload README.md

  1. README.md +164 -0
README.md ADDED
@@ -0,0 +1,164 @@
---
title: ExecAssist
emoji: 📧
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
pinned: false
license: mit
tags:
- openenv
- rl
- executive-assistant
---

# ExecAssist - Executive Assistant Environment

An OpenEnv environment where AI agents learn to manage email and calendar for busy executives.

## Problem Statement

Every executive assistant juggles email, calendars, and scheduling conflicts daily. This environment simulates that exact challenge: read incoming requests, draft professional replies, book meetings, and resolve conflicts intelligently.

**Theme:** #3.2 - World Modeling (Personalized Tasks)

## Tasks

### Task 1: Easy - Simple Meeting Request
- **Challenge:** Single email with clear calendar availability
- **Agent must:** Draft polite reply + book meeting in open slot
- **Score:** 50% email quality + 50% scheduling correctness

### Task 2: Medium - Scheduling Conflict
- **Challenge:** Requested time is already booked
- **Agent must:** Identify conflict + propose 2-3 alternatives + explain professionally
- **Score:** 30% email quality + 40% conflict resolution + 30% scheduling

### Task 3: Hard - Multi-Party Coordination
- **Challenge:** 3 emails requesting meetings, some overlapping, priority conflicts
- **Agent must:** Prioritize + reschedule + notify all parties
- **Score:** 34% email + 33% scheduling + 33% conflict
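
The per-task score splits above amount to a weighted sum of component scores. The sketch below illustrates that arithmetic only; `TASK_WEIGHTS` and `combine_scores` are hypothetical names, and it assumes each component score already lies in [0, 1]. The environment's actual scoring code is not shown in this README.

```python
# Weights copied from the task descriptions above.
# Assumption: every component score is already in the range [0, 1].
TASK_WEIGHTS = {
    "easy": {"email": 0.50, "scheduling": 0.50},
    "medium": {"email": 0.30, "conflict": 0.40, "scheduling": 0.30},
    "hard": {"email": 0.34, "scheduling": 0.33, "conflict": 0.33},
}


def combine_scores(task: str, components: dict) -> float:
    """Weighted sum of component scores for one task (hypothetical helper)."""
    weights = TASK_WEIGHTS[task]
    # A component that was not scored counts as 0.0.
    return sum(weight * components.get(name, 0.0) for name, weight in weights.items())
```

For instance, a medium-task episode with a perfect email, a half-credit conflict resolution, and a failed booking would combine to 0.3 + 0.2 + 0.0 = 0.5.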

## Environment Design

### Observation Space
- **Emails:** Sender, subject, body, priority
- **Calendar:** Existing meetings, working hours, blocked times
- **Contacts:** Names, emails, timezones

### Action Space
```json
{
  "email_reply": "Professional response text",
  "calendar_action": "book | propose_alternatives | reschedule | decline",
  "meeting_details": {
    "participants": ["email@company.com"],
    "start_time": "2026-04-28T14:00:00",
    "end_time": "2026-04-28T15:00:00",
    "subject": "Meeting topic",
    "proposed_alternatives": [...]
  }
}
```
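
As a rough illustration of how a client might assemble an action in this shape, here is a minimal Python sketch. `build_action` and `ALLOWED_CALENDAR_ACTIONS` are hypothetical names, and the validation rules (ISO 8601 timestamps, end after start) are assumptions rather than documented server behavior. The entry format of `proposed_alternatives` is elided in the schema above, so the sketch leaves that list empty.

```python
import json
from datetime import datetime

# The four values listed for "calendar_action" in the schema above.
ALLOWED_CALENDAR_ACTIONS = {"book", "propose_alternatives", "reschedule", "decline"}


def build_action(email_reply, calendar_action, participants, start_time, end_time, subject):
    """Assemble an action dict in the documented shape (hypothetical helper)."""
    if calendar_action not in ALLOWED_CALENDAR_ACTIONS:
        raise ValueError(f"unknown calendar_action: {calendar_action!r}")
    # Assumption: timestamps are ISO 8601 strings like "2026-04-28T14:00:00".
    if datetime.fromisoformat(end_time) <= datetime.fromisoformat(start_time):
        raise ValueError("end_time must be after start_time")
    return {
        "email_reply": email_reply,
        "calendar_action": calendar_action,
        "meeting_details": {
            "participants": list(participants),
            "start_time": start_time,
            "end_time": end_time,
            "subject": subject,
            # Entry format is not specified in the README, so this stays empty.
            "proposed_alternatives": [],
        },
    }


action = build_action(
    "Thank you for reaching out. I have booked the slot you requested.",
    "book",
    ["email@company.com"],
    "2026-04-28T14:00:00",
    "2026-04-28T15:00:00",
    "Meeting topic",
)
payload = json.dumps(action)  # JSON string ready to send to the environment
```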

### Reward Functions (Multiple Independent Checks)

**1. Email Quality (0-1)**
- Politeness markers (thank you, regards)
- Proper greeting/closing
- Sufficient detail (20+ words)
- Professional tone (no negative framing)
- LLM-as-judge for nuance
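
The rule-based part of this check could look roughly like the sketch below. It is not the environment's implementation: the word lists are illustrative guesses, the four checks are weighted equally by assumption, only the greeting (not the closing) is inspected, and the LLM-as-judge step is left out entirely.

```python
# Illustrative word lists; the real marker lists are not given in the README.
POLITE_MARKERS = ("thank you", "thanks", "regards", "sincerely")
GREETINGS = ("hi", "hello", "dear")
NEGATIVE_PHRASES = ("unfortunately", "can't", "cannot", "impossible")


def email_quality(text: str) -> float:
    """Fraction of four heuristic checks that pass (hypothetical helper)."""
    lowered = text.lower()
    words = lowered.split()
    checks = [
        any(marker in lowered for marker in POLITE_MARKERS),   # politeness markers
        bool(words) and words[0].strip(",.!") in GREETINGS,    # opens with a greeting
        len(words) >= 20,                                      # sufficient detail
        not any(p in lowered for p in NEGATIVE_PHRASES),       # no negative framing
    ]
    return sum(checks) / len(checks)
```

A one-word reply such as "No." passes only the negative-framing check and scores 0.25 under these assumptions.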

**2. Scheduling Correctness (0-1)**
- No double-booking
- Within working hours
- Appropriate duration (15min - 2hrs)
- All participants included
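
A minimal sketch of these four checks follows. `scheduling_score` is a hypothetical helper; the 09:00 to 17:00 working day and the equal weighting of the checks are assumptions, since the README does not state either.

```python
from datetime import datetime, time, timedelta


def scheduling_score(start, end, existing, required, participants,
                     work_start=time(9, 0), work_end=time(17, 0)):
    """Fraction of the four scheduling checks that pass (hypothetical helper).

    `existing` is a list of (start, end) datetime pairs already on the calendar.
    Assumption: working hours default to 09:00-17:00 on a single day.
    """
    # Back-to-back meetings (one ends exactly when the next starts) do not overlap.
    no_double_booking = all(end <= b_start or start >= b_end for b_start, b_end in existing)
    within_hours = (start.date() == end.date()
                    and work_start <= start.time()
                    and end.time() <= work_end)
    duration_ok = timedelta(minutes=15) <= end - start <= timedelta(hours=2)
    everyone_included = set(required) <= set(participants)
    checks = [no_double_booking, within_hours, duration_ok, everyone_included]
    return sum(checks) / len(checks)
```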

**3. Conflict Resolution (0-1)**
- Recognizes conflicts
- Proposes 2-3 alternatives
- Explains professionally
- Prioritizes correctly (for hard task)
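
To make "proposes 2-3 alternatives" concrete, one simple way to generate candidate slots is a linear scan over the working day. This is a sketch under stated assumptions, not the environment's logic: `propose_alternatives` is a hypothetical helper, the 30-minute scan step is arbitrary, and priorities and time zones are ignored.

```python
from datetime import datetime, timedelta


def propose_alternatives(busy, day_start, day_end, duration,
                         limit=3, step=timedelta(minutes=30)):
    """Return up to `limit` free (start, end) slots (hypothetical helper).

    `busy` is a list of (start, end) datetime pairs that must not be overlapped.
    """
    slots = []
    cursor = day_start
    while cursor + duration <= day_end and len(slots) < limit:
        candidate_end = cursor + duration
        is_free = all(candidate_end <= b_start or cursor >= b_end
                      for b_start, b_end in busy)
        if is_free:
            slots.append((cursor, candidate_end))
            cursor = candidate_end  # keep proposed slots from overlapping each other
        else:
            cursor += step
    return slots
```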

**4. Anti-Reward Hacking Penalties**
- Too short email: -0.3
- Missing meeting details: -0.4
- Generic/templated: -0.1
- Overly long: -0.15
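
The penalty values above can be applied as simple subtractions from a base score. In the sketch below, the flag names and `apply_penalties` are hypothetical, and clamping the result to [0, 1] is an assumption; the README does not say whether penalized scores may go negative.

```python
# Penalty magnitudes copied from the list above; flag names are hypothetical.
PENALTIES = {
    "too_short": 0.3,
    "missing_meeting_details": 0.4,
    "generic": 0.1,
    "too_long": 0.15,
}


def apply_penalties(base_score, flags):
    """Subtract every triggered penalty, then clamp to [0, 1] (assumed behavior)."""
    total = sum(PENALTIES[flag] for flag in flags)
    return max(0.0, min(1.0, base_score - total))
```

Under these assumptions, a base score of 0.8 with a too-short, generic email drops to about 0.4.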

## Baseline Scores

### AI Baseline (Nemotron 3 Super 120B) - Untrained
| Task | Score |
|------|-------|
| Easy | 0.315 |
| Medium | 0.349 |
| Hard | 0.346 |
| **Average** | **0.337** |

*Note: These are pre-training scores. The model struggles with JSON formatting, conflict detection, and professional email composition. Training target: 0.60-0.80.*
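
The average in the table is the unweighted mean of the three task scores, which can be checked in a couple of lines:

```python
# Scores copied from the baseline table above.
scores = {"Easy": 0.315, "Medium": 0.349, "Hard": 0.346}

average = sum(scores.values()) / len(scores)
print(round(average, 3))  # 0.337
```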

## Setup & Usage

### Local Development

```bash
# Clone the repository
git clone https://huggingface.co/spaces/YourUsername/exec-assist
cd exec-assist

# Install dependencies
pip install -r requirements.txt

# Run the server
uvicorn server.app:app --reload

# Open API docs
# http://127.0.0.1:8000/docs
```

### Run Baseline Inference

```bash
# Set environment variables
export APIBASEURL=https://openrouter.ai/api/v1
export MODELNAME=nvidia/nemotron-3-super-120b-a12b:free
export HFTOKEN=your-api-key

# Run inference
python inference.py
```
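
How `inference.py` reads these variables is not shown in this README. As an illustrative sketch only, a script could fail fast when any of them is missing; `load_config` and `REQUIRED_VARS` are hypothetical names, and the variable names are taken verbatim from the commands above.

```python
import os

# Variable names exactly as exported in the shell snippet above.
REQUIRED_VARS = ("APIBASEURL", "MODELNAME", "HFTOKEN")


def load_config(env=None):
    """Collect the required settings, raising if any is unset (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.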

### Docker

```bash
docker build -t exec-assist .
docker run -p 7860:7860 exec-assist
```

## Training (In Progress - Apr 26)

We will train using TRL + Unsloth:
1. GRPO trainer setup
2. Reward shaping
3. Baseline comparison
4. Before/after examples
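
Since training is still in progress, the following is only a sketch of what one shaped reward might look like, not the project's code. It assumes the convention used by TRL's GRPO trainer, where a reward function receives a batch of completions plus extra keyword arguments and returns one float per completion, and it assumes completions arrive as plain strings. `json_format_reward` is a hypothetical name; the required keys come from the action schema in this README.

```python
import json

# Top-level keys from the action schema documented in this README.
REQUIRED_KEYS = {"email_reply", "calendar_action", "meeting_details"}


def json_format_reward(completions, **kwargs):
    """Return 1.0 for each completion that parses as an action-shaped JSON object.

    Assumption: each completion is a plain string, not a list of chat messages.
    """
    rewards = []
    for completion in completions:
        try:
            parsed = json.loads(completion)
        except (TypeError, ValueError):
            rewards.append(0.0)
            continue
        is_action = isinstance(parsed, dict) and REQUIRED_KEYS <= parsed.keys()
        rewards.append(1.0 if is_action else 0.0)
    return rewards
```

A format-only reward like this targets JSON formatting in isolation; it would be combined with the quality, scheduling, and conflict checks rather than used alone.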

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/reset?task=easy\|medium\|hard` | POST | Start new episode |
| `/step` | POST | Submit action, get reward |
| `/state` | GET | Current state |
| `/tasks` | GET | List all tasks |
| `/health` | GET | Health check |
| `/metadata` | GET | Environment info |
| `/schema` | GET | Action/observation/state schemas |
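
A client for these endpoints can be sketched with the Python standard library alone. The helper names below are hypothetical, the base URL assumes the Docker port mapping from this README (use port 8000 for the local `uvicorn` run), and the request and response bodies are assumed to be JSON; the `/schema` endpoint is the authoritative source for their exact shape.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server is reachable on the Docker port used in this README.
BASE_URL = "http://127.0.0.1:7860"
TASKS = ("easy", "medium", "hard")


def reset_url(base_url, task):
    """Build the /reset URL for one of the three documented tasks."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return base_url.rstrip("/") + "/reset?" + urllib.parse.urlencode({"task": task})


def post_json(url, payload=None):
    """POST a JSON body and decode the JSON response (assumed response format)."""
    data = json.dumps(payload or {}).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, an episode would start with `post_json(reset_url(BASE_URL, "easy"))`, followed by `post_json(BASE_URL + "/step", action)` for each action.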

## Author

**DevanshuDon** - Built for OpenEnv Hackathon 2026