parthpethia committed on
Commit 68ccb77 · verified · 1 Parent(s): 285e6b6

Create TRAINING_RUN.md

Files changed (1)
  1. TRAINING_RUN.md +65 -0
TRAINING_RUN.md ADDED
@@ -0,0 +1,65 @@
+ # Training Run Documentation
+
+ ## Project Overview
+ **Email Triage OpenEnv** is a production-ready OpenEnv environment developed for the Meta x OpenEnv Hackathon. It simulates real-world email triage workflows in which AI agents classify, prioritize, and route emails across operational categories such as spam, billing, support, and urgent communications.
+
+ This project addresses a genuine business bottleneck: inbox triage for support teams, moderators, and enterprise workflows, which it aims to automate.
+
+ ---
+
+ ## Framework Used
+ - OpenEnv (latest release)
+ - Python 3.11
+ - Hugging Face Space (Docker deployment)
+ - Flask REST API
+ - Pydantic typed models
+ - GPT-4o mini baseline inference pipeline
+ - Custom task graders and synthetic data generation
+
+ ---
+
+ ## Objective
+ The environment was designed to train and evaluate agents on progressively harder email triage tasks:
+
+ ### Task 1: Spam Detection (Easy)
+ - Binary spam vs normal classification
+ - 10 synthetic emails
+ - Expected score: 0.80–0.85
+
+ ### Task 2: Multi-Class Routing (Medium)
+ - Classify emails into spam / normal / urgent / billing
+ - Route to support / sales / billing / none
+ - 12 synthetic emails
+ - Expected score: 0.70–0.75
+
+ ### Task 3: Context-Aware Triage (Hard)
+ - VIP customers
+ - SLA urgency
+ - Escalation handling
+ - 20 synthetic emails
+ - Expected score: 0.60–0.70
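The three task tiers above could be captured as a small configuration table. This is only a minimal sketch: `TaskSpec`, `TASKS`, and `within_expected` are hypothetical names chosen for illustration and are not taken from the repository; only the task names, email counts, and expected score ranges come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical container for one task tier (not the repository's actual type)."""
    name: str
    difficulty: str
    num_emails: int
    expected_score: tuple[float, float]  # (low, high) range quoted in the task list


# Values copied from the task descriptions; the structure itself is an assumption.
TASKS = [
    TaskSpec("Spam Detection", "easy", 10, (0.80, 0.85)),
    TaskSpec("Multi-Class Routing", "medium", 12, (0.70, 0.75)),
    TaskSpec("Context-Aware Triage", "hard", 20, (0.60, 0.70)),
]


def within_expected(task: TaskSpec, score: float) -> bool:
    """Return True when a measured score falls inside the task's expected range."""
    low, high = task.expected_score
    return low <= score <= high
```

A check such as `within_expected(TASKS[0], 0.82)` would then flag whether a run on the easy task lands in its documented range.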
+
+ ---
+
+ ## Development & Training Process
+ Although this project was implemented as a full production environment rather than a Colab notebook, the complete training, evaluation, and baseline workflow is included in the repository.
+
+ ### Process:
+ 1. Designed synthetic email datasets with realistic metadata
+ 2. Built an OpenEnv-compliant environment with typed observation/action/reward spaces
+ 3. Implemented graded task progression (easy → medium → hard)
+ 4. Developed reward functions with partial progress scoring:
+    - Classification: 40%
+    - Routing: 30%
+    - Priority: 30%
+ 5. Created a GPT-4o mini inference baseline
+ 6. Validated all components with comprehensive automated testing
+ 7. Deployed to Hugging Face Space
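The partial progress scoring in step 4 can be illustrated per email: each correct component earns its weight, so an agent that classifies and routes correctly but misjudges priority still receives most of the reward. The sketch below is an assumption about how such a grader might look, not the repository's implementation. `TriageDecision`, `partial_reward`, the field names, and the example priority levels are hypothetical; only the 40/30/30 weights come from the list above.

```python
from dataclasses import dataclass

# Weights from step 4 of the process list.
WEIGHTS = {"classification": 0.4, "routing": 0.3, "priority": 0.3}


@dataclass(frozen=True)
class TriageDecision:
    """Hypothetical action/label record; field names are assumptions."""
    classification: str  # e.g. "spam", "normal", "urgent", "billing"
    routing: str         # e.g. "support", "sales", "billing", "none"
    priority: str        # e.g. "low", "high" (priority levels are assumed)


def partial_reward(predicted: TriageDecision, expected: TriageDecision) -> float:
    """Sum the weights of the components the prediction got right."""
    return sum(
        weight
        for component, weight in WEIGHTS.items()
        if getattr(predicted, component) == getattr(expected, component)
    )
```

For example, with a label of `TriageDecision("billing", "billing", "high")`, a prediction that matches on classification and routing but says `"low"` priority scores roughly 0.7 instead of 0, which is the partial credit the weighting is meant to provide.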
+
+ ---
+
+ ## Reward Function
+ ```text
+ reward = (0.4 * classification_accuracy)
+        + (0.3 * routing_accuracy)
+        + (0.3 * priority_accuracy)