Spaces:

parthpethia
/

Meta-Hackathon

Sleeping

App Files Files Community

parthpethia commited on 27 days ago

Commit

68ccb77

verified ·

1 Parent(s): 285e6b6

Create TRAINING_RUN.md

Browse files

Files changed (1) hide show

TRAINING_RUN.md +65 -0

TRAINING_RUN.md ADDED Viewed

	@@ -0,0 +1,65 @@

+# Training Run Documentation
+## Project Overview
+**Email Triage OpenEnv** is a production-ready OpenEnv environment developed for the Meta x OpenEnv Hackathon. It simulates real-world email triage workflows where AI agents classify, prioritize, and route emails across operational categories such as spam, billing, support, and urgent communications.
+This project addresses a genuine business bottleneck: automated inbox triage for support teams, moderators, and enterprise workflows.
+---
+## Framework Used
+- OpenEnv (latest release)
+- Python 3.11
+- Hugging Face Space (Docker deployment)
+- Flask REST API
+- Pydantic typed models
+- GPT-4o mini baseline inference pipeline
+- Custom task graders and synthetic data generation
+---
+## Objective
+The environment was designed to train and evaluate agents on progressively harder email triage tasks:
+### Task 1: Spam Detection (Easy)
+- Binary spam vs normal classification
+- 10 synthetic emails
+- Expected score: 0.80–0.85
+### Task 2: Multi-Class Routing (Medium)
+- Classify emails into spam / normal / urgent / billing
+- Route to support / sales / billing / none
+- 12 synthetic emails
+- Expected score: 0.70–0.75
+### Task 3: Context-Aware Triage (Hard)
+- VIP customers
+- SLA urgency
+- Escalation handling
+- 20 synthetic emails
+- Expected score: 0.60–0.70
+---
+## Development & Training Process
+Although this project was implemented as a full production environment rather than a Colab notebook, the complete training, evaluation, and baseline workflow is included in the repository.
+### Process:
+1. Designed synthetic email datasets with realistic metadata
+2. Built OpenEnv-compliant environment with typed observation/action/reward spaces
+3. Implemented graded task progression (easy → medium → hard)
+4. Developed reward functions with partial progress scoring:
+   - Classification: 40%
+   - Routing: 30%
+   - Priority: 30%
+5. Created GPT-4o mini inference baseline
+6. Validated all components with comprehensive automated testing
+7. Deployed to Hugging Face Space
+---
+## Reward Function
+```text
+reward = (0.4 * classification_accuracy)
+       + (0.3 * routing_accuracy)
+       + (0.3 * priority_accuracy)