---
title: Email Triage OpenEnv
emoji: 📧
colorFrom: blue
colorTo: green
sdk: docker
port: 7860
---
# Email Triage OpenEnv
A complete, production-ready OpenEnv environment for training AI agents to classify and route emails in real-world triage scenarios.
## Overview
Email triage is a genuine operational bottleneck for support teams, content moderators, and business users. This environment challenges agents to:
1. **Classify emails** into categories (spam, normal, urgent, billing)
2. **Route to teams** based on content and context (support, sales, billing)
3. **Prioritize** based on urgency and SLA requirements
4. **Handle complexity** across difficulty levels (easy → hard)
The environment provides realistic synthetic email data with varying complexity and meaningful reward signals for partial progress.
## Features
- ✅ **Full OpenEnv Spec Compliance**: Typed Pydantic models, standard step/reset/state API
- ✅ **3 Graded Tasks**: Easy (spam detection) → Medium (multi-class routing) → Hard (context-aware triage)
- ✅ **Meaningful Reward Function**: Partial credit for classification, routing, and priority decisions
- ✅ **Flask REST API**: HTTP endpoints for interacting with the environment
- ✅ **Baseline Inference**: GPT-4o mini baseline with structured logging
- ✅ **Docker Ready**: Single command deployment to Hugging Face Spaces
- ✅ **Synthetic Data**: Realistic email generation with metadata and ground truth labels
## Quick Start
### API Endpoints
The Space provides these endpoints on port 7860:
```bash
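# Assumes the API is running locally; replace the host with your Space URL otherwise
# Health check
curl http://localhost:7860/health
# Get available tasks
curl http://localhost:7860/tasks
# Reset environment for a task
curl -X POST "http://localhost:7860/reset?task=spam_detection"
# Step the environment with an action
curl -X POST "http://localhost:7860/step?task=spam_detection" \
  -H "Content-Type: application/json" \
  -d '{
    "classification": "spam",
    "team": "none",
    "priority": 0
  }'
# Get current state
curl "http://localhost:7860/state?task=spam_detection"
# Describe action/observation spaces
curl "http://localhost:7860/state-describe?task=spam_detection"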
```
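For scripted access, the same calls can be made from Python. The sketch below only builds the `POST /step` request with the standard library, so it makes no assumption about the response format. `BASE_URL` and the helper name `build_step_request` are illustrative and not part of the project.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the API is reachable locally on the port stated above.
BASE_URL = "http://localhost:7860"


def build_step_request(
    task: str, action: dict, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST /step request for the given task."""
    query = urllib.parse.urlencode({"task": task})
    body = json.dumps(action).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/step?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_step_request(
    "spam_detection",
    {"classification": "spam", "team": "none", "priority": 0},
)

# To send it against a running server:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the example runnable without a live server and makes the URL and payload easy to inspect.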
## Tasks
### Task 1: Spam Detection (Easy)
- **Goal**: Correctly classify 10 emails as spam or legitimate
- **Expected Score**: ~0.80-0.85
- **Difficulty**: Easy - clear spam patterns
### Task 2: Multi-Class Routing (Medium)
- **Goal**: Classify 12 emails into 4 categories and route to correct teams
- **Expected Score**: ~0.70-0.75
- **Difficulty**: Medium - requires multi-class classification and routing
### Task 3: Context-Aware Triage (Hard)
- **Goal**: Handle 20 emails with VIP customers, SLAs, and escalations
- **Expected Score**: ~0.60-0.70
- **Difficulty**: Hard - complex context with weighted scoring
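The expected scores above are per-task aggregates. As a rough illustration of how per-email rewards could roll up into such a score, the sketch below averages step rewards over an episode. Averaging is an assumption (the hard task uses weighted scoring, per its description), and `episode_score` is a hypothetical helper, not part of the environment.

```python
from statistics import mean


def episode_score(step_rewards: list[float]) -> float:
    """Average per-email rewards, each expected to lie in [0.0, 1.0]."""
    if not step_rewards:
        raise ValueError("an episode needs at least one step")
    if any(not 0.0 <= r <= 1.0 for r in step_rewards):
        raise ValueError("rewards must lie in [0.0, 1.0]")
    return mean(step_rewards)


# Hypothetical rewards for the 10 emails of the spam-detection task.
rewards = [1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.5, 1.0, 1.0, 1.0]
print(episode_score(rewards))
```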
## Environment Structure
```
├── environment/
│   ├── env.py              # Main EmailTriageEnv class
│   ├── types.py            # Pydantic models (Observation, Action, Reward)
│   ├── data_generator.py   # Synthetic email dataset
│   ├── graders.py          # Task-specific graders
│   └── __init__.py
├── app.py                  # Flask REST API
├── inference.py            # Baseline inference script
├── openenv.yaml            # OpenEnv specification
├── Dockerfile              # Docker configuration
├── requirements.txt        # Python dependencies
└── README.md               # This file
```
## Running Locally
```bash
# Install dependencies
pip install -r requirements.txt
# Start Flask app
python app.py
# In another terminal, run inference baseline
OPENAI_API_KEY=your_key python inference.py
```
## Deployment
This Space is already deployed on Hugging Face! The Docker image builds automatically from the Dockerfile and serves the Flask API on port 7860.
## OpenEnv Specification
This environment fully implements the OpenEnv specification:
- **Observation Space**: Email content, sender info, inbox state
- **Action Space**: Classification (4 categories), Team routing (4 options), Priority (0-3)
- **Reward Space**: Continuous [0.0, 1.0] with breakdown of classification/routing/priority scores
- **API**: `reset()`, `step(action)`, `state()` endpoints
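To make the action and reward spaces concrete, here is a minimal sketch that uses plain dataclasses as a stand-in for the project's Pydantic models. The category and team names are taken from this README (with `none` inferred from the example request in the API section). The class name `TriageAction`, the equal weighting, and the distance-based priority credit are assumptions for illustration, not the environment's actual grader.

```python
from dataclasses import dataclass

CATEGORIES = {"spam", "normal", "urgent", "billing"}
TEAMS = {"support", "sales", "billing", "none"}  # "none" inferred from the example request
MAX_PRIORITY = 3


@dataclass(frozen=True)
class TriageAction:
    """Hypothetical stand-in for the environment's Action model."""

    classification: str
    team: str
    priority: int

    def __post_init__(self) -> None:
        if self.classification not in CATEGORIES:
            raise ValueError(f"unknown classification: {self.classification!r}")
        if self.team not in TEAMS:
            raise ValueError(f"unknown team: {self.team!r}")
        if not 0 <= self.priority <= MAX_PRIORITY:
            raise ValueError(f"priority must be in 0..{MAX_PRIORITY}")


def score(action: TriageAction, truth: TriageAction) -> dict:
    """Return a per-component breakdown and a total in [0.0, 1.0]."""
    classification = 1.0 if action.classification == truth.classification else 0.0
    routing = 1.0 if action.team == truth.team else 0.0
    # Partial credit: lose one third of the priority score per level of error.
    priority = 1.0 - abs(action.priority - truth.priority) / MAX_PRIORITY
    total = (classification + routing + priority) / 3  # assumed equal weights
    return {
        "classification": classification,
        "routing": routing,
        "priority": priority,
        "total": total,
    }


truth = TriageAction("billing", "billing", 2)
guess = TriageAction("billing", "support", 1)
print(score(guess, truth))
```

Here the guess gets full credit for classification, none for routing, and two thirds for priority, which shows how a partially correct action can still earn reward.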
## Documentation
For more details, see:
- `START_HERE.md` - Getting started guide
- `DEPLOYMENT_CHECKLIST.md` - Pre-submission checklist
- `VALIDATION_GUIDE.md` - Testing and validation
- `FINAL_VALIDATION_REPORT.md` - Full validation results
---
**Status**: ✅ Production Ready
**OpenEnv Compliance**: ✅ 100%
**All Tests**: ✅ Passing
**Ready for Submission**: ✅ Yes