---
title: Email Triage OpenEnv
emoji: 📧
colorFrom: blue
colorTo: green
sdk: docker
port: 7860
---

# Email Triage OpenEnv

A complete, production-ready OpenEnv environment for training AI agents to classify and route emails in real-world triage scenarios.

## Overview

Email triage is a genuine operational bottleneck for support teams, content moderators, and business users. This environment challenges agents to:

1. **Classify emails** into categories (spam, normal, urgent, billing)
2. **Route to teams** based on content and context (support, sales, billing)
3. **Prioritize** based on urgency and SLA requirements
4. **Handle complexity** across difficulty levels (easy → hard)

The environment provides realistic synthetic email data with varying complexity and meaningful reward signals for partial progress.

## Features

- ✅ **Full OpenEnv Spec Compliance**: Typed Pydantic models, standard step/reset/state API
- ✅ **3 Graded Tasks**: Easy (spam detection) → Medium (multi-class routing) → Hard (context-aware triage)
- ✅ **Meaningful Reward Function**: Partial credit for classification, routing, and priority decisions
- ✅ **Flask REST API**: HTTP endpoints for interacting with the environment
- ✅ **Baseline Inference**: GPT-4o mini baseline with structured logging
- ✅ **Docker Ready**: Single command deployment to Hugging Face Spaces
- ✅ **Synthetic Data**: Realistic email generation with metadata and ground truth labels

## Quick Start

### API Endpoints

The Space provides these endpoints on port 7860:

```bash
# Health check
GET /health

# Get available tasks
GET /tasks

# Reset environment for a task
POST /reset?task=spam_detection

# Step the environment with an action
POST /step?task=spam_detection
Content-Type: application/json

{
  "classification": "spam",
  "team": "none",
  "priority": 0
}

# Get current state
GET /state?task=spam_detection

# Describe action/observation spaces
GET /state-describe?task=spam_detection
```

## Tasks

### Task 1: Spam Detection (Easy)

- **Goal**: Correctly classify 10 emails as spam or legitimate
- **Expected Score**: ~0.80-0.85
- **Difficulty**: Easy - clear spam patterns

### Task 2: Multi-Class Routing (Medium)

- **Goal**: Classify 12 emails into 4 categories and route to correct teams
- **Expected Score**: ~0.70-0.75
- **Difficulty**: Medium - requires multi-class classification and routing

### Task 3: Context-Aware Triage (Hard)

- **Goal**: Handle 20 emails with VIP customers, SLAs, and escalations
- **Expected Score**: ~0.60-0.70
- **Difficulty**: Hard - complex context with weighted scoring

## Environment Structure

```
├── environment/
│   ├── env.py              # Main EmailTriageEnv class
│   ├── types.py            # Pydantic models (Observation, Action, Reward)
│   ├── data_generator.py   # Synthetic email dataset
│   ├── graders.py          # Task-specific graders
│   └── __init__.py
├── app.py                  # Flask REST API
├── inference.py            # Baseline inference script
├── openenv.yaml            # OpenEnv specification
├── Dockerfile              # Docker configuration
├── requirements.txt        # Python dependencies
└── README.md               # This file
```

## Running Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Start Flask app
python app.py

# In another terminal, run inference baseline
OPENAI_API_KEY=your_key python inference.py
```

## Deployment

This Space is already deployed on Hugging Face! The Docker image builds automatically from the Dockerfile and serves the Flask API on port 7860.
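The endpoints listed under Quick Start can also be exercised from Python. The following is a minimal sketch using only the standard library. It assumes the API is reachable at `http://localhost:7860` (for example after `python app.py`) and that every endpoint returns a JSON body; neither detail is confirmed by this README, and the helper names `build_request`, `call`, and `demo` are illustrative only.

```python
import json
from urllib import parse, request

# Assumption: the Flask app is running locally on the port the Space uses.
BASE_URL = "http://localhost:7860"


def build_request(path, task=None, payload=None, method="GET"):
    """Build a urllib Request for one of the documented endpoints."""
    url = BASE_URL + path
    if task is not None:
        # The task is passed as a query parameter, e.g. /reset?task=spam_detection
        url += "?" + parse.urlencode({"task": task})
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return request.Request(url, data=data, headers=headers, method=method)


def call(req):
    """Send the request and decode the response, assuming it is JSON."""
    with request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def demo(task="spam_detection"):
    """Reset the task, then submit one action (requires a running server)."""
    print(call(build_request("/health")))
    print(call(build_request("/reset", task=task, method="POST")))
    action = {"classification": "spam", "team": "none", "priority": 0}
    print(call(build_request("/step", task=task, payload=action, method="POST")))
```

With the server running, calling `demo()` performs one reset followed by one step. The shape of the returned JSON depends on `app.py`, so the sketch prints the responses rather than reading specific fields.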

## OpenEnv Specification

This environment fully implements the OpenEnv specification:

- **Observation Space**: Email content, sender info, inbox state
- **Action Space**: Classification (4 categories), Team routing (4 options), Priority (0-3)
- **Reward Space**: Continuous [0.0, 1.0] with breakdown of classification/routing/priority scores
- **API**: `reset()`, `step(action)`, `state()` endpoints

## Documentation

For more details, see:

- `START_HERE.md` - Getting started guide
- `DEPLOYMENT_CHECKLIST.md` - Pre-submission checklist
- `VALIDATION_GUIDE.md` - Testing and validation
- `FINAL_VALIDATION_REPORT.md` - Full validation results

---

**Status**: ✅ Production Ready
**OpenEnv Compliance**: ✅ 100%
**All Tests**: ✅ Passing
**Ready for Submission**: ✅ Yes
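
As a reference for client authors, the action space described under OpenEnv Specification can be checked before a request is sent. The sketch below is a standard-library stand-in, not the Pydantic `Action` model from `environment/types.py`. The class name `TriageAction` is hypothetical, and treating `"none"` as the fourth team option is an assumption inferred from the `/step` example payload.

```python
from dataclasses import asdict, dataclass

# Categories named in the Overview section.
CLASSIFICATIONS = ("spam", "normal", "urgent", "billing")
# Assumption: "none" is the fourth routing option, as used in the /step example.
TEAMS = ("support", "sales", "billing", "none")


@dataclass(frozen=True)
class TriageAction:
    """Hypothetical client-side mirror of the environment's action space."""

    classification: str
    team: str
    priority: int

    def __post_init__(self):
        # Reject values outside the documented action space.
        if self.classification not in CLASSIFICATIONS:
            raise ValueError(f"unknown classification: {self.classification!r}")
        if self.team not in TEAMS:
            raise ValueError(f"unknown team: {self.team!r}")
        if not isinstance(self.priority, int) or not 0 <= self.priority <= 3:
            raise ValueError(f"priority must be an integer in 0-3: {self.priority!r}")

    def to_payload(self):
        """Return the dict to send as the JSON body of POST /step."""
        return asdict(self)
```

Constructing `TriageAction("spam", "none", 0)` and calling `to_payload()` yields the same body as the `/step` example, while an out-of-range priority raises `ValueError` before any request is made.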