---
title: Email Triage OpenEnv
emoji: 📧
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---

# Email Triage OpenEnv

A production-ready OpenEnv environment for training AI agents to classify, route, and prioritize emails in realistic triage scenarios.

## Overview

Email triage is a genuine operational bottleneck for support teams, content moderators, and business users. This environment challenges agents to:

1. **Classify emails** into categories (spam, normal, urgent, billing)
2. **Route to teams** based on content and context (support, sales, billing, or none)
3. **Prioritize** on a 0-3 scale based on urgency and SLA requirements
4. **Handle complexity** across difficulty levels (easy → hard)

The environment provides realistic synthetic email data with varying complexity and meaningful reward signals for partial progress.

## Features

- ✅ **Full OpenEnv Spec Compliance**: Typed Pydantic models, standard step/reset/state API
- ✅ **3 Graded Tasks**: Easy (spam detection) → Medium (multi-class routing) → Hard (context-aware triage)
- ✅ **Meaningful Reward Function**: Partial credit for classification, routing, and priority decisions
- ✅ **Flask REST API**: HTTP endpoints for interacting with the environment
- ✅ **Baseline Inference**: GPT-4o mini baseline with structured logging
- ✅ **Docker Ready**: Single command deployment to Hugging Face Spaces
- ✅ **Synthetic Data**: Realistic email generation with metadata and ground truth labels

## Quick Start

### API Endpoints

The Space provides these endpoints on port 7860:

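```bash
# Assumption: the API is running locally; replace with your Space URL if needed
BASE_URL="http://localhost:7860"

# Health check
curl "$BASE_URL/health"

# Get available tasks
curl "$BASE_URL/tasks"

# Reset environment for a task
curl -X POST "$BASE_URL/reset?task=spam_detection"

# Step the environment with an action
curl -X POST "$BASE_URL/step?task=spam_detection" \
  -H "Content-Type: application/json" \
  -d '{"classification": "spam", "team": "none", "priority": 0}'

# Get current state
curl "$BASE_URL/state?task=spam_detection"

# Describe action/observation spaces
curl "$BASE_URL/state-describe?task=spam_detection"
```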
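
The same calls can be scripted from Python. The sketch below uses only the standard library and assumes the API is reachable at `http://localhost:7860` and returns JSON. `BASE_URL`, `build_request`, and `call` are illustrative names, not part of this project.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:7860"  # assumption: local run; use your Space URL instead


def build_request(path, task=None, payload=None, method="GET"):
    """Build (but do not send) a request for one of the endpoints above."""
    url = BASE_URL + path
    if task is not None:
        url += "?" + urllib.parse.urlencode({"task": task})
    data = None
    headers = {}
    if payload is not None:
        # Actions are sent as a JSON body, as in the /step example
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request):
    """Send the request and decode the body, assuming the server returns JSON."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `call(build_request("/reset", task="spam_detection", method="POST"))` would reset the task, and passing an action dict as `payload` to `/step` would submit one decision.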

## Tasks

### Task 1: Spam Detection (Easy)
- **Goal**: Correctly classify 10 emails as spam or legitimate
- **Expected Score**: ~0.80-0.85
- **Difficulty**: Easy - clear spam patterns

### Task 2: Multi-Class Routing (Medium)
- **Goal**: Classify 12 emails into 4 categories and route to correct teams
- **Expected Score**: ~0.70-0.75
- **Difficulty**: Medium - requires multi-class classification and routing

### Task 3: Context-Aware Triage (Hard)
- **Goal**: Handle 20 emails with VIP customers, SLAs, and escalations
- **Expected Score**: ~0.60-0.70
- **Difficulty**: Hard - complex context with weighted scoring
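
To illustrate how partial credit and weighted scoring could combine, here is a minimal sketch. The weights and the `partial_credit` helper are assumptions made for illustration; they are not taken from this project's graders.

```python
# Hypothetical weights, chosen only for illustration; they sum to 1.0
WEIGHTS = {"classification": 0.5, "team": 0.3, "priority": 0.2}


def partial_credit(action, truth, weights=WEIGHTS):
    """Return a reward in [0.0, 1.0] plus a per-component breakdown."""
    breakdown = {
        field: weight if action.get(field) == truth.get(field) else 0.0
        for field, weight in weights.items()
    }
    # Round to hide floating-point noise in the summed weights
    return round(sum(breakdown.values()), 6), breakdown


# Correct classification and team, wrong priority: partial credit only
action = {"classification": "urgent", "team": "support", "priority": 2}
truth = {"classification": "urgent", "team": "support", "priority": 3}
reward, breakdown = partial_credit(action, truth)
print(reward, breakdown)
```

Under these assumed weights, an agent that classifies and routes correctly but misjudges priority still earns most of the reward, which gives a learning signal on every step rather than only on fully correct answers.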

## Environment Structure

```
├── environment/
│   ├── env.py           # Main EmailTriageEnv class
│   ├── types.py         # Pydantic models (Observation, Action, Reward)
│   ├── data_generator.py # Synthetic email dataset
│   ├── graders.py       # Task-specific graders
│   └── __init__.py
├── app.py               # Flask REST API
├── inference.py         # Baseline inference script
├── openenv.yaml         # OpenEnv specification
├── Dockerfile           # Docker configuration
├── requirements.txt     # Python dependencies
└── README.md            # This file
```
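
As a rough idea of what a synthetic dataset with ground truth labels can look like, the sketch below draws emails from a few templates using a seeded random generator. The templates, field names, and `generate_emails` function are hypothetical and do not reflect the contents of `data_generator.py`.

```python
import random

# Hypothetical templates: (subject, classification, team, priority)
TEMPLATES = [
    ("You won a free cruise, claim now", "spam", "none", 0),
    ("Question about my invoice", "billing", "billing", 1),
    ("Production outage, need help immediately", "urgent", "support", 3),
    ("Interested in a product demo", "normal", "sales", 1),
]


def generate_emails(count, seed=0):
    """Return `count` synthetic emails, each paired with its ground-truth label."""
    rng = random.Random(seed)  # seeded so the same inbox can be reproduced
    emails = []
    for index in range(count):
        subject, classification, team, priority = rng.choice(TEMPLATES)
        emails.append({
            "id": f"email-{index}",
            "subject": subject,
            "truth": {
                "classification": classification,
                "team": team,
                "priority": priority,
            },
        })
    return emails
```

Seeding the generator keeps episodes reproducible, so two runs of the same task can be compared on identical inboxes.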

## Running Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Start Flask app
python app.py

# In another terminal, run inference baseline
OPENAI_API_KEY=your_key python inference.py
```

## Deployment

This Space is deployed on Hugging Face. The Docker image builds automatically from the `Dockerfile` and serves the Flask API on port 7860.

## OpenEnv Specification

This environment fully implements the OpenEnv specification:

- **Observation Space**: Email content, sender info, inbox state
- **Action Space**: Classification (4 categories), Team routing (4 options), Priority (0-3)
- **Reward Space**: Continuous [0.0, 1.0] with breakdown of classification/routing/priority scores
- **API**: `reset()`, `step(action)`, and `state()` methods, exposed over HTTP as the `/reset`, `/step`, and `/state` endpoints
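
The action space above can be expressed as a typed model. The project uses Pydantic for this; the sketch below uses standard-library dataclasses as a stand-in so it runs without dependencies. The class names are assumptions, and the four team options are inferred from the teams listed in the overview plus the `"none"` value in the `/step` example.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(str, Enum):
    SPAM = "spam"
    NORMAL = "normal"
    URGENT = "urgent"
    BILLING = "billing"


class Team(str, Enum):
    SUPPORT = "support"
    SALES = "sales"
    BILLING = "billing"
    NONE = "none"


@dataclass(frozen=True)
class Action:
    classification: Classification
    team: Team
    priority: int

    def __post_init__(self):
        # Coerce raw strings (e.g. from JSON) into enum members;
        # an unknown value raises ValueError.
        object.__setattr__(self, "classification", Classification(self.classification))
        object.__setattr__(self, "team", Team(self.team))
        if not 0 <= self.priority <= 3:
            raise ValueError("priority must be between 0 and 3")


# Build an action from the JSON body used in the /step example
action = Action(**{"classification": "spam", "team": "none", "priority": 0})
print(action)
```

Validating at construction time means a malformed action is rejected before it reaches the grader, instead of silently scoring zero.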

## Documentation

For more details, see:
- `START_HERE.md` - Getting started guide
- `DEPLOYMENT_CHECKLIST.md` - Pre-submission checklist
- `VALIDATION_GUIDE.md` - Testing and validation
- `FINAL_VALIDATION_REPORT.md` - Full validation results

---

**Status**: ✅ Production Ready  
**OpenEnv Compliance**: ✅ 100%  
**All Tests**: ✅ Passing  
**Ready for Submission**: ✅ Yes