---
title: Email Triage OpenEnv
emoji: 📧
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
---
# Email Triage OpenEnv
A complete, production-ready OpenEnv environment for training AI agents to classify and route emails in real-world triage scenarios.
## Overview
Email triage is a genuine operational bottleneck for support teams, content moderators, and business users. This environment challenges agents to:
1. **Classify emails** into categories (spam, normal, urgent, billing)
2. **Route to teams** based on content and context (support, sales, billing)
3. **Prioritize** based on urgency and SLA requirements
4. **Handle complexity** across difficulty levels (easy → hard)
The environment provides realistic synthetic email data with varying complexity and meaningful reward signals for partial progress.
## Features
- ✅ **Full OpenEnv Spec Compliance**: Typed Pydantic models, standard step/reset/state API
- ✅ **3 Graded Tasks**: Easy (spam detection) → Medium (multi-class routing) → Hard (context-aware triage)
- ✅ **Meaningful Reward Function**: Partial credit for classification, routing, and priority decisions
- ✅ **Flask REST API**: HTTP endpoints for interacting with the environment
- ✅ **Baseline Inference**: GPT-4o mini baseline with structured logging
- ✅ **Docker Ready**: Single command deployment to Hugging Face Spaces
- ✅ **Synthetic Data**: Realistic email generation with metadata and ground truth labels
## Quick Start
### API Endpoints
The Space provides these endpoints on port 7860:
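```bash
# Assumption: the app is reachable at this base URL (for example, when run locally)
BASE_URL=http://localhost:7860
# Health check
curl "$BASE_URL/health"
# Get available tasks
curl "$BASE_URL/tasks"
# Reset environment for a task
curl -X POST "$BASE_URL/reset?task=spam_detection"
# Step the environment with an action
curl -X POST "$BASE_URL/step?task=spam_detection" \
  -H "Content-Type: application/json" \
  -d '{"classification": "spam", "team": "none", "priority": 0}'
# Get current state
curl "$BASE_URL/state?task=spam_detection"
# Describe action/observation spaces
curl "$BASE_URL/state-describe?task=spam_detection"
```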
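The same endpoints can be called from Python. The following is a minimal sketch using only the standard library, not the project's own client. It assumes the app is reachable at `http://localhost:7860` and that every endpoint returns a JSON body; `build_url`, `call`, and `demo_episode_step` are hypothetical helper names introduced here for illustration.

```python
import json
from typing import Optional
from urllib import parse, request

# Assumption: the Flask app is running locally on the documented port.
BASE_URL = "http://localhost:7860"


def build_url(path: str, task: Optional[str] = None, base_url: str = BASE_URL) -> str:
    """Return the endpoint URL, appending ?task=... when a task is given."""
    url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    if task is not None:
        url += "?" + parse.urlencode({"task": task})
    return url


def call(method: str, path: str, task: Optional[str] = None,
         payload: Optional[dict] = None) -> dict:
    """Send one request and decode the response (assumed to be JSON)."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = request.Request(build_url(path, task), data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def demo_episode_step(task: str = "spam_detection") -> dict:
    """Reset the task, then submit the example action from this README."""
    call("POST", "/reset", task)
    action = {"classification": "spam", "team": "none", "priority": 0}
    return call("POST", "/step", task, action)
```

Calling `demo_episode_step()` with the server running performs one reset followed by one step. The shape of the returned dictionary depends on the server and is not assumed here.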
## Tasks
### Task 1: Spam Detection (Easy)
- **Goal**: Correctly classify 10 emails as spam or legitimate
- **Expected Score**: ~0.80-0.85
- **Difficulty**: Easy - clear spam patterns
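To illustrate the action format for this task, here is a small rule-based agent. It is a sketch for orientation only, not the baseline shipped in `inference.py`: the keyword list, the `subject` and `body` argument names, and the default routing for legitimate mail are all assumptions.

```python
# Hypothetical spam cues; the real dataset's patterns may differ.
SPAM_HINTS = ("free money", "click here", "winner", "lottery")


def spam_baseline(subject: str, body: str) -> dict:
    """Return an action dict based on simple keyword matching."""
    text = f"{subject} {body}".lower()
    if any(hint in text for hint in SPAM_HINTS):
        # Mirrors the example action shown under API Endpoints.
        return {"classification": "spam", "team": "none", "priority": 0}
    # Assumed default for legitimate mail: normal category, support team.
    return {"classification": "normal", "team": "support", "priority": 1}
```

The returned dictionary has the three fields that `POST /step` accepts, so it can be sent as the request body unchanged.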
### Task 2: Multi-Class Routing (Medium)
- **Goal**: Classify 12 emails into 4 categories and route to correct teams
- **Expected Score**: ~0.70-0.75
- **Difficulty**: Medium - requires multi-class classification and routing
### Task 3: Context-Aware Triage (Hard)
- **Goal**: Handle 20 emails with VIP customers, SLAs, and escalations
- **Expected Score**: ~0.60-0.70
- **Difficulty**: Hard - complex context with weighted scoring
## Environment Structure
```
├── environment/
│   ├── env.py              # Main EmailTriageEnv class
│   ├── types.py            # Pydantic models (Observation, Action, Reward)
│   ├── data_generator.py   # Synthetic email dataset
│   ├── graders.py          # Task-specific graders
│   └── __init__.py
├── app.py                  # Flask REST API
├── inference.py            # Baseline inference script
├── openenv.yaml            # OpenEnv specification
├── Dockerfile              # Docker configuration
├── requirements.txt        # Python dependencies
└── README.md               # This file
```
## Running Locally
```bash
# Install dependencies
pip install -r requirements.txt
# Start Flask app
python app.py
# In another terminal, run inference baseline
OPENAI_API_KEY=your_key python inference.py
```
## Deployment
This Space is already deployed on Hugging Face! The Docker image builds automatically from the Dockerfile and serves the Flask API on port 7860.
## OpenEnv Specification
This environment fully implements the OpenEnv specification:
- **Observation Space**: Email content, sender info, inbox state
- **Action Space**: Classification (4 categories), Team routing (4 options), Priority (0-3)
- **Reward Space**: Continuous [0.0, 1.0] with breakdown of classification/routing/priority scores
- **API**: `reset()`, `step(action)`, `state()` endpoints
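The action and reward spaces above can be sketched as follows. This uses standard-library dataclasses as a stand-in for the project's Pydantic models in `environment/types.py`, which are not reproduced here. The team set (including `"none"`, taken from the example action) and the weights are assumptions chosen for illustration; the actual graders may weight the components differently.

```python
from dataclasses import dataclass

CATEGORIES = {"spam", "normal", "urgent", "billing"}
TEAMS = {"support", "sales", "billing", "none"}  # assumed set of 4 options
PRIORITIES = range(0, 4)  # priority 0-3

# Hypothetical weights; they sum to 1.0 so the total stays in [0.0, 1.0].
WEIGHTS = {"classification": 0.5, "routing": 0.3, "priority": 0.2}


@dataclass(frozen=True)
class Action:
    classification: str
    team: str
    priority: int

    def __post_init__(self) -> None:
        # Reject values outside the documented action space.
        if self.classification not in CATEGORIES:
            raise ValueError(f"unknown classification: {self.classification!r}")
        if self.team not in TEAMS:
            raise ValueError(f"unknown team: {self.team!r}")
        if self.priority not in PRIORITIES:
            raise ValueError(f"priority out of range: {self.priority!r}")


def score(action: Action, truth: Action) -> dict:
    """Return a total reward plus its per-component breakdown."""
    breakdown = {
        "classification": float(action.classification == truth.classification),
        "routing": float(action.team == truth.team),
        "priority": float(action.priority == truth.priority),
    }
    total = sum(WEIGHTS[name] * value for name, value in breakdown.items())
    return {"total": total, **breakdown}
```

Under these assumed weights, an action with the right classification and priority but the wrong team earns partial credit rather than zero, which is the kind of partial-progress signal the reward space describes.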
## Documentation
For more details, see:
- `START_HERE.md` - Getting started guide
- `DEPLOYMENT_CHECKLIST.md` - Pre-submission checklist
- `VALIDATION_GUIDE.md` - Testing and validation
- `FINAL_VALIDATION_REPORT.md` - Full validation results
---
**Status**: ✅ Production Ready
**OpenEnv Compliance**: ✅ 100%
**All Tests**: ✅ Passing
**Ready for Submission**: ✅ Yes