metadata

title: Email Triage OpenEnv
emoji: 📧
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860

Email Triage OpenEnv

A complete, production-ready OpenEnv environment for training AI agents to classify and route emails in real-world triage scenarios.

Overview

Email triage is a genuine operational bottleneck for support teams, content moderators, and business users. This environment challenges agents to:

Classify emails into categories (spam, normal, urgent, billing)
Route to teams based on content and context (support, sales, billing)
Prioritize based on urgency and SLA requirements
Handle complexity across difficulty levels (easy → hard)

The environment provides realistic synthetic email data with varying complexity and meaningful reward signals for partial progress.

Features

✅ Full OpenEnv Spec Compliance: Typed Pydantic models, standard step/reset/state API
✅ 3 Graded Tasks: Easy (spam detection) → Medium (multi-class routing) → Hard (context-aware triage)
✅ Meaningful Reward Function: Partial credit for classification, routing, and priority decisions
✅ Flask REST API: HTTP endpoints for interacting with the environment
✅ Baseline Inference: GPT-4o mini baseline with structured logging
✅ Docker Ready: Single-command deployment to Hugging Face Spaces
✅ Synthetic Data: Realistic email generation with metadata and ground truth labels

Quick Start

API Endpoints

The Space provides these endpoints on port 7860:

# Health check
GET /health

# Get available tasks
GET /tasks

# Reset environment for a task
POST /reset?task=spam_detection

# Step the environment with an action
POST /step?task=spam_detection
Content-Type: application/json
{
  "classification": "spam",
  "team": "none",
  "priority": 0
}

# Get current state
GET /state?task=spam_detection

# Describe action/observation spaces
GET /state-describe?task=spam_detection
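
The endpoints above can be called from any HTTP client. The following is a minimal sketch using only the Python standard library. The base URL, the helper names `build_request` and `call`, and the assumption that every endpoint returns JSON are illustrative and are not taken from the repository.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Flask app is running locally on the port listed above.
BASE_URL = "http://localhost:7860"


def build_request(method, path, task=None, payload=None, base_url=BASE_URL):
    """Build (but do not send) a request for one of the documented endpoints."""
    url = base_url.rstrip("/") + path
    if task is not None:
        # The task is passed as a query parameter, e.g. ?task=spam_detection
        url += "?" + urllib.parse.urlencode({"task": task})
    data = None
    headers = {}
    if payload is not None:
        # /step expects the action as a JSON body
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request):
    """Send a request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the app running, `call(build_request("POST", "/reset", task="spam_detection"))` would start an episode, and `call(build_request("POST", "/step", task="spam_detection", payload={"classification": "spam", "team": "none", "priority": 0}))` would submit the action shown above. The shape of the responses is not documented here, so inspect the returned dictionary before relying on specific keys.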

Tasks

Task 1: Spam Detection (Easy)

Goal: Correctly classify 10 emails as spam or legitimate
Expected Score: ~0.80-0.85
Difficulty: Easy - clear spam patterns

Task 2: Multi-Class Routing (Medium)

Goal: Classify 12 emails into 4 categories and route to correct teams
Expected Score: ~0.70-0.75
Difficulty: Medium - requires multi-class classification and routing

Task 3: Context-Aware Triage (Hard)

Goal: Handle 20 emails with VIP customers, SLAs, and escalations
Expected Score: ~0.60-0.70
Difficulty: Hard - complex context with weighted scoring
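
To illustrate what weighted scoring with partial credit could look like, here is a minimal sketch. The weights, the exact-match rule per component, and the function name `score_action` are hypothetical. The actual graders live in environment/graders.py and may score differently.

```python
# Hypothetical weights for illustration only; they sum to 1.0 so the
# total stays within the documented [0.0, 1.0] reward range.
WEIGHTS = {"classification": 0.5, "team": 0.3, "priority": 0.2}


def score_action(action, truth, weights=WEIGHTS):
    """Score one action against ground truth, component by component."""
    # Each component earns 1.0 on an exact match and 0.0 otherwise.
    breakdown = {
        key: 1.0 if action.get(key) == truth[key] else 0.0
        for key in weights
    }
    # The total is the weighted sum, so a partly correct action
    # still earns part of the reward.
    total = sum(weights[key] * breakdown[key] for key in weights)
    return {"total": min(1.0, total), "breakdown": breakdown}
```

Under these assumed weights, an action with the right classification and team but the wrong priority would keep the classification and routing share of the reward and lose only the priority share.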

Environment Structure

├── environment/
│   ├── env.py           # Main EmailTriageEnv class
│   ├── types.py         # Pydantic models (Observation, Action, Reward)
│   ├── data_generator.py # Synthetic email dataset
│   ├── graders.py       # Task-specific graders
│   └── __init__.py
├── app.py               # Flask REST API
├── inference.py         # Baseline inference script
├── openenv.yaml         # OpenEnv specification
├── Dockerfile           # Docker configuration
├── requirements.txt     # Python dependencies
└── README.md           # This file

Running Locally

# Install dependencies
pip install -r requirements.txt

# Start Flask app
python app.py

# In another terminal, run inference baseline
OPENAI_API_KEY=your_key python inference.py

Deployment

This Space is already deployed on Hugging Face! The Docker image builds automatically from the Dockerfile and serves the Flask API on port 7860.

OpenEnv Specification

This environment fully implements the OpenEnv specification:

Observation Space: Email content, sender info, inbox state
Action Space: Classification (4 categories), Team routing (4 options), Priority (0-3)
Reward Space: Continuous [0.0, 1.0] with breakdown of classification/routing/priority scores
API: reset(), step(action), state() endpoints
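
The action space described above can be sketched as a small typed model. The repository defines its models with Pydantic in environment/types.py. This sketch uses standard-library dataclasses instead so that it runs without extra dependencies. The class names and the team value "none" (inferred from the /step example) are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(str, Enum):
    SPAM = "spam"
    NORMAL = "normal"
    URGENT = "urgent"
    BILLING = "billing"


class Team(str, Enum):
    NONE = "none"  # assumption: taken from the /step example payload
    SUPPORT = "support"
    SALES = "sales"
    BILLING = "billing"


@dataclass(frozen=True)
class Action:
    classification: Classification
    team: Team
    priority: int

    def __post_init__(self):
        # Coerce raw strings to enums; unknown values raise ValueError.
        object.__setattr__(self, "classification", Classification(self.classification))
        object.__setattr__(self, "team", Team(self.team))
        # Priority must be an integer in the documented 0-3 range.
        valid_int = isinstance(self.priority, int) and not isinstance(self.priority, bool)
        if not valid_int or not 0 <= self.priority <= 3:
            raise ValueError(f"priority must be an integer from 0 to 3, got {self.priority!r}")

    def to_payload(self):
        """Return the JSON-serializable body expected by POST /step."""
        return {
            "classification": self.classification.value,
            "team": self.team.value,
            "priority": self.priority,
        }
```

Validating actions on the client side like this catches out-of-range values before they reach the API, whatever the server does with invalid input.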

Documentation

For more details, see:

START_HERE.md - Getting started guide
DEPLOYMENT_CHECKLIST.md - Pre-submission checklist
VALIDATION_GUIDE.md - Testing and validation
FINAL_VALIDATION_REPORT.md - Full validation results

Status: ✅ Production Ready
OpenEnv Compliance: ✅ 100%
All Tests: ✅ Passing
Ready for Submission: ✅ Yes