OpenEnv: Email Triage & Scheduling Assistant (EmailEnv-v1) 📧🚀

EmailTriage-v1 simulates a real-world email triage task for evaluating the decision-making and logical reasoning of agentic workflows. It bridges the gap between toy grid-worlds and actual professional productivity tasks.

🌟 Motivation & Real-World Utility (30% Weight)

Manual email management is a labor-intensive professional task. This environment models the Email Triage Assistant role, a critical function in modern digital workflows. Agents are evaluated on their ability to:

  • Prioritize: Distinguish between high-stakes meeting requests and low-value noise.
  • Categorize: Maintain a structured workspace by sorting multi-topic communications.
  • Coordinate: Resolve scheduling conflicts using real-time calendar cross-referencing, a task that requires logical deduction and conflict resolution.

🏗️ Environment Design (20% Weight)

Observation Space (Pydantic Typed)

The agent receives a rich state snapshot including:

  • inbox_count: Real-time counter of unprocessed items.
  • current_email: A structured object containing the sender, subject, body, and priority.
  • calendar: A list of events representing the agent's current "busy" times.
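As a rough illustration of what such a typed observation could look like, here is a minimal Pydantic sketch. Only inbox_count, current_email, calendar, and the four email fields come from the list above. The class names, the field types, the CalendarEvent shape, and the choice to allow an empty current_email are assumptions, not the environment's actual schema.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class Email(BaseModel):
    # Field names follow the README; the types are assumptions.
    sender: str
    subject: str
    body: str
    priority: str


class CalendarEvent(BaseModel):
    # Hypothetical shape for one "busy" slot, using 24-hour clock hours.
    title: str
    start_hour: int
    end_hour: int


class Observation(BaseModel):
    inbox_count: int
    # Assumption: no current email is present once the inbox is empty.
    current_email: Optional[Email] = None
    calendar: List[CalendarEvent] = Field(default_factory=list)


obs = Observation(
    inbox_count=1,
    current_email=Email(
        sender="boss@example.com",
        subject="Project sync",
        body="Can we meet at 2 PM?",
        priority="high",
    ),
    calendar=[CalendarEvent(title="Standup", start_hour=10, end_hour=11)],
)
```

Typing the snapshot this way means a malformed observation (for example, a non-numeric inbox_count) fails at construction time rather than deep inside the agent loop.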

Action Space (Pydantic Typed)

The agent can interact with the environment via four high-level professional actions:

  • MOVE: Relocate emails to folders (Archive, Work, Social, Spam).
  • DELETE: Permanent removal of high-risk items (Spam).
  • REPLY / SCHEDULE: Two contextual actions that require generating appropriate reply text (e.g., confirming a 2 PM slot).
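The action space above could be modeled along these lines. The four action names and the four folder names come from the list. The Action class, its field names (action_type, email_id, folder, reply_text), and the use of string enums are hypothetical choices made for this sketch.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class ActionType(str, Enum):
    MOVE = "MOVE"
    DELETE = "DELETE"
    REPLY = "REPLY"
    SCHEDULE = "SCHEDULE"


class Folder(str, Enum):
    ARCHIVE = "Archive"
    WORK = "Work"
    SOCIAL = "Social"
    SPAM = "Spam"


class Action(BaseModel):
    action_type: ActionType
    email_id: str  # hypothetical identifier for the target email
    folder: Optional[Folder] = None  # only meaningful for MOVE
    reply_text: Optional[str] = None  # only meaningful for REPLY / SCHEDULE


# Plain strings are validated and converted to the enum members.
move = Action(action_type="MOVE", email_id="e1", folder="Spam")
reply = Action(
    action_type=ActionType.REPLY,
    email_id="e2",
    reply_text="Confirming our meeting at 2 PM.",
)
```

Keeping folder and reply_text optional lets a single model cover all four actions; an environment could equally use one model per action type.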

📊 Task Difficulty Progression (25% Weight)

| Task ID | Level | Objective | Grader / Success Criteria |
| --- | --- | --- | --- |
| 1: Spam Guard | Easy | Identify and archive a clear spam email ($1M claims). | Successfully move the spam ID to the "Spam" folder. |
| 2: Inbox Zero | Medium | Categorize a mixture of work and social updates. | Correctly sort all items without misplacing a priority email. |
| 3: Coordinator | Hard | Schedule a new 2 PM meeting while avoiding a 10 AM conflict. | Generate a reply correctly confirming the non-conflicting time. |
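The actual graders are not shown here, so the following is only a sketch of the kind of checks the success criteria imply: a folder lookup for Spam Guard and an interval-overlap test for Coordinator. The function names, the (start, end) hour tuples, and the 0.0/1.0 scoring are assumptions.

```python
from typing import Dict, List, Tuple


def has_conflict(start: int, end: int, busy: List[Tuple[int, int]]) -> bool:
    """Return True if [start, end) overlaps any busy [s, e) slot (24-hour clock)."""
    return any(start < e and s < end for s, e in busy)


def grade_spam_guard(folder_by_email: Dict[str, str], spam_id: str) -> float:
    """Score 1.0 if the spam email ended up in the "Spam" folder, else 0.0."""
    return 1.0 if folder_by_email.get(spam_id) == "Spam" else 0.0


busy_slots = [(10, 11)]  # the existing 10 AM event from the Coordinator task

two_pm_ok = not has_conflict(14, 15, busy_slots)   # 2 PM slot is free
ten_am_ok = not has_conflict(10, 11, busy_slots)   # 10 AM slot is taken
spam_score = grade_spam_guard({"e1": "Spam"}, "e1")
```

Using half-open intervals means a meeting that starts exactly when another ends (11 to 12 after a 10 to 11 event) does not count as a conflict.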

🚀 Setup & Usage

  1. API Key: Set your OPENAI_API_KEY in your environment.
  2. Launch Server: python main.py
  3. Run Baseline: python inference.py

License: MIT