metadata
title: Email Assistant Env Environment Server
emoji: 📧
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
  - openenv

OpenEnv: AI Email Assistant 📧

An OpenEnv-compliant environment (EmailEnv) that lets AI agents handle professional email triage and response workflows. The project wraps a production-grade FastAPI application in a Gymnasium-style environment for evaluation and automated grading.


🚀 Purpose

The AI Email Assistant is designed to benchmark and evaluate autonomous agents in the domain of email management. It provides a standardized interface for agents to perform tasks ranging from simple spam classification to complex, multi-step customer support resolutions.

🏗️ Architecture

The project follows a modular architecture separating the core business logic from the OpenEnv environment wrapper and the LLM inference layer:

graph TD
    A[Agent / LLM] -- Action --> B["EmailEnv (OpenEnv)"]
    B -- Observations/Rewards --> A
    B -- Internal Calls --> C[EmailAssistantCore]
    C -- AI Logic --> D[AIEngine]
    C -- Persistence --> E[SQLite Memory]
    B -- Task Logic --> F[Task Handlers]
    B -- Scoring --> G[Rubric Graders]

  • EmailEnv (OpenEnv): The core reward and state management layer.
  • EmailAssistantCore: Handles the underlying business functions (fetching, classifying, drafting).
  • FastAPI OpenEnv Interface: Exposes the environment as a REST API for remote evaluation.
  • YourEnv Client: A developer-friendly Python client for local testing and inference.

🌟 Innovative Rubric Design

Our project introduces a Multi-Dimensional Dense Reward Rubric that goes beyond simple "correct/incorrect" binary scoring. The reward is calculated at every step and considers:

  • Classification Precision: Accurate intent detection (Sales vs. Support) is highly rewarded (1.0).
  • Reasoning Quality: Agents are rewarded for providing high-quality reasoning (0.2), encouraging transparency.
  • Tone Compliance: Generating replies in the requested tone (professional, friendly, urgent) earns bonus points (0.3).
  • Workflow Efficiency: A step penalty (-0.01) is applied to encourage efficient resolutions, while completing a task fully grants a large resolution bonus (2.0).
  • Strict Guardrails: Penalties are applied for invalid actions (e.g., trying to send before classifying) and duplicate actions, ensuring safe and reliable agent behavior.
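
The rubric above could be sketched as a per-step reward function. This is only an illustration, not the project's grader: the `StepOutcome` type and `step_reward` function are hypothetical names, the components are assumed to be additive, and the penalty sizes for invalid and duplicate actions are assumed because the rubric does not state them.

```python
from dataclasses import dataclass

# Values quoted in the rubric.
CLASSIFICATION_REWARD = 1.0
REASONING_REWARD = 0.2
TONE_REWARD = 0.3
STEP_PENALTY = -0.01
RESOLUTION_BONUS = 2.0

# Assumed magnitudes: the rubric names these penalties but gives no numbers.
INVALID_ACTION_PENALTY = -0.5
DUPLICATE_ACTION_PENALTY = -0.25


@dataclass
class StepOutcome:
    """Hypothetical summary of what a single agent step achieved."""
    correct_classification: bool = False
    good_reasoning: bool = False
    tone_matched: bool = False
    task_resolved: bool = False
    invalid_action: bool = False
    duplicate_action: bool = False


def step_reward(outcome: StepOutcome) -> float:
    """Combine the rubric components into one dense reward for a step."""
    reward = STEP_PENALTY  # every step costs a little, to favor short episodes

    # Guardrails: a rejected action earns nothing beyond its penalty.
    if outcome.invalid_action:
        return round(reward + INVALID_ACTION_PENALTY, 2)
    if outcome.duplicate_action:
        return round(reward + DUPLICATE_ACTION_PENALTY, 2)

    if outcome.correct_classification:
        reward += CLASSIFICATION_REWARD
    if outcome.good_reasoning:
        reward += REASONING_REWARD
    if outcome.tone_matched:
        reward += TONE_REWARD
    if outcome.task_resolved:
        reward += RESOLUTION_BONUS
    return round(reward, 2)


# A correct classification with good reasoning: -0.01 + 1.0 + 0.2
print(step_reward(StepOutcome(correct_classification=True, good_reasoning=True)))  # 1.19
```

Because the reward is computed at every step, an agent gets a signal for partial progress (a correct classification) even when it never reaches the resolution bonus.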

🏃 How to Run

Local Development

  1. Environment Setup:

    cd "C:\Users\gaura\OneDrive\Documents\New project\hf-space-email-assistant"
    python -m venv .venv
    .\.venv\Scripts\activate
    python -m pip install -r requirements.txt
    
  2. LLM Key (Gemini Flash):

    • Recommended (Secrets): set LLM_API_KEY (HF Spaces / GitHub / Docker / etc.)
    • Alternative: set GEMINI_API_KEY (or GOOGLE_API_KEY) and GEMINI_MODEL=gemini-1.5-flash

  3. Run Backend (Port 7860):

    uvicorn app.main:app --host 0.0.0.0 --port 7860
    
  4. Run Frontend (Port 3000):

    cd frontend
    npm run dev
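
For step 2, the key lookup could be sketched as below. The environment variable names come from the list above, but the precedence order, the fallback to `gemini-1.5-flash` when `GEMINI_MODEL` is unset, and the `resolve_llm_config` helper itself are assumptions made for illustration. The application's actual lookup may differ.

```python
import os
from typing import Mapping, Optional, Tuple

# Assumed precedence: the recommended secret first, then the alternatives.
KEY_VARS = ("LLM_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY")
DEFAULT_MODEL = "gemini-1.5-flash"  # assumed default when GEMINI_MODEL is unset


def resolve_llm_config(env: Mapping[str, str] = os.environ) -> Tuple[Optional[str], str]:
    """Return (api_key, model); api_key is None when no key variable is set."""
    # Take the first non-empty key variable in precedence order.
    api_key = next((env[name] for name in KEY_VARS if env.get(name)), None)
    model = env.get("GEMINI_MODEL") or DEFAULT_MODEL
    return api_key, model


if __name__ == "__main__":
    key, model = resolve_llm_config()
    print("key configured:", key is not None, "| model:", model)
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to test without touching the real environment.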
    

Evaluation & Inference

To run the full suite of tasks using the deterministic baseline:

python inference.py

To validate OpenEnv compliance:

.\.venv\Scripts\openenv validate --env email-assistant-env

📊 Benchmark Tasks

  1. Spam Detection (Easy): Correctly identify malicious or promotional content.
  2. Intent Classification (Medium): Classify inbound customer inquiries with reasoning.
  3. Multi-Step Resolution (Hard): Classify → Request Info (if missing) → Draft Final Reply.
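
The ordering in the hard task can be illustrated with a small deterministic policy. The action names (`classify`, `request_info`, `draft_reply`) and the `next_action` helper are hypothetical and chosen for readability. They are not taken from the environment's actual action space.

```python
from typing import List, Optional


def next_action(history: List[str], missing_info: bool) -> Optional[str]:
    """Pick the next step of Classify -> Request Info (if missing) -> Draft Final Reply."""
    if "classify" not in history:
        return "classify"  # classification always comes first
    if missing_info and "request_info" not in history:
        return "request_info"  # only needed when details are missing
    if "draft_reply" not in history:
        return "draft_reply"
    return None  # workflow complete


def run_workflow(missing_info: bool) -> List[str]:
    """Apply the policy until it reports that nothing is left to do."""
    history: List[str] = []
    while (action := next_action(history, missing_info)) is not None:
        history.append(action)
    return history


print(run_workflow(missing_info=True))   # ['classify', 'request_info', 'draft_reply']
print(run_workflow(missing_info=False))  # ['classify', 'draft_reply']
```

A policy of this shape never repeats an action and never drafts before classifying, which are the two behaviors the guardrail penalties are meant to discourage.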

Built with ❤️ for the OpenEnv Hackathon.