metadata

title: Customer Support Env
emoji: 🎧
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
tags:
  - openenv
pinned: false

🎧 AI-Powered Customer Support Ticket Resolution Environment

An OpenEnv-compatible environment for training AI agents to handle real-world customer support scenarios — from simple FAQs to complex, multi-step escalations with angry customers.

1. Environment Overview

This environment simulates a real customer support helpdesk where an AI agent must:

Read incoming customer tickets with varying complexity
Understand customer sentiment (neutral → frustrated → angry)
Apply company policies (refund, shipping, escalation)
Craft professional, empathetic, and accurate responses
Resolve issues within a limited number of steps

The agent interacts using the standard OpenEnv API: reset(), step(), and state().

2. Real-World Use Case

Customer support is one of the most common AI deployment targets. This environment captures realistic challenges:

Challenge	How It's Simulated
Tone matching	Grader evaluates empathy, professionalism, and harmful language
Policy reasoning	Agent must apply correct refund/shipping/escalation policies
Multi-turn dialogue	Customers send follow-up messages that depend on the quality of the agent's previous response
Escalation handling	Hard tasks require knowing when and how to escalate
Angry customers	Sentiment ranges from neutral to furious, requiring different strategies

3. Action Space

The agent sends a SupportAction with:

from pydantic import BaseModel

class SupportAction(BaseModel):
    response_text: str        # Agent's response to the customer (1-2000 chars)
    action_type: str          # "respond" | "escalate" | "resolve" | "request_info"
    internal_notes: str = ""  # Optional internal notes (not visible to customer)

Action Type	Effect
`respond`	Continue the conversation
`resolve`	Mark ticket as resolved (ends episode)
`escalate`	Escalate to senior support
`request_info`	Ask customer for more information
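
As a rough illustration of how a client might sanity-check an action before sending it, the sketch below builds a request body with the same shape as the /step call shown later in this README. `build_step_payload` and `ALLOWED_ACTION_TYPES` are hypothetical names, not part of this repository. The four action types and the 1-2000 character limit come from the definitions above, and the real server may validate differently.

```python
# Hypothetical client-side helper; names here are illustrative, not the repo's API.
ALLOWED_ACTION_TYPES = {"respond", "escalate", "resolve", "request_info"}


def build_step_payload(response_text: str, action_type: str = "respond",
                       internal_notes: str = "") -> dict:
    """Return a dict shaped like the JSON body of POST /step."""
    if action_type not in ALLOWED_ACTION_TYPES:
        raise ValueError(f"unknown action_type: {action_type!r}")
    if not 1 <= len(response_text) <= 2000:
        raise ValueError("response_text must be 1-2000 characters")
    return {
        "action": {
            "response_text": response_text,
            "action_type": action_type,
            "internal_notes": internal_notes,
        }
    }


payload = build_step_payload("Thank you for reaching out!", "respond")
print(payload["action"]["action_type"])
```

Rejecting an unknown action type on the client keeps a typo from costing one of the limited steps.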

4. Observation Space

After each step, the agent receives a SupportObservation:

class SupportObservation(BaseModel):
    ticket: TicketInfo              # Ticket metadata (ID, category, priority, customer info)
    conversation_history: list      # Full message history
    current_message: str            # Latest customer message to respond to
    policy_context: str             # Relevant company policies
    task_id: str                    # Current task identifier
    difficulty: str                 # "easy" | "medium" | "hard"
    max_steps: int                  # Maximum steps allowed
    steps_remaining: int            # Steps left before timeout
    done: bool                      # Whether episode is complete
    reward: float                   # Cumulative reward so far
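
One plausible way for an LLM-based agent to consume this observation is to flatten it into a prompt. The sketch below works on a plain dict with the same field names as above. `build_prompt` is a hypothetical helper, the example observation is made up, and the element type of `conversation_history` is not specified here, so each entry is simply converted with `str()`.

```python
def build_prompt(obs: dict) -> str:
    """Flatten an observation (as a plain dict) into a single prompt string."""
    history = "\n".join(str(m) for m in obs.get("conversation_history", []))
    return (
        f"Company policies:\n{obs['policy_context']}\n\n"
        f"Conversation so far:\n{history}\n\n"
        f"Customer: {obs['current_message']}\n"
        f"(difficulty: {obs['difficulty']}, steps remaining: {obs['steps_remaining']})\n"
        "Write a professional, empathetic reply."
    )


# Made-up observation for illustration only.
example_obs = {
    "policy_context": "Standard shipping takes 5-7 business days.",
    "conversation_history": [],
    "current_message": "Where is my order?",
    "difficulty": "easy",
    "steps_remaining": 3,
}
print(build_prompt(example_obs))
```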

5. Reward Design

The reward function uses a dense, multi-axis scoring system:

Scoring Axes

Axis	Weight (varies by task)	What It Measures
Correctness	0.30-0.35	Keyword/concept matching against expected response elements
Tone	0.30-0.40	Professional, empathetic language vs. harmful/rude signals
Completeness	0.30-0.40	Checklist of required response components

Reward Breakdown Example

+0.30 → Correctly identifies the issue (correctness)
+0.30 → Professional and empathetic tone (tone)
+0.40 → Addresses all required elements (completeness)
─────
 1.00 → Perfect score
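
The breakdown above reads as a weighted sum of per-axis scores. The following is a minimal sketch under that assumption; `weighted_reward` is a hypothetical function, not the code in grader.py, and the per-axis scores in the second call are invented for illustration.

```python
def weighted_reward(axis_scores: dict, weights: dict) -> float:
    """Combine per-axis scores in [0, 1] using weights that sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(weights[axis] * axis_scores[axis] for axis in weights)


# Weights taken from the breakdown example above.
weights = {"correctness": 0.30, "tone": 0.30, "completeness": 0.40}

perfect = weighted_reward({"correctness": 1.0, "tone": 1.0, "completeness": 1.0}, weights)
partial = weighted_reward({"correctness": 1.0, "tone": 0.5, "completeness": 0.5}, weights)
print(round(perfect, 2), round(partial, 2))  # 1.0 0.65
```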

Penalties (deducted from total)

Penalty	Deduction	Trigger
Empty response	-0.30	Response has fewer than 5 words
Repeated response	-0.15 to -0.30	Response copied from a previous turn
Harmful language	-0.50	Offensive or inappropriate content
Irrelevant content	-0.40	Off-topic response
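
To make the penalty table concrete, here is a small sketch covering the two penalties that can be checked mechanically. It is not the grader's actual logic: `penalty_for` and `final_reward` are hypothetical, repetition is detected by exact match only, penalties are assumed to stack, and clamping the result to [0, 1] is an assumption as well.

```python
def penalty_for(response: str, previous_responses: list) -> float:
    """Return the total deduction for a response (illustrative rules only)."""
    penalty = 0.0
    if len(response.split()) < 5:
        penalty += 0.30  # "Empty response" row: fewer than 5 words
    if response.strip() in {p.strip() for p in previous_responses}:
        penalty += 0.15  # "Repeated response" row: lower end of the listed range
    return penalty


def final_reward(base_score: float, penalty: float) -> float:
    # Clamping to [0, 1] is an assumption; the README does not say how
    # negative totals are handled.
    return max(0.0, min(1.0, base_score - penalty))


first = "Your order shipped yesterday and should arrive within 5-7 business days."
print(penalty_for("Ok.", []))                               # 0.3
print(penalty_for(first, [first]))                          # 0.15
print(round(final_reward(0.2, penalty_for("Ok.", [])), 2))  # 0.0
```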

6. Task Descriptions

Task 1: Simple FAQ (Easy)

Ticket: "Where is my order?"
Customer: Sarah Johnson (Neutral sentiment)
Expected: Reference order ID, explain shipping timeframe (5-7 business days), mention tracking email
Max Steps: 3
Policy Context: Shipping policy

Task 2: Conditional Refund (Medium)

Ticket: "Refund for opened laptop bag with defective stitching"
Customer: Michael Chen (Frustrated sentiment)
Expected: Identify as manufacturing defect, offer full refund + replacement option, explain return process
Max Steps: 5
Policy Context: Refund policy + Return policy
Follow-ups: Customer provides photos, asks about timeline

Task 3: Complex Complaint Escalation (Hard)

Ticket: "Wrong item, late delivery, rude staff"
Customer: David Martinez (Angry sentiment)
Expected: Address ALL three issues, offer refund + compensation, escalate to manager, provide written confirmation
Max Steps: 7
Policy Context: All policies (refund, return, shipping, escalation)
Follow-ups: Threats to file complaints, demands for specifics, requests for written confirmation
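
The "address ALL three issues" requirement of Task 3 maps naturally onto a checklist. The sketch below shows one way such a completeness check could work, using keyword groups. The keywords and the `completeness` function are illustrative assumptions; the real rubric is defined in tasks.py and may look quite different.

```python
# Illustrative keyword groups; the real rubric lives in tasks.py and may differ.
HARD_TASK_CHECKLIST = {
    "wrong_item": ["wrong item", "incorrect item"],
    "late_delivery": ["late", "delay"],
    "rude_staff": ["rude", "staff", "behavior"],
    "refund": ["refund"],
    "escalation": ["escalate", "manager"],
}


def completeness(response: str, checklist: dict) -> float:
    """Fraction of checklist items with at least one keyword in the response."""
    text = response.lower()
    hits = sum(any(kw in text for kw in kws) for kws in checklist.values())
    return hits / len(checklist)


reply = ("I'm sorry you received the wrong item and that it arrived late. "
         "I'm issuing a full refund and escalating this to a manager.")
print(completeness(reply, HARD_TASK_CHECKLIST))  # 0.8
```

The example reply scores 0.8 because it never acknowledges the rude staff, which is exactly the kind of omission the hard task is meant to catch.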

7. Setup Instructions

Prerequisites

Python 3.10+
Docker (optional, for containerized deployment)

Local Setup

# Clone the repository
git clone https://github.com/MathiyazhaganNTL/openenv_scaler.git
cd openenv_scaler

# Install dependencies
pip install -r requirements.txt

# Run validation
python validate.py

Environment Variables (for inference)

cp .env.example .env
# Edit .env with your API keys

Variable	Default	Description
`API_BASE_URL`	`https://api.openai.com/v1`	LLM API endpoint
`MODEL_NAME`	`gpt-3.5-turbo`	Model to use
`OPENAI_API_KEY`	—	API key
`HF_TOKEN`	—	Hugging Face token (alternative to `OPENAI_API_KEY`)
`ENV_BASE_URL`	`http://localhost:8000`	Environment server URL
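
For reference, a script could pick these variables up roughly as follows. This is a sketch, not the code in inference.py: `load_config` is a hypothetical helper, and treating `HF_TOKEN` as a fallback used only when `OPENAI_API_KEY` is unset is an assumption.

```python
import os


def load_config(env=None) -> dict:
    """Read inference settings, falling back to the defaults in the table above."""
    env = os.environ if env is None else env
    return {
        "api_base_url": env.get("API_BASE_URL", "https://api.openai.com/v1"),
        "model_name": env.get("MODEL_NAME", "gpt-3.5-turbo"),
        # Assumption: HF_TOKEN is used only when OPENAI_API_KEY is unset.
        "api_key": env.get("OPENAI_API_KEY") or env.get("HF_TOKEN"),
        "env_base_url": env.get("ENV_BASE_URL", "http://localhost:8000"),
    }


config = load_config({"MODEL_NAME": "my-model"})
print(config["model_name"], config["env_base_url"])
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check without touching the real environment.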

8. Run Instructions

Start the Environment Server

# Direct
python -m server.app

# Or with uvicorn
uvicorn server.app:app --host 0.0.0.0 --port 8000

# Or with Docker
docker build -t customer-support-env .
docker run -p 8000:8000 customer-support-env

Run Baseline Inference

# Start the server first (in another terminal)
uvicorn server.app:app --host 0.0.0.0 --port 8000

# Run inference
python inference.py

API Usage Examples

# Health check
curl http://localhost:8000/health

# List tasks
curl http://localhost:8000/tasks

# Reset environment
curl -X POST http://localhost:8000/reset \
  -H "Content-Type: application/json" \
  -d '{"task_id": "easy_faq"}'

# Step
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action": {"response_text": "Thank you for reaching out!", "action_type": "respond"}}'

# Get state
curl http://localhost:8000/state
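
The same calls can be made from Python with only the standard library. The sketch below mirrors the curl commands above; `make_post` and `send` are hypothetical helpers, and it assumes the server replies with a JSON body, which is not stated here.

```python
import json
import urllib.request


def make_post(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the environment server."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON body; requires a running server."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


reset_req = make_post("http://localhost:8000", "/reset", {"task_id": "easy_faq"})
print(reset_req.get_method(), reset_req.full_url)
# With the server running: observation = send(reset_req)
```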

Python Client Usage

from server.environment import CustomerSupportEnvironment
from models import SupportAction

env = CustomerSupportEnvironment()

# Reset to a task
obs = env.reset(task_id="easy_faq")
print(obs.current_message)  # Customer's first message

# Respond
action = SupportAction(
    response_text="Hi Sarah! Your order ORD-55821 ships in 5-7 business days...",
    action_type="respond",
)
obs, reward, done, info = env.step(action)
print(f"Reward: {reward:.4f}")
print(f"Score breakdown: {info['reward_breakdown']}")
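
Extending the single step above to a full episode, a driver loop might look like the sketch below. `run_episode` is a hypothetical helper that relies only on the `reset`/`step` signatures shown above. `StubEnv` is a stand-in so the loop runs without the repository installed; it uses plain strings as actions and returns a fixed reward, neither of which reflects the real environment.

```python
from types import SimpleNamespace


def run_episode(env, policy, task_id: str) -> list:
    """Run one episode and return the list of per-step rewards."""
    obs = env.reset(task_id=task_id)
    rewards, done = [], False
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        rewards.append(reward)
    return rewards


class StubEnv:
    """Stand-in for the real environment so the loop can run anywhere."""

    def reset(self, task_id):
        self.steps_remaining = 3
        return SimpleNamespace(current_message="Where is my order?")

    def step(self, action):
        self.steps_remaining -= 1
        done = action == "resolve" or self.steps_remaining == 0
        obs = SimpleNamespace(current_message="Thanks!")
        return obs, 0.5, done, {}


rewards = run_episode(StubEnv(), lambda obs: "respond", task_id="easy_faq")
print(len(rewards))  # 3
```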

9. Baseline Results

Running the baseline inference with gpt-3.5-turbo:

Task	Difficulty	Avg Reward	Steps
`easy_faq`	Easy	~0.65	1–2
`medium_refund`	Medium	~0.55	3–4
`hard_escalation`	Hard	~0.45	4–6
Overall	—	~0.55	—

These figures are approximate and vary with the model used. Stronger models tend to score higher because their responses are more empathetic, accurate, and complete.

Project Structure

openenv/
├── openenv.yaml           # OpenEnv manifest (metadata, tasks, config)
├── models.py              # Pydantic models (Action, Observation, State, Reward)
├── tasks.py               # Task definitions (3 tasks, rubrics, policies)
├── grader.py              # Deterministic grading engine
├── inference.py           # Baseline LLM inference script
├── validate.py            # Environment validation script
├── requirements.txt       # Python dependencies
├── pyproject.toml         # Project configuration
├── Dockerfile             # Docker container definition
├── .dockerignore          # Docker build exclusions
├── .env.example           # Environment variable template
├── .gitignore             # Git ignore rules
├── README.md              # This file
└── server/
    ├── __init__.py
    ├── environment.py     # Core environment (reset/step/state)
    └── app.py             # FastAPI HTTP server

HuggingFace Spaces Deployment

This environment is designed for deployment as a Docker-based HuggingFace Space:

Create a new Space with Docker SDK
Push the code to the Space repository
The Space will auto-build and expose the API on the port declared as app_port in the Space metadata
Tag the Space with openenv

# Using openenv CLI
openenv push --repo-id mathi3046/customer-support-env

The API endpoint POST /reset will respond with HTTP 200, confirming the Space is operational.

License

MIT License. See LICENSE for details.