---
title: SupportEnv
emoji: 🎫
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
tags:
  - openenv
  - customer-support
  - nlp
  - ticket-triage
  - agent-evaluation
pinned: false
---

# SupportEnv

SupportEnv is an OpenEnv-compliant environment for evaluating LLM agents on customer support ticket triage. Each episode presents a realistic support ticket and asks the agent to classify it, extract structured information from it, or write a resolution for it. The result is scored deterministically against ground-truth labels.

## Tasks

| Task | Difficulty | Action | Max Steps |
|------|-----------|--------|-----------|
| Task 1 — Ticket Classification | Easy | `classify` | 3 |
| Task 2 — Information Extraction | Medium | `extract` | 5 |
| Task 3 — Resolution Generation | Hard | `respond` | 8 |

**Task 1 — Ticket Classification (Easy)**  
Assign a `category` (billing / technical / account / feature_request / complaint / general) and `priority` (low / medium / high / critical) to each ticket.

**Task 2 — Information Extraction (Medium)**  
Extract structured entities (IDs, names, amounts, dates) and identify the list of required resolution actions.

**Task 3 — Resolution Generation (Hard)**  
Write a professional customer-facing response and an ordered list of internal resolution steps. Responses are graded on keyword coverage, step completeness, tone adherence, and minimum length.

## Observation Space

Each observation includes:

- `task_id`, `task_description`, `episode_id`
- `ticket` object with `ticket_id`, `subject`, `body`, `customer_tier`, `account_age_days`, `previous_tickets`, `attachments`
- `thread_history` as ordered action summaries
- `available_actions` for the current task state
- `step_number`, `max_steps`
- `hint` (optional guidance)
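
To make the shape concrete, the fields above could be modeled roughly as follows. This is only a sketch: the field names come from the list above, but the class names, the helper `steps_remaining`, and every Python type (for example `attachments` as a list of strings, or `previous_tickets` as a count) are assumptions rather than the environment's actual schema.

```python
from typing import List, Optional, TypedDict


class Ticket(TypedDict):
    ticket_id: str
    subject: str
    body: str
    customer_tier: str       # assumed to be a string label
    account_age_days: int
    previous_tickets: int    # assumed to be a count
    attachments: List[str]   # assumed to be a list of file names


class Observation(TypedDict):
    task_id: str
    task_description: str
    episode_id: str
    ticket: Ticket
    thread_history: List[str]    # ordered action summaries
    available_actions: List[str]
    step_number: int
    max_steps: int
    hint: Optional[str]          # optional guidance; assumed nullable


def steps_remaining(obs: Observation) -> int:
    """Return how many steps are left before the episode reaches max_steps."""
    return max(0, obs["max_steps"] - obs["step_number"])
```

An agent can use `steps_remaining` to decide when to stop refining and send `submit`.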

## Action Space

Supported `action.action_type` values:

- `classify`: requires `category` and `priority`
- `extract`: requires `extracted_entities` and `required_actions`
- `respond`: requires `response_text` and `resolution_steps`
- `submit`: closes the episode and triggers terminal grading
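
A client can check these requirements before sending an action. The sketch below encodes the required fields listed above; the names `REQUIRED_FIELDS` and `missing_fields` are hypothetical, and the server's own validation may be stricter (for example, by checking allowed `category` and `priority` values).

```python
from typing import Any, Dict, List

# Required fields per action type, taken from the list above.
REQUIRED_FIELDS = {
    "classify": ("category", "priority"),
    "extract": ("extracted_entities", "required_actions"),
    "respond": ("response_text", "resolution_steps"),
    "submit": (),
}


def missing_fields(action: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent from an action payload."""
    action_type = action.get("action_type")
    if action_type not in REQUIRED_FIELDS:
        raise ValueError(f"unsupported action_type: {action_type!r}")
    return [name for name in REQUIRED_FIELDS[action_type] if name not in action]
```

For instance, `missing_fields({"action_type": "classify", "category": "billing"})` reports that `priority` is missing.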

## API

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/reset` | Start a new episode |
| `POST` | `/step` | Submit an action |
| `GET` | `/state` | Get current episode state |
| `POST` | `/grader` | Grade a finished episode |
| `GET` | `/tasks` | List all tasks |
| `GET` | `/health` | Liveness check |
| `GET` | `/docs` | OpenAPI docs |

### Reset
```json
POST /reset
{"task_id": "task1", "ticket_index": 0}
```

### Step — Task 1 (classify)
```json
POST /step
{
  "episode_id": "<id>",
  "action": {"action_type": "classify", "category": "billing", "priority": "high"}
}
```

### Step — Task 2 (extract)
```json
POST /step
{
  "episode_id": "<id>",
  "action": {
    "action_type": "extract",
    "extracted_entities": {"customer_name": "Alice", "invoice_number": "INV-001"},
    "required_actions": ["issue_refund", "send_corrected_invoice"]
  }
}
```

### Step — Task 3 (respond)
```json
POST /step
{
  "episode_id": "<id>",
  "action": {
    "action_type": "respond",
    "response_text": "Dear customer, we sincerely apologize...",
    "resolution_steps": ["verify_account", "issue_refund", "send_confirmation"]
  }
}
```

### Submit
```json
POST /step
{"episode_id": "<id>", "action": {"action_type": "submit"}}
```
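
The requests above can be chained into a complete Task 1 episode from Python. The following is a minimal sketch using only the standard library. It assumes the server is reachable at `http://localhost:7860` and that the `/reset` reply exposes `episode_id` either at the top level or under an `observation` key; the actual response layout is not documented here, so adjust `extract_episode_id` as needed. The function names are illustrative.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

BASE_URL = "http://localhost:7860"  # assumed local server address

Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def post_json(path: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """POST a JSON body to the environment and decode the JSON reply."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_episode_id(reset_reply: Dict[str, Any]) -> str:
    """Find the episode id in a /reset reply (the layout is an assumption)."""
    if "episode_id" in reset_reply:
        return reset_reply["episode_id"]
    return reset_reply["observation"]["episode_id"]


def run_task1_episode(post: Transport = post_json) -> Dict[str, Any]:
    """Reset, classify once, then submit. Returns the final /step reply."""
    episode_id = extract_episode_id(post("/reset", {"task_id": "task1", "ticket_index": 0}))
    post("/step", {
        "episode_id": episode_id,
        "action": {"action_type": "classify", "category": "billing", "priority": "high"},
    })
    return post("/step", {"episode_id": episode_id, "action": {"action_type": "submit"}})
```

Passing the transport as a parameter keeps the episode logic testable without a running server.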

## Scoring

**Task 1:** category match (0.50) + priority match (0.40) + efficiency (0.10)

**Task 2:** entity coverage (0.60) + action coverage (0.30) + no hallucination (0.10)

**Task 3:** keyword coverage (0.30) + step coverage (0.30) + tone compliance (0.25) + length adequate (0.10) + non-empty steps (0.05)
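
Read as weights, each task's components sum to 1.0. The sketch below shows how such a total could be computed, assuming each component is a score in [0, 1] and the final score is their weighted sum. The component key names and the `task2`/`task3` identifiers are hypothetical (only `task1` appears in the examples above), and the real grader may combine components differently.

```python
from typing import Dict

# Weights copied from the scoring breakdown above; key names are illustrative.
TASK_WEIGHTS = {
    "task1": {"category_match": 0.50, "priority_match": 0.40, "efficiency": 0.10},
    "task2": {"entity_coverage": 0.60, "action_coverage": 0.30, "no_hallucination": 0.10},
    "task3": {
        "keyword_coverage": 0.30,
        "step_coverage": 0.30,
        "tone_compliance": 0.25,
        "length_adequate": 0.10,
        "non_empty_steps": 0.05,
    },
}


def weighted_score(task_id: str, components: Dict[str, float]) -> float:
    """Combine per-component scores in [0, 1] into one weighted total."""
    weights = TASK_WEIGHTS[task_id]
    total = sum(weight * components.get(name, 0.0) for name, weight in weights.items())
    return round(total, 4)
```

Under these assumptions, a Task 1 answer with the correct category, the wrong priority, and full efficiency credit would total 0.50 + 0.00 + 0.10 = 0.60.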

## Running Locally

```bash
pip install -r requirements.txt
uvicorn app:app --host 0.0.0.0 --port 7860
```

## Running the Baseline Agent

```bash
export API_BASE_URL=https://router.huggingface.co/v1
export HF_TOKEN=your_token_here
export MODEL_NAME=Qwen/Qwen2.5-72B-Instruct
python inference.py
```

Environment variables for baseline LLM calls (only `HF_TOKEN` has no default):

- `API_BASE_URL` (default provided in code)
- `MODEL_NAME` (default provided in code)
- `HF_TOKEN` (must be provided)

Environment endpoint variables for the baseline:

- `OPENENV_BASE_URL` (preferred, default `http://localhost:7860`)
- `API_BASE_URL_ENV` (backward-compatible alias)
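
The precedence described above can be sketched as follows. This illustrates the lookup order only; the function name `resolve_env_base_url` is hypothetical, and `inference.py` may implement the lookup differently.

```python
import os
from typing import Mapping, Optional

DEFAULT_ENV_URL = "http://localhost:7860"


def resolve_env_base_url(environ: Optional[Mapping[str, str]] = None) -> str:
    """Pick the environment URL: preferred variable, then alias, then default."""
    env = os.environ if environ is None else environ
    url = env.get("OPENENV_BASE_URL") or env.get("API_BASE_URL_ENV") or DEFAULT_ENV_URL
    # Drop a trailing slash so paths like "/reset" can be appended directly.
    return url.rstrip("/")
```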

The baseline prints only the following structured lines to stdout:

- `[START] task=<...> env=<...> model=<...>`
- `[STEP] step=<...> action=<...> reward=<...> done=<...> error=<...>`
- `[END] success=<...> steps=<...> rewards=<...>`
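
Because every line follows a `[TAG] key=value ...` layout, the output is easy to post-process. The parser below is a sketch based only on the three templates above. It assumes a value never contains a space followed by another `word=` token, and the name `parse_log_line` is hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Matches the leading tag, e.g. "[STEP]", and captures the rest of the line.
LINE_PATTERN = re.compile(r"^\[(START|STEP|END)\]\s*(.*)$")
# Captures key=value pairs; a value runs until the next " key=" or end of line.
FIELD_PATTERN = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_log_line(line: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Split one baseline stdout line into (tag, fields); None if unrecognized."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    tag, rest = match.groups()
    return tag, dict(FIELD_PATTERN.findall(rest))
```

All values are returned as strings; converting `reward` or `done` to numeric or boolean types is left to the caller, since their exact textual format is not specified here.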

## Docker

```bash
docker build -t supportenv .
docker run -p 7860:7860 supportenv
```