---
title: Support Triage OpenEnv
emoji: "📨"
colorFrom: blue
colorTo: teal
sdk: docker
app_port: 7860
tags:
- openenv
- reinforcement-learning
- customer-support
license: mit
---
# Support Triage OpenEnv
A complete, real-world OpenEnv environment for training/evaluating agents on **customer support ticket triage**. The environment simulates what support teams actually do: read inbox tickets, classify urgency/category, draft safe responses, and resolve the right ticket.
## Why this environment
Most agent benchmarks under-model production support workflows. This environment focuses on practical support operations with:
- Multi-ticket inbox context selection
- Policy-compliant communication
- Priority + escalation decisions
- Deterministic graders and dense reward shaping
## OpenEnv API compliance
The environment exposes:
- `reset(task_id?: str) -> Observation`
- `step(action: Action) -> (Observation, Reward, done, info)`
- `state() -> dict`
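The contract above can be exercised with a minimal driver loop. The sketch below is illustrative only: `StubEnv` is a hypothetical stand-in with the same method signatures, not the real environment class (which lives in `src/support_triage_openenv/env.py`), and the observation/info dict contents are placeholders.

```python
from typing import Any, Optional

class StubEnv:
    """Hypothetical stand-in implementing the reset/step/state contract.
    The real implementation is in src/support_triage_openenv/env.py."""

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self.step_count = 0

    def reset(self, task_id: Optional[str] = None) -> dict:
        self.step_count = 0
        return {"task_id": task_id or "easy_password_reset", "step_count": 0}

    def step(self, action: dict):
        self.step_count += 1
        obs = {"step_count": self.step_count}
        reward = 0.0
        done = self.step_count >= self.max_steps
        info: dict[str, Any] = {"last_action": action.get("action_type")}
        return obs, reward, done, info

    def state(self) -> dict:
        return {"step_count": self.step_count}

# Generic episode loop: works against any object exposing reset/step/state.
env = StubEnv()
obs = env.reset("easy_password_reset")
done = False
while not done:
    obs, reward, done, info = env.step(
        {"action_type": "read_ticket", "ticket_id": "T1"}
    )
```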
Typed Pydantic models:
- `Observation`: [`src/support_triage_openenv/models.py`](src/support_triage_openenv/models.py)
- `Action`: [`src/support_triage_openenv/models.py`](src/support_triage_openenv/models.py)
- `Reward`: [`src/support_triage_openenv/models.py`](src/support_triage_openenv/models.py)
Metadata:
- `openenv.yaml`
## Action space
`Action` model fields:
- `action_type`: one of `read_ticket | classify_ticket | draft_reply | resolve_ticket`
- `ticket_id`: required for `read_ticket`, `classify_ticket`, `resolve_ticket`
- `priority`: optional enum `low | medium | high | urgent`
- `category`: optional enum `account | billing | technical | abuse | general`
- `needs_escalation`: optional bool
- `message`: text for `draft_reply`
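As a quick sanity check, the field rules above can be encoded in a small validator. This is a hand-rolled sketch mirroring the documented constraints, not the actual Pydantic `Action` model:

```python
TICKET_ACTIONS = {"read_ticket", "classify_ticket", "resolve_ticket"}
ACTION_TYPES = TICKET_ACTIONS | {"draft_reply"}
PRIORITIES = {"low", "medium", "high", "urgent"}
CATEGORIES = {"account", "billing", "technical", "abuse", "general"}

def validate_action(action: dict) -> list:
    """Return a list of violations of the documented Action field rules."""
    errors = []
    kind = action.get("action_type")
    if kind not in ACTION_TYPES:
        errors.append(f"unknown action_type: {kind!r}")
    # ticket_id is required for the three ticket-targeted actions
    if kind in TICKET_ACTIONS and not action.get("ticket_id"):
        errors.append(f"{kind} requires ticket_id")
    # draft_reply carries its text in `message`
    if kind == "draft_reply" and not action.get("message"):
        errors.append("draft_reply requires message")
    if action.get("priority") is not None and action["priority"] not in PRIORITIES:
        errors.append(f"invalid priority: {action['priority']!r}")
    if action.get("category") is not None and action["category"] not in CATEGORIES:
        errors.append(f"invalid category: {action['category']!r}")
    return errors
```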
## Observation space
`Observation` includes:
- `task_id`, `objective`, `step_count`, `max_steps`
- `inbox`: ticket metadata list (`ticket_id`, subject, tier, age, read flag)
- `current_ticket_content`: only visible after reading the selected ticket

- `latest_system_note`: feedback from last step
- `score_hint`: partial grader components (`read`, `classify`, `reply`, `resolve`)
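A `score_hint` can be collapsed into a single progress number for logging or shaping. The equal weighting below is an assumption for illustration; the real grader in `support_triage_openenv.graders` may weight components differently:

```python
def summarize_score_hint(score_hint: dict) -> float:
    """Average the partial grader components into a 0.0-1.0 progress value.
    Equal component weights are an illustrative assumption, not the
    authoritative grading formula."""
    components = ("read", "classify", "reply", "resolve")
    return sum(score_hint.get(k, 0.0) for k in components) / len(components)
```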
## Tasks and difficulty
1. `easy_password_reset` (Easy)
- Correctly process account lockout and send secure reset guidance.
2. `medium_billing_dispute` (Medium)
- Investigate duplicate billing with context ticket and provide policy-compliant refund timeline.
3. `hard_outage_incident` (Hard)
- Handle a high-stakes outage report requiring multi-ticket context, urgent escalation, and careful incident messaging.
Each task has deterministic grading in `support_triage_openenv.graders.grade_task`, returning a score in `[0.0, 1.0]`.
## Reward design
Reward is shaped and meaningful across the trajectory:
- Positive dense signal from partial grader progress (read/context, classification fields, reply quality, resolve correctness)
- Penalties for invalid actions, repeated loops, and malformed steps
- Final step guarantees score alignment with deterministic grader output
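The shaping scheme above can be sketched as a per-step function. The penalty value and the use of a progress delta here are illustrative assumptions, not the environment's actual constants:

```python
from typing import Optional

def shaped_reward(prev_progress: float, progress: float,
                  invalid_action: bool, final_step: bool,
                  grader_score: Optional[float] = None) -> float:
    """Illustrative sketch of the shaping described above. The -0.1
    penalty and delta-based dense signal are assumptions."""
    reward = progress - prev_progress          # dense partial-credit signal
    if invalid_action:
        reward -= 0.1                          # penalty for malformed steps
    if final_step and grader_score is not None:
        reward = grader_score                  # align final reward with grader
    return reward
```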
## Project structure
- `src/support_triage_openenv/env.py` - environment implementation
- `src/support_triage_openenv/models.py` - typed OpenEnv models
- `src/support_triage_openenv/tasks.py` - task specs (easy/medium/hard)
- `src/support_triage_openenv/graders.py` - deterministic grader logic
- `scripts/run_baseline.py` - OpenAI baseline inference runner
- `scripts/validate_env.py` - tests + optional `openenv validate`
- `app.py` - FastAPI app for HF Space runtime
- `Dockerfile` - containerized deployment
## Setup
```bash
cd /path/to/meta_hackathon
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
## Run tests
```bash
python -m pytest -q
```
## Run baseline
OpenAI model baseline:
```bash
export API_BASE_URL=https://your-openai-compatible-endpoint/v1
export MODEL_NAME=your-model-id
export HF_TOKEN=your-api-key
python inference.py --mode openai --output scores/inference_scores.json
```
Deterministic heuristic baseline:
```bash
python inference.py --mode heuristic --output scores/inference_scores.json
```
Outputs a JSON report to `scores/inference_scores.json` and emits structured stdout logs with `[START]`, `[STEP]`, `[END]`.
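Those structured lines are easy to post-process. The sketch below groups them by tag; the payload format after each tag is an assumption, since `inference.py` defines the authoritative line shape:

```python
import re

LOG_RE = re.compile(r"^\[(START|STEP|END)\]\s*(.*)$")

def parse_run_log(lines):
    """Group structured stdout lines by their [START]/[STEP]/[END] tags.
    Lines without a recognized tag are ignored."""
    events = {"START": [], "STEP": [], "END": []}
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m:
            events[m.group(1)].append(m.group(2))
    return events

# Hypothetical log payloads for illustration.
log = [
    "[START] task=easy_password_reset",
    "[STEP] action=read_ticket",
    "[STEP] action=resolve_ticket",
    "[END] score=1.0",
]
events = parse_run_log(log)
```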
## Run API locally
```bash
uvicorn app:app --host 0.0.0.0 --port 7860
```
Endpoints:
- `GET /health`
- `POST /reset`
- `POST /step`
- `GET /state`
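A stdlib-only client sketch for these endpoints, assuming the server from the `uvicorn` command above is running. The request body field names (`task_id`, `action`) are assumptions about the schema; check `app.py` for the authoritative shapes:

```python
import json
from urllib import request

BASE_URL = "http://localhost:7860"  # matches the uvicorn command above

def encode_body(payload: dict) -> bytes:
    """Serialize a JSON request body for the POST endpoints."""
    return json.dumps(payload).encode("utf-8")

def post(path: str, payload: dict) -> dict:
    """POST JSON to the running environment server and decode the reply."""
    req = request.Request(
        BASE_URL + path,
        data=encode_body(payload),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    obs = post("/reset", {"task_id": "easy_password_reset"})
    result = post("/step", {"action": {"action_type": "read_ticket",
                                       "ticket_id": "T1"}})
```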
## Docker
```bash
docker build -t support-triage-openenv .
docker run --rm -p 7860:7860 support-triage-openenv
```
## Hugging Face Space deployment
- Create a **Docker Space**.
- Push this repository to the Space.
- Keep `README.md` frontmatter tags including `openenv`.
- Space serves the API on port `7860`.
## One-command remote bootstrap
To have this local repo automatically create remotes and push to both GitHub and the Hugging Face Hub:
```bash
export GITHUB_USERNAME=your_github_user
export GITHUB_TOKEN=your_github_pat
export HF_USERNAME=your_hf_user
export HF_TOKEN=your_hf_token
bash scripts/bootstrap_remotes.sh support-triage-openenv
```
## Baseline scores (heuristic reproducible)
Generated with:
```bash
python inference.py --mode heuristic --output scores/inference_scores.json
```
- `easy_password_reset`: grader `1.0`, reward `1.0`
- `medium_billing_dispute`: grader `1.0`, reward `1.0`
- `hard_outage_incident`: grader `1.0`, reward `1.0`
- Overall average grader score: `1.0`
- Tracked reference artifact: `baseline_expected_scores.json`
## Pre-submission validator
Run full strict validation (all disqualification gates):
```bash
python pre_submission_validate.py --space-url https://your-space-name.hf.space
```
Local-only run while iterating (skips Docker daemon + remote space ping):
```bash
python pre_submission_validate.py --skip-docker --skip-space
```
Run organizer-provided script directly (integrated path):
```bash
bash scripts/pre_validation_script.sh https://your-space-name.hf.space .
```
Notes:
- `scripts/sample_inference_script.sh` is kept as organizer reference.
- Root `inference.py` is aligned to the required `[START]`, `[STEP]`, `[END]` line format.