---
title: Support Triage OpenEnv
emoji: "📨"
colorFrom: blue
colorTo: teal
sdk: docker
app_port: 7860
tags:
- openenv
- reinforcement-learning
- customer-support
license: mit
---

# Support Triage OpenEnv
|
|
A complete, real-world OpenEnv environment for training and evaluating agents on **customer support ticket triage**. The environment simulates what support teams actually do: read inbox tickets, classify urgency and category, draft safe responses, and resolve the right ticket.
|
|
| ## Why this environment |
|
|
| Most agent benchmarks under-model production support workflows. This environment focuses on practical support operations with: |
| - Multi-ticket inbox context selection |
| - Policy-compliant communication |
| - Priority + escalation decisions |
| - Deterministic graders and dense reward shaping |
|
|
| ## OpenEnv API compliance |
|
|
| The environment exposes: |
| - `reset(task_id?: str) -> Observation` |
| - `step(action: Action) -> (Observation, Reward, done, info)` |
| - `state() -> dict` |
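The call pattern these methods imply can be sketched with a toy stand-in. `ToyTriageEnv` below is illustrative only; the real implementation lives in `src/support_triage_openenv/env.py` and returns typed Pydantic models rather than plain dicts:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative stand-in for the reset/step/state interface shape described
# above. Field names and episode logic are assumptions for demonstration.
@dataclass
class ToyTriageEnv:
    max_steps: int = 3
    step_count: int = 0
    done: bool = False

    def reset(self, task_id: Optional[str] = None) -> dict:
        self.step_count = 0
        self.done = False
        return {"task_id": task_id or "easy_password_reset", "step_count": 0}

    def step(self, action: dict) -> tuple:
        self.step_count += 1
        self.done = self.step_count >= self.max_steps
        obs = {"step_count": self.step_count}
        # Toy reward: credit only a resolve action (the real shaping is denser).
        reward = 1.0 if action.get("action_type") == "resolve_ticket" else 0.0
        return obs, reward, self.done, {}

    def state(self) -> dict:
        return {"step_count": self.step_count, "done": self.done}

env = ToyTriageEnv()
obs = env.reset("easy_password_reset")
obs, reward, done, info = env.step({"action_type": "read_ticket", "ticket_id": "T-1"})
```

The same `reset → step* → done` loop drives both the heuristic and the model baselines.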
|
|
| Typed Pydantic models: |
| - `Observation`: [`src/support_triage_openenv/models.py`](src/support_triage_openenv/models.py) |
| - `Action`: [`src/support_triage_openenv/models.py`](src/support_triage_openenv/models.py) |
| - `Reward`: [`src/support_triage_openenv/models.py`](src/support_triage_openenv/models.py) |
|
|
| Metadata: |
| - `openenv.yaml` |
|
|
| ## Action space |
|
|
| `Action` model fields: |
| - `action_type`: one of `read_ticket | classify_ticket | draft_reply | resolve_ticket` |
| - `ticket_id`: required for `read_ticket`, `classify_ticket`, `resolve_ticket` |
| - `priority`: optional enum `low | medium | high | urgent` |
| - `category`: optional enum `account | billing | technical | abuse | general` |
| - `needs_escalation`: optional bool |
| - `message`: text for `draft_reply` |
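These constraints can be expressed as a small standalone validator. This is an illustrative sketch, not the actual Pydantic model in `src/support_triage_openenv/models.py`; treating `message` as required for `draft_reply` is an assumption here:

```python
# Enumerations copied from the field list above.
ACTION_TYPES = {"read_ticket", "classify_ticket", "draft_reply", "resolve_ticket"}
PRIORITIES = {"low", "medium", "high", "urgent"}
CATEGORIES = {"account", "billing", "technical", "abuse", "general"}
TICKET_REQUIRED = {"read_ticket", "classify_ticket", "resolve_ticket"}

def validate_action(action: dict) -> list:
    """Return a list of validation errors (empty if the action is well-formed)."""
    errors = []
    atype = action.get("action_type")
    if atype not in ACTION_TYPES:
        errors.append(f"unknown action_type: {atype!r}")
    if atype in TICKET_REQUIRED and not action.get("ticket_id"):
        errors.append(f"{atype} requires ticket_id")
    if atype == "draft_reply" and not action.get("message"):
        errors.append("draft_reply requires message")  # assumed constraint
    priority = action.get("priority")
    if priority is not None and priority not in PRIORITIES:
        errors.append(f"invalid priority: {priority!r}")
    category = action.get("category")
    if category is not None and category not in CATEGORIES:
        errors.append(f"invalid category: {category!r}")
    return errors

assert validate_action({"action_type": "read_ticket", "ticket_id": "T-1"}) == []
```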
|
|
| ## Observation space |
|
|
| `Observation` includes: |
| - `task_id`, `objective`, `step_count`, `max_steps` |
| - `inbox`: ticket metadata list (`ticket_id`, subject, tier, age, read flag) |
| - `current_ticket_content`: only visible after reading selected ticket |
| - `latest_system_note`: feedback from last step |
| - `score_hint`: partial grader components (`read`, `classify`, `reply`, `resolve`) |
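A sample observation shaped like these fields might look as follows. Values and exact key names (e.g. `read` for the read flag, `age` for ticket age) are illustrative assumptions, not the environment's actual schema:

```python
# Illustrative observation payload mirroring the field list above.
observation = {
    "task_id": "medium_billing_dispute",
    "objective": "Resolve the duplicate-charge dispute per refund policy.",
    "step_count": 2,
    "max_steps": 12,
    "inbox": [
        {"ticket_id": "T-101", "subject": "Charged twice", "tier": "pro", "age": "2h", "read": True},
        {"ticket_id": "T-102", "subject": "Login help", "tier": "free", "age": "1d", "read": False},
    ],
    "current_ticket_content": None,  # populated only after a read_ticket action
    "latest_system_note": "Ticket T-101 marked read.",
    "score_hint": {"read": 1.0, "classify": 0.0, "reply": 0.0, "resolve": 0.0},
}

def unread_ticket_ids(obs: dict) -> list:
    """Tickets the agent has not yet opened -- a natural next-action candidate set."""
    return [t["ticket_id"] for t in obs["inbox"] if not t["read"]]
```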
|
|
| ## Tasks and difficulty |
|
|
| 1. `easy_password_reset` (Easy) |
| - Correctly process account lockout and send secure reset guidance. |
|
|
| 2. `medium_billing_dispute` (Medium) |
| - Investigate duplicate billing with context ticket and provide policy-compliant refund timeline. |
|
|
| 3. `hard_outage_incident` (Hard) |
| - Handle a high-stakes outage report requiring multi-ticket context, urgent escalation, and careful incident messaging. |
|
|
Each task is graded deterministically by `support_triage_openenv.graders.grade_task`, which returns a score in `[0.0, 1.0]`.
|
|
| ## Reward design |
|
|
| Reward is shaped and meaningful across the trajectory: |
| - Positive dense signal from partial grader progress (read/context, classification fields, reply quality, resolve correctness) |
| - Penalties for invalid actions, repeated loops, and malformed steps |
| - Final step guarantees score alignment with deterministic grader output |
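A minimal sketch of this shaping idea, assuming equal component weights and small fixed penalties (these weights and penalty sizes are assumptions, not the environment's actual values):

```python
def shaped_reward(score_hint: dict, invalid_action: bool = False, repeated: bool = False) -> float:
    """Dense credit from partial grader components minus penalties, clamped to [0, 1].

    Illustrative only: the real shaping lives in the environment implementation.
    """
    components = ("read", "classify", "reply", "resolve")
    progress = sum(score_hint.get(k, 0.0) for k in components) / len(components)
    penalty = (0.1 if invalid_action else 0.0) + (0.05 if repeated else 0.0)
    return max(0.0, min(1.0, progress - penalty))
```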
|
|
| ## Project structure |
|
|
| - `src/support_triage_openenv/env.py` - environment implementation |
| - `src/support_triage_openenv/models.py` - typed OpenEnv models |
| - `src/support_triage_openenv/tasks.py` - task specs (easy/medium/hard) |
| - `src/support_triage_openenv/graders.py` - deterministic grader logic |
| - `scripts/run_baseline.py` - OpenAI baseline inference runner |
| - `scripts/validate_env.py` - tests + optional `openenv validate` |
| - `app.py` - FastAPI app for HF Space runtime |
| - `Dockerfile` - containerized deployment |
|
|
| ## Setup |
|
|
| ```bash |
cd path/to/support-triage-openenv
| python3 -m venv .venv |
| source .venv/bin/activate |
| pip install -r requirements.txt |
| ``` |
|
|
| ## Run tests |
|
|
| ```bash |
| python -m pytest -q |
| ``` |
|
|
| ## Run baseline |
|
|
| OpenAI model baseline: |
|
|
| ```bash |
| export API_BASE_URL=https://your-openai-compatible-endpoint/v1 |
| export MODEL_NAME=your-model-id |
| export HF_TOKEN=your-api-key |
| python inference.py --mode openai --output scores/inference_scores.json |
| ``` |
|
|
| Deterministic heuristic baseline: |
|
|
| ```bash |
| python inference.py --mode heuristic --output scores/inference_scores.json |
| ``` |
|
|
Writes a JSON report to `scores/inference_scores.json` and emits structured stdout logs with `[START]`, `[STEP]`, and `[END]` markers.
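Those marker lines are straightforward to post-process. A minimal parser sketch (the payload after each marker is an assumption here):

```python
def parse_run_log(lines: list) -> dict:
    """Summarize [START]/[STEP]/[END] stdout markers from a baseline run."""
    summary = {"started": False, "steps": 0, "ended": False}
    for line in lines:
        if line.startswith("[START]"):
            summary["started"] = True
        elif line.startswith("[STEP]"):
            summary["steps"] += 1
        elif line.startswith("[END]"):
            summary["ended"] = True
    return summary
```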
|
|
| ## Run API locally |
|
|
| ```bash |
| uvicorn app:app --host 0.0.0.0 --port 7860 |
| ``` |
|
|
| Endpoints: |
| - `GET /health` |
| - `POST /reset` |
| - `POST /step` |
| - `GET /state` |
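An episode could be driven over these endpoints roughly as follows. The request schemas are assumptions based on the `Action` model above, and the POST calls are shown commented out since they require a running server:

```python
BASE_URL = "http://localhost:7860"  # local uvicorn/Docker default from this README

# Payloads assumed to mirror the Action model; field names come from the
# "Action space" section above.
reset_payload = {"task_id": "easy_password_reset"}
step_payload = {"action": {"action_type": "read_ticket", "ticket_id": "T-1"}}

# With the server running, the loop would look roughly like:
# import requests
# obs = requests.post(f"{BASE_URL}/reset", json=reset_payload).json()
# result = requests.post(f"{BASE_URL}/step", json=step_payload).json()
# state = requests.get(f"{BASE_URL}/state").json()
```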
|
|
| ## Docker |
|
|
| ```bash |
| docker build -t support-triage-openenv . |
| docker run --rm -p 7860:7860 support-triage-openenv |
| ``` |
|
|
| ## Hugging Face Space deployment |
|
|
| - Create a **Docker Space**. |
| - Push this repository to the Space. |
| - Keep `README.md` frontmatter tags including `openenv`. |
| - Space serves the API on port `7860`. |
|
|
| ## One-command remote bootstrap |
|
|
To automatically create remotes for this local repo and push it to both GitHub and the Hugging Face Hub:
|
|
| ```bash |
| export GITHUB_USERNAME=your_github_user |
| export GITHUB_TOKEN=your_github_pat |
| export HF_USERNAME=your_hf_user |
| export HF_TOKEN=your_hf_token |
| bash scripts/bootstrap_remotes.sh support-triage-openenv |
| ``` |
|
|
## Baseline scores (heuristic, reproducible)
|
|
| Generated with: |
|
|
| ```bash |
| python inference.py --mode heuristic --output scores/inference_scores.json |
| ``` |
|
|
| - `easy_password_reset`: grader `1.0`, reward `1.0` |
| - `medium_billing_dispute`: grader `1.0`, reward `1.0` |
| - `hard_outage_incident`: grader `1.0`, reward `1.0` |
| - Overall average grader score: `1.0` |
| - Tracked reference artifact: `baseline_expected_scores.json` |
|
|
| ## Pre-submission validator |
|
|
| Run full strict validation (all disqualification gates): |
|
|
| ```bash |
| python pre_submission_validate.py --space-url https://your-space-name.hf.space |
| ``` |
|
|
| Local-only run while iterating (skips Docker daemon + remote space ping): |
|
|
| ```bash |
| python pre_submission_validate.py --skip-docker --skip-space |
| ``` |
|
|
| Run organizer-provided script directly (integrated path): |
|
|
| ```bash |
| bash scripts/pre_validation_script.sh https://your-space-name.hf.space . |
| ``` |
|
|
| Notes: |
| - `scripts/sample_inference_script.sh` is kept as organizer reference. |
| - Root `inference.py` is aligned to the required `[START]`, `[STEP]`, `[END]` line format. |
|
|