# Exploit Analysis: 911 Dispatch Supervisor

## Known Attack Vectors Considered & Closed

### 1. Reward Farming via Repeated Dispatch-Cancel Cycles

**Vector:** The agent dispatches a unit, immediately cancels, and re-dispatches, collecting a partial response_time reward on each cycle without ever resolving an incident.

**Mitigation:** A cancel action returns the unit to AVAILABLE but does NOT reset the incident survival clock. The clock keeps counting down regardless of agent action, so farming cancel-dispatch cycles only accelerates incident escalation and triggers the Safety Gate, collapsing the score to ≤0.2.
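
The interaction between cancel actions and the survival clock can be sketched as follows. This is a minimal illustration, not the environment's code: `Incident`, `dispatch`, `cancel`, `tick`, `farm_cycles`, and the unit name `MEDIC-1` are all hypothetical, and the clock is assumed to decrement by one per step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    """Hypothetical incident whose survival clock is measured in ticks."""
    survival_ticks_left: int
    assigned_unit: Optional[str] = None


def dispatch(incident: Incident, unit_id: str) -> None:
    """Assign a unit to the incident."""
    incident.assigned_unit = unit_id


def cancel(incident: Incident) -> None:
    """Release the unit (it becomes AVAILABLE); the clock is left untouched."""
    incident.assigned_unit = None


def tick(incident: Incident) -> None:
    """Advance time by one step; the clock runs regardless of agent actions."""
    incident.survival_ticks_left = max(0, incident.survival_ticks_left - 1)


def farm_cycles(incident: Incident, cycles: int) -> int:
    """Run dispatch-cancel cycles and return the remaining survival ticks."""
    for _ in range(cycles):
        dispatch(incident, "MEDIC-1")
        cancel(incident)
        tick(incident)
    return incident.survival_ticks_left
```

Under these assumptions, an incident that starts with five ticks is fully expired after five farming cycles, with no unit assigned to it.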
### 2. Safety Gate Bypass via P2-Only Dispatching

**Vector:** The agent ignores all P1 incidents and resolves only P2/P3 incidents, accumulating triage and response_time rewards without triggering the Safety Gate.

**Mitigation:** The Safety Gate activates if ANY P1 incident existed during the episode and its survival score is 0.0. The agent cannot prevent P1 incidents from existing, because the scenario fixture spawns them deterministically, so an ignored P1 incident expires with a survival score of 0.0 and trips the gate.
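
As a rough sketch of the gate logic described above: the function name `apply_safety_gate`, the constant `SAFETY_GATE_CAP`, and the choice to implement the collapse as a cap at 0.2 are assumptions made for illustration. The source only states that the score collapses to ≤0.2.

```python
from typing import Iterable

# Assumed cap, taken from the "≤0.2" figure stated in this document.
SAFETY_GATE_CAP = 0.2


def apply_safety_gate(raw_score: float, p1_survival_scores: Iterable[float]) -> float:
    """Cap the episode score if any P1 incident ended with survival score 0.0.

    `p1_survival_scores` holds one survival score per P1 incident that
    existed during the episode (hypothetical representation).
    """
    gate_tripped = any(score == 0.0 for score in p1_survival_scores)
    if gate_tripped:
        return min(raw_score, SAFETY_GATE_CAP)
    return raw_score
```

With this shape, rewards accumulated from P2/P3 incidents cannot lift the final score above the cap once a single P1 incident has been lost.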
### 3. Coverage Score Farming via Staging

**Vector:** The agent repeatedly stages all units in one district to maximize that district's coverage score while ignoring active incidents.

**Mitigation:** The coverage score is computed across ALL districts simultaneously, so concentrating units in one district reduces coverage everywhere else. Staged units also cannot respond to incidents without an explicit dispatch action, so incident survival clocks expire and trigger escalation penalties.
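
The source does not give the coverage formula. The sketch below assumes one simple possibility, the fraction of districts that contain at least one unit, purely to show why stacking units in a single district scores poorly. `coverage_score` and the district names are hypothetical.

```python
from typing import Sequence


def coverage_score(unit_districts: Sequence[str], all_districts: Sequence[str]) -> float:
    """Fraction of districts holding at least one unit (assumed formula)."""
    if not all_districts:
        raise ValueError("all_districts must not be empty")
    covered = set(unit_districts) & set(all_districts)
    return len(covered) / len(set(all_districts))


districts = ["north", "south", "east", "west"]
stacked = coverage_score(["north", "north", "north", "north"], districts)
spread = coverage_score(["north", "south", "east", "west"], districts)
```

Under this assumed formula, four units stacked in one of four districts cover a quarter of the map, while the same four units spread out cover all of it.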
### 4. Phraseology Score Inflation via Notes Stuffing

**Vector:** The agent fills the notes field with every possible dispatch phrase to maximize token overlap with the canonical phrases.

**Mitigation:** PhraseologyJudge scores token overlap normalized by notes length. Stuffing the notes with irrelevant text lowers precision and keeps the score low. Only notes that match the specific action type and incident type score highly.
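
A length-normalized overlap of this kind can be sketched as a precision measure. The code below is not PhraseologyJudge itself: `phraseology_score`, whitespace tokenization, clipped token counts, and the example phrases are assumptions chosen for illustration.

```python
from collections import Counter


def phraseology_score(notes: str, canonical_phrase: str) -> float:
    """Token-overlap precision: matched tokens divided by notes length.

    Counts are clipped per token, so repeating a canonical word does not
    earn extra credit (an assumption of this sketch).
    """
    note_tokens = notes.lower().split()
    if not note_tokens:
        return 0.0
    canonical_counts = Counter(canonical_phrase.lower().split())
    note_counts = Counter(note_tokens)
    # Multiset intersection keeps the smaller count for each shared token.
    matched = sum((note_counts & canonical_counts).values())
    return matched / len(note_tokens)


canonical = "medic 4 respond code 3"
exact = phraseology_score("medic 4 respond code 3", canonical)
stuffed = phraseology_score(
    "medic 4 respond code 3 engine 7 stage north cancel hold", canonical
)
```

Here the exact note scores 1.0, while the stuffed note matches the same five tokens but divides them by eleven, so adding unrelated phrases can only lower the score.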
### 5. Determinism Exploitation

**Vector:** The agent memorizes the exact incident sequence (seed=42) and hardcodes optimal actions rather than learning dispatch reasoning.

**Mitigation:** The fixed seed is intentional, for reproducibility. However, the wave spawn system applies small timing-dependent perturbations to incident locations, so a hardcoded action sequence fails once unit positions differ from the memorized run. The environment is designed for evaluation, not for training-time generalization.
## Conclusion

No reward-farming exploit was found that allows an agent to score >0.6 without genuinely resolving Priority-1 incidents with the correct unit types within their survival clock windows.