nihalaninihal Claude Opus 4.6 committed on
Commit
dc8bc66
·
1 Parent(s): 707377e

Refine build plan with devil's advocate corrections


- Switch to MCPEnvironment base class (auto MCP tool routing)
- Cut MCP-X gateway (stretch goal only)
- Use _step_impl() instead of step() for game logic
- Add Phase 0 pre-flight (H100 test + video script)
- Revised time allocation: Phase 1 expanded to 3.5h, Phase 3 compressed to 0.5h
- Hard SFT fallback at 1.5h into training phase
- Insurance HF Spaces deploy at Checkpoint 2
- Document Action extra='forbid' gotcha and reserved tool names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

plan/phase-2-environment-core.md CHANGED
@@ -20,23 +20,28 @@
20
 
21
  ## Step-by-Step Build Instructions
22
 
23
- ### Step 1: environment.py -- Core Class (60 min)
24
 
25
- This is the most critical file. Follow the OpenEnv patterns exactly.
26
 
27
- **OpenEnv API Contract (from installed code):**
28
- - `Environment` is `ABC, Generic[ActT, ObsT, StateT]`
29
- - `reset(self, seed=None, episode_id=None, **kwargs) -> ObsT`
30
- - `step(self, action: ActT, timeout_s=None, **kwargs) -> ObsT`
31
- - `state` is a `@property` returning `StateT`
32
  - `SUPPORTS_CONCURRENT_SESSIONS: bool = True` (class attribute)
 
 
 
33
 
34
  ```python
 
35
  import random
36
  from uuid import uuid4
37
  from typing import Any, Dict, List, Optional
38
 
39
- from openenv.core.env_server.interfaces import Environment
 
40
  from openenv.core.env_server.types import State
41
 
42
  from .models import (
@@ -53,7 +58,7 @@ from .rewards import compute_attacker_reward, compute_worker_reward, compute_ove
53
  from .task_generator import generate_tasks, generate_customers, generate_invoices, generate_tickets
54
 
55
 
56
- class SentinelOpsArena(Environment[SentinelAction, SentinelObservation, SentinelState]):
57
  SUPPORTS_CONCURRENT_SESSIONS = True
58
 
59
  NUM_CUSTOMERS = 15
@@ -63,7 +68,132 @@ class SentinelOpsArena(Environment[SentinelAction, SentinelObservation, Sentinel
63
  MAX_TICKS = 30
64
 
65
  def __init__(self):
66
- super().__init__()
67
  self._state = SentinelState(episode_id=str(uuid4()), step_count=0)
68
  self.crm = CRMSystem()
69
  self.billing = BillingSystem()
@@ -116,7 +246,10 @@ class SentinelOpsArena(Environment[SentinelAction, SentinelObservation, Sentinel
116
 
117
  return self._make_observation(AgentRole.ATTACKER, reward=0.0, done=False)
118
 
119
- def step(self, action: SentinelAction, timeout_s=None, **kwargs) -> SentinelObservation:
 
 
 
120
  expected_agent = self.turn_order[self.current_agent_idx]
121
 
122
  # Validate agent turn
@@ -536,13 +669,35 @@ Scores: {...}
536
  CHECKPOINT 1 PASSED
537
  ```
538
 
539
  ### Also verify the HTTP server works:
540
  ```bash
541
- cd sentinelops_arena
542
  python -c "
543
  from openenv.core.env_server.http_server import create_app
544
- from models import SentinelAction, SentinelObservation
545
- from environment import SentinelOpsArena
546
  app = create_app(SentinelOpsArena, SentinelAction, SentinelObservation, env_name='sentinelops_arena')
547
  print('create_app() OK')
548
  "
@@ -554,8 +709,10 @@ print('create_app() OK')
554
 
555
  | Issue | Cause | Fix |
556
  |-------|-------|-----|
557
- | `TypeError: Environment.__init__() takes 1 positional argument` | Forgot `super().__init__()` | Call `super().__init__()` in `__init__` |
 
558
  | `state is not a property` | Defined `def state()` instead of `@property def state` | Use `@property` decorator |
 
559
  | Turn order not advancing | `current_agent_idx` not updating | Check modulo arithmetic: `(idx + 1) % 3` |
560
  | Tick not incrementing | Forgot tick advance on full rotation | `if current_agent_idx == 0: tick += 1` |
561
  | Episode never ends | `done` condition wrong | Check `self.tick >= self.MAX_TICKS` after advancing |
@@ -575,6 +732,9 @@ print('create_app() OK')
575
  - [ ] Rewards compute without errors (all 3 reward functions)
576
  - [ ] Wrong-turn actions receive penalty
577
  - [ ] `demo.py` runs a full episode without crashing
 
 
 
578
  - [ ] `create_app()` creates a valid ASGI app
579
 
580
  ---
 
20
 
21
  ## Step-by-Step Build Instructions
22
 
23
+ ### Step 1: environment.py -- Core Class with MCPEnvironment (75 min)
24
 
25
+ This is the most critical file. Use `MCPEnvironment` as the base class.
26
 
27
+ **MCPEnvironment API Contract (from installed code):**
28
+ - `MCPEnvironment` extends `Environment`, takes a `FastMCP` server in `__init__`
29
+ - `step()` auto-routes `ListToolsAction` -> `_handle_list_tools()` and `CallToolAction` -> `_handle_call_tool()`
30
+ - All other actions go to abstract `_step_impl(self, action, timeout_s=None, **kwargs) -> Observation`
31
+ - `reset()` and `state` are still abstract (inherited from `Environment`)
32
  - `SUPPORTS_CONCURRENT_SESSIONS: bool = True` (class attribute)
33
+ - **RESERVED TOOL NAMES:** `reset`, `step`, `state`, `close` CANNOT be used as MCP tool names
34
+
35
+ **Architecture:** MCP tools (enterprise system APIs) are defined as FastMCP tools inside `__init__`. MCPEnvironment auto-routes `CallToolAction` to these tools. Non-MCP actions (turn management, game logic) go through `_step_impl`.
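
The routing split described above can be sketched with the standard library alone. This is a minimal illustration, not the OpenEnv implementation: `ToyRoutingEnv`, `ToyArena`, `ListTools`, `CallTool`, and `GameAction` are hypothetical stand-ins for `MCPEnvironment`, `SentinelOpsArena`, `ListToolsAction`, `CallToolAction`, and `SentinelAction`, and the return shapes are assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Names the plan lists as reserved; tools must not reuse them.
RESERVED_TOOL_NAMES = {"reset", "step", "state", "close"}


@dataclass
class ListTools:
    """Hypothetical stand-in for ListToolsAction."""


@dataclass
class CallTool:
    """Hypothetical stand-in for CallToolAction."""
    tool_name: str
    arguments: Dict[str, Any] = field(default_factory=dict)


@dataclass
class GameAction:
    """Hypothetical stand-in for a non-MCP game action."""
    agent: str
    action_type: str


class ToyRoutingEnv:
    """Toy base class: step() routes tool actions, the rest go to _step_impl()."""

    def __init__(self, tools: Dict[str, Callable[..., Any]]):
        clash = RESERVED_TOOL_NAMES & set(tools)
        if clash:
            raise ValueError(f"reserved tool names: {sorted(clash)}")
        self._tools = dict(tools)

    def step(self, action: Any) -> Dict[str, Any]:
        if isinstance(action, ListTools):
            return {"tools": sorted(self._tools)}
        if isinstance(action, CallTool):
            return {"result": self._tools[action.tool_name](**action.arguments)}
        # Anything that is not a tool action is game logic.
        return self._step_impl(action)

    def _step_impl(self, action: Any) -> Dict[str, Any]:
        raise NotImplementedError


class ToyArena(ToyRoutingEnv):
    def __init__(self) -> None:
        self.customers = {"C000": {"tier": "gold"}}
        # The tool closes over self, mirroring tools defined inside __init__.
        super().__init__(
            {"lookup_customer": lambda customer_id: self.customers.get(customer_id)}
        )

    def _step_impl(self, action: GameAction) -> Dict[str, Any]:
        return {"handled_by": "_step_impl", "agent": action.agent}
```

Subclasses in this sketch override only `_step_impl()`, which is the reason the plan moves game logic out of `step()`: overriding `step()` directly would bypass the tool routing.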
36
 
37
  ```python
38
+ import json
39
  import random
40
  from uuid import uuid4
41
  from typing import Any, Dict, List, Optional
42
 
43
+ from fastmcp import FastMCP
44
+ from openenv.core.env_server.mcp_environment import MCPEnvironment
45
  from openenv.core.env_server.types import State
46
 
47
  from .models import (
 
58
  from .task_generator import generate_tasks, generate_customers, generate_invoices, generate_tickets
59
 
60
 
61
+ class SentinelOpsArena(MCPEnvironment):
62
  SUPPORTS_CONCURRENT_SESSIONS = True
63
 
64
  NUM_CUSTOMERS = 15
 
68
  MAX_TICKS = 30
69
 
70
  def __init__(self):
71
+ # Create FastMCP server with enterprise system tools
72
+ mcp = FastMCP("sentinelops")
73
+
74
+ # --- Worker tools (enterprise system APIs) ---
75
+ @mcp.tool()
76
+ def lookup_customer(customer_id: str) -> str:
77
+ """Look up a customer record in the CRM system."""
78
+ return json.dumps(self.crm.lookup_customer(customer_id))
79
+
80
+ @mcp.tool()
81
+ def update_tier(customer_id: str, new_tier: str) -> str:
82
+ """Update a customer's tier level (gold/silver/bronze)."""
83
+ return json.dumps(self.crm.update_tier(customer_id, new_tier))
84
+
85
+ @mcp.tool()
86
+ def add_note(customer_id: str, note: str) -> str:
87
+ """Add a note to a customer's record."""
88
+ return json.dumps(self.crm.add_note(customer_id, note))
89
+
90
+ @mcp.tool()
91
+ def get_history(customer_id: str) -> str:
92
+ """Get interaction history for a customer."""
93
+ return json.dumps(self.crm.get_history(customer_id))
94
+
95
+ @mcp.tool()
96
+ def check_balance(customer_id: str) -> str:
97
+ """Check the billing balance for a customer."""
98
+ return json.dumps(self.billing.check_balance(customer_id))
99
+
100
+ @mcp.tool()
101
+ def issue_refund(invoice_id: str, amount: float, reason: str) -> str:
102
+ """Issue a refund for an invoice. Must comply with current refund policy."""
103
+ return json.dumps(self.billing.issue_refund(invoice_id, amount, reason))
104
+
105
+ @mcp.tool()
106
+ def apply_credit(customer_id: str, amount: float) -> str:
107
+ """Apply a credit to a customer's account."""
108
+ return json.dumps(self.billing.apply_credit(customer_id, amount))
109
+
110
+ @mcp.tool()
111
+ def generate_invoice(customer_id: str, items: str, amount: float) -> str:
112
+ """Generate a new invoice. Items should be comma-separated."""
113
+ item_list = [i.strip() for i in items.split(",")]
114
+ return json.dumps(self.billing.generate_invoice(customer_id, item_list, amount))
115
+
116
+ @mcp.tool()
117
+ def create_ticket(customer_id: str, subject: str, priority: str = "medium") -> str:
118
+ """Create a new support ticket."""
119
+ return json.dumps(self.ticketing.create_ticket(
120
+ customer_id, subject, TicketPriority(priority)))
121
+
122
+ @mcp.tool()
123
+ def assign_ticket(ticket_id: str, agent_name: str) -> str:
124
+ """Assign a ticket to an agent."""
125
+ return json.dumps(self.ticketing.assign_ticket(ticket_id, agent_name))
126
+
127
+ @mcp.tool()
128
+ def escalate_ticket(ticket_id: str, reason: str) -> str:
129
+ """Escalate a ticket to a senior agent."""
130
+ return json.dumps(self.ticketing.escalate(ticket_id, reason))
131
+
132
+ @mcp.tool()
133
+ def resolve_ticket(ticket_id: str, resolution: str) -> str:
134
+ """Resolve a ticket with the given resolution."""
135
+ return json.dumps(self.ticketing.resolve(ticket_id, resolution))
136
+
137
+ @mcp.tool()
138
+ def check_sla(ticket_id: str) -> str:
139
+ """Check SLA status for a ticket (ticks remaining before breach)."""
140
+ return json.dumps(self.ticketing.check_sla(ticket_id))
141
+
142
+ @mcp.tool()
143
+ def get_schema(system: str) -> str:
144
+ """Get current field schema for a system. Critical after schema drift."""
145
+ sys_obj = self._get_system(system)
146
+ if sys_obj is None:
147
+ return json.dumps({"error": f"Unknown system: {system}"})
148
+ return json.dumps(sys_obj.get_schema())
149
+
150
+ @mcp.tool()
151
+ def get_current_policy(policy_type: str = "refund") -> str:
152
+ """Get the current policy (refund or sla). Critical after policy drift."""
153
+ if policy_type == "refund":
154
+ return json.dumps(self.billing.get_current_policy())
155
+ elif policy_type == "sla":
156
+ return json.dumps(self.ticketing.get_sla_rules())
157
+ return json.dumps({"error": f"Unknown policy type: {policy_type}"})
158
+
159
+ @mcp.tool()
160
+ def launch_attack(attack_type: str, target_system: str,
161
+ parameters_json: str = "{}") -> str:
162
+ """Launch an attack on an enterprise system (attacker only).
163
+ Types: schema_drift, policy_drift, social_engineering, rate_limit."""
164
+ params = json.loads(parameters_json)
165
+ params["attack_type"] = attack_type
166
+ params["target_system"] = target_system
167
+ result = self.attack_manager.launch_attack(
168
+ AttackType(attack_type), TargetSystem(target_system), params, self.tick)
169
+ return json.dumps(result)
170
+
171
+ @mcp.tool()
172
+ def get_attack_budget() -> str:
173
+ """Get remaining attack budget for this episode."""
174
+ budget = self.attack_manager.attack_budget if self.attack_manager else 10.0
175
+ return json.dumps({"budget": budget})
176
+
177
+ @mcp.tool()
178
+ def flag_action(flagged: bool, severity: int = 3,
179
+ violation_type: str = "policy_violation",
180
+ explanation: str = "") -> str:
181
+ """Flag or approve a worker action (oversight only)."""
182
+ return json.dumps({
183
+ "flagged": flagged, "severity": severity,
184
+ "violation_type": violation_type, "explanation": explanation,
185
+ })
186
+
187
+ @mcp.tool()
188
+ def get_trajectory(num_recent: int = 5) -> str:
189
+ """Get recent action trajectory for oversight analysis."""
190
+ trajectory = self.trajectory[-num_recent:] if self.trajectory else []
191
+ return json.dumps(trajectory)
192
+
193
+ # Initialize MCPEnvironment with the FastMCP server
194
+ super().__init__(mcp)
195
+
196
+ # Initialize systems
197
  self._state = SentinelState(episode_id=str(uuid4()), step_count=0)
198
  self.crm = CRMSystem()
199
  self.billing = BillingSystem()
 
246
 
247
  return self._make_observation(AgentRole.ATTACKER, reward=0.0, done=False)
248
 
249
+ def _step_impl(self, action: SentinelAction, timeout_s=None, **kwargs) -> SentinelObservation:
250
+ """Handle non-MCP actions (game logic, turn management).
251
+ MCPEnvironment.step() auto-routes ListToolsAction/CallToolAction
252
+ to the FastMCP server. Everything else comes here."""
253
  expected_agent = self.turn_order[self.current_agent_idx]
254
 
255
  # Validate agent turn
 
669
  CHECKPOINT 1 PASSED
670
  ```
671
 
672
+ ### Also verify MCPEnvironment MCP routing works:
673
+ ```bash
674
+ python -c "
675
+ from openenv.core.env_server.mcp_types import ListToolsAction, CallToolAction
676
+ from sentinelops_arena.environment import SentinelOpsArena
677
+ env = SentinelOpsArena()
678
+ env.reset(seed=42)
679
+
680
+ # Test MCP tool discovery
681
+ obs = env.step(ListToolsAction())
682
+ tool_names = [t.name for t in obs.tools]
683
+ print(f'MCP tools available: {tool_names}')
684
+ assert 'lookup_customer' in tool_names
685
+ assert 'launch_attack' in tool_names
686
+ assert 'reset' not in tool_names # reserved
687
+
688
+ # Test MCP tool call
689
+ obs = env.step(CallToolAction(tool_name='lookup_customer', arguments={'customer_id': 'C000'}))
690
+ print(f'Tool result: {obs.result}')
691
+ print('MCPEnvironment MCP routing OK')
692
+ "
693
+ ```
694
+
695
  ### Also verify the HTTP server works:
696
  ```bash
 
697
  python -c "
698
  from openenv.core.env_server.http_server import create_app
699
+ from sentinelops_arena.models import SentinelAction, SentinelObservation
700
+ from sentinelops_arena.environment import SentinelOpsArena
701
  app = create_app(SentinelOpsArena, SentinelAction, SentinelObservation, env_name='sentinelops_arena')
702
  print('create_app() OK')
703
  "
 
709
 
710
  | Issue | Cause | Fix |
711
  |-------|-------|-----|
712
+ | `TypeError: MCPEnvironment.__init__() missing mcp_server` | Forgot to pass FastMCP to super() | Call `super().__init__(mcp)` with FastMCP instance |
713
+ | `ValueError: MCP tools cannot use reserved names` | Tool named `reset`, `step`, `state`, or `close` | Rename the tool (e.g., `env_reset`); better still, choose names that do not overlap with the reserved ones at all |
714
  | `state is not a property` | Defined `def state()` instead of `@property def state` | Use `@property` decorator |
715
+ | `_step_impl not defined` | Forgot to implement abstract method | MCPEnvironment requires `_step_impl()`, not `step()` |
716
  | Turn order not advancing | `current_agent_idx` not updating | Check modulo arithmetic: `(idx + 1) % 3` |
717
  | Tick not incrementing | Forgot tick advance on full rotation | `if current_agent_idx == 0: tick += 1` |
718
  | Episode never ends | `done` condition wrong | Check `self.tick >= self.MAX_TICKS` after advancing |
 
732
  - [ ] Rewards compute without errors (all 3 reward functions)
733
  - [ ] Wrong-turn actions receive penalty
734
  - [ ] `demo.py` runs a full episode without crashing
735
+ - [ ] `ListToolsAction` returns all MCP tools (via MCPEnvironment auto-routing)
736
+ - [ ] `CallToolAction` successfully calls enterprise system tools
737
+ - [ ] No reserved tool names used (`reset`, `step`, `state`, `close`)
738
  - [ ] `create_app()` creates a valid ASGI app
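
The turn-order, tick, and episode-end rows in the troubleshooting table above can be checked against a small standalone model. This is a sketch under assumptions: `TurnState` and `advance_turn` are hypothetical names, and the attacker, worker, oversight order is inferred from the plan (reset returns the attacker's observation, and the attacker's step is followed by the worker). `MAX_TICKS = 30` is taken from the class attributes in the diff.

```python
from dataclasses import dataclass

TURN_ORDER = ("attacker", "worker", "oversight")  # assumed rotation order
MAX_TICKS = 30  # matches the MAX_TICKS class attribute in the plan


@dataclass
class TurnState:
    current_agent_idx: int = 0
    tick: int = 0
    done: bool = False


def advance_turn(state: TurnState) -> TurnState:
    """Move to the next agent; bump the tick after a full rotation."""
    state.current_agent_idx = (state.current_agent_idx + 1) % len(TURN_ORDER)
    if state.current_agent_idx == 0:
        # Wrapped back to the first agent, so one full rotation has finished.
        state.tick += 1
    # Evaluate the end condition only after the tick has been advanced.
    state.done = state.tick >= MAX_TICKS
    return state
```

With three agents and 30 ticks, an episode in this model ends on the 90th call to `advance_turn`; checking `done` before the tick is advanced would let the episode run one rotation too long.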
739
 
740
  ---
plan/phase-3-mcp-and-server.md CHANGED
@@ -1,8 +1,10 @@
1
- # Phase 3: MCP Tools + OpenEnv HTTP Server + MCP-X Gateway
2
 
3
- **Time:** 1.5 hours (Hours 4-5.5)
4
- **Priority:** HIGH -- unlocks demo and satisfies Pipeline judging criterion (10%)
5
- **Depends on:** Phase 2 (working environment)
 
 
6
 
7
  ---
8
 
@@ -10,32 +12,32 @@
10
 
11
  | File | Purpose | Est. Time |
12
  |------|---------|-----------|
13
- | `sentinelops_arena/mcp_tools.py` | FastMCP tool definitions wrapping env operations | 30 min |
14
- | `sentinelops_arena/server.py` | `create_app()` HTTP server entry point | 15 min |
15
- | `mcp-x/config.toml` | MCP-X per-agent access control config | 10 min |
16
- | `mcp-x/mcp_x.py` | Copy from envbeats, no modifications needed | 5 min |
17
- | `run_server.py` | Script to start both env server + MCP-X | 10 min |
18
- | `tests/test_mcp.py` | MCP tool integration tests | 20 min |
19
 
20
  ---
21
 
22
  ## Step-by-Step Build Instructions
23
 
24
- ### Step 1: server.py -- OpenEnv HTTP Server (15 min)
25
 
26
- Follow the hackathon_env template exactly.
27
 
28
  ```python
29
  # sentinelops_arena/server.py
30
  """
31
- FastAPI application for SentinelOps Arena.
32
 
33
  Endpoints:
34
  POST /reset -- Reset environment
35
- POST /step -- Execute an action
36
  GET /state -- Get current state
37
  GET /schema -- Get action/observation schemas
38
- WS /ws -- WebSocket for persistent sessions
 
 
 
39
 
40
  Usage:
41
  uvicorn sentinelops_arena.server:app --host 0.0.0.0 --port 8000
@@ -65,394 +67,120 @@ if __name__ == "__main__":
65
  main(port=args.port)
66
  ```
67
 
68
- ### Step 2: mcp_tools.py -- FastMCP Tool Definitions (30 min)
69
-
70
- Expose enterprise system APIs as individual MCP tools. This is what LLM agents actually call.
71
-
72
- ```python
73
- # sentinelops_arena/mcp_tools.py
74
- """
75
- MCP tool definitions for SentinelOps Arena.
76
-
77
- Exposes enterprise system APIs as MCP tools via FastMCP.
78
- Tools are grouped by agent role (attacker/worker/oversight).
79
- """
80
- import json
81
- from fastmcp import FastMCP
82
-
83
- from .environment import SentinelOpsArena
84
- from .models import (
85
- SentinelAction, AgentRole, AttackType, TargetSystem,
86
- TicketPriority,
87
- )
88
-
89
- mcp = FastMCP("sentinelops", host="0.0.0.0", port=9500, stateless_http=True)
90
-
91
- # Global environment instance (shared across MCP calls)
92
- env = SentinelOpsArena()
93
-
94
-
95
- # ============ Environment Control Tools ============
96
-
97
- @mcp.tool()
98
- def reset(seed: int = 42) -> str:
99
- """Reset the SentinelOps environment for a new episode."""
100
- obs = env.reset(seed=seed)
101
- return obs.model_dump_json()
102
-
103
-
104
- @mcp.tool()
105
- def step(action_json: str) -> str:
106
- """Take a step in the SentinelOps environment with a full action."""
107
- action = SentinelAction.model_validate_json(action_json)
108
- obs = env.step(action)
109
- return obs.model_dump_json()
110
-
111
-
112
- @mcp.tool()
113
- def get_state() -> str:
114
- """Get the current environment state (tick, scores, active attacks)."""
115
- return env.state.model_dump_json()
116
-
117
-
118
- # ============ Worker Tools (Enterprise System APIs) ============
119
-
120
- @mcp.tool()
121
- def lookup_customer(customer_id: str) -> str:
122
- """Look up a customer record in the CRM system."""
123
- result = env.crm.lookup_customer(customer_id)
124
- return json.dumps(result)
125
-
126
-
127
- @mcp.tool()
128
- def update_tier(customer_id: str, new_tier: str) -> str:
129
- """Update a customer's tier level (gold/silver/bronze)."""
130
- result = env.crm.update_tier(customer_id, new_tier)
131
- return json.dumps(result)
132
-
133
-
134
- @mcp.tool()
135
- def add_note(customer_id: str, note: str) -> str:
136
- """Add a note to a customer's record."""
137
- result = env.crm.add_note(customer_id, note)
138
- return json.dumps(result)
139
-
140
-
141
- @mcp.tool()
142
- def get_history(customer_id: str) -> str:
143
- """Get interaction history for a customer."""
144
- result = env.crm.get_history(customer_id)
145
- return json.dumps(result)
146
-
147
-
148
- @mcp.tool()
149
- def check_balance(customer_id: str) -> str:
150
- """Check the billing balance for a customer."""
151
- result = env.billing.check_balance(customer_id)
152
- return json.dumps(result)
153
-
154
-
155
- @mcp.tool()
156
- def issue_refund(invoice_id: str, amount: float, reason: str) -> str:
157
- """Issue a refund for an invoice. Must comply with current refund policy."""
158
- result = env.billing.issue_refund(invoice_id, amount, reason)
159
- return json.dumps(result)
160
-
161
-
162
- @mcp.tool()
163
- def apply_credit(customer_id: str, amount: float) -> str:
164
- """Apply a credit to a customer's account."""
165
- result = env.billing.apply_credit(customer_id, amount)
166
- return json.dumps(result)
167
-
168
-
169
- @mcp.tool()
170
- def generate_invoice(customer_id: str, items: str, amount: float) -> str:
171
- """Generate a new invoice for a customer. Items should be comma-separated."""
172
- item_list = [i.strip() for i in items.split(",")]
173
- result = env.billing.generate_invoice(customer_id, item_list, amount)
174
- return json.dumps(result)
175
-
176
-
177
- @mcp.tool()
178
- def create_ticket(customer_id: str, subject: str, priority: str = "medium") -> str:
179
- """Create a new support ticket."""
180
- result = env.ticketing.create_ticket(customer_id, subject, TicketPriority(priority))
181
- return json.dumps(result)
182
-
183
-
184
- @mcp.tool()
185
- def assign_ticket(ticket_id: str, agent_name: str) -> str:
186
- """Assign a ticket to an agent."""
187
- result = env.ticketing.assign_ticket(ticket_id, agent_name)
188
- return json.dumps(result)
189
-
190
-
191
- @mcp.tool()
192
- def escalate_ticket(ticket_id: str, reason: str) -> str:
193
- """Escalate a ticket to a senior agent."""
194
- result = env.ticketing.escalate(ticket_id, reason)
195
- return json.dumps(result)
196
-
197
-
198
- @mcp.tool()
199
- def resolve_ticket(ticket_id: str, resolution: str) -> str:
200
- """Resolve a ticket with the given resolution."""
201
- result = env.ticketing.resolve(ticket_id, resolution)
202
- return json.dumps(result)
203
-
204
-
205
- @mcp.tool()
206
- def check_sla(ticket_id: str) -> str:
207
- """Check SLA status for a ticket (ticks remaining before breach)."""
208
- result = env.ticketing.check_sla(ticket_id)
209
- return json.dumps(result)
210
-
211
-
212
- @mcp.tool()
213
- def get_schema(system: str) -> str:
214
- """Get the current field schema for a system (crm/billing/ticketing).
215
- Critical after schema drift attacks -- fields may have been renamed."""
216
- sys_obj = env._get_system(system)
217
- if sys_obj is None:
218
- return json.dumps({"error": f"Unknown system: {system}"})
219
- return json.dumps(sys_obj.get_schema())
220
-
221
 
222
- @mcp.tool()
223
- def get_current_policy(policy_type: str = "refund") -> str:
224
- """Get the current policy (refund or sla).
225
- Critical after policy drift attacks -- rules may have changed."""
226
- if policy_type == "refund":
227
- return json.dumps(env.billing.get_current_policy())
228
- elif policy_type == "sla":
229
- return json.dumps(env.ticketing.get_sla_rules())
230
- return json.dumps({"error": f"Unknown policy type: {policy_type}"})
231
-
232
-
233
- # ============ Attacker Tools ============
234
-
235
- @mcp.tool()
236
- def launch_attack(attack_type: str, target_system: str, parameters_json: str = "{}") -> str:
237
- """Launch an attack on an enterprise system.
238
- Types: schema_drift, policy_drift, social_engineering, rate_limit.
239
- Costs 0.3 reward points per attack."""
240
- import json as _json
241
- params = _json.loads(parameters_json)
242
- params["attack_type"] = attack_type
243
- params["target_system"] = target_system
244
- result = env.attack_manager.launch_attack(
245
- AttackType(attack_type), TargetSystem(target_system), params, env.tick
246
- )
247
- return json.dumps(result)
248
-
249
-
250
- @mcp.tool()
251
- def pass_turn() -> str:
252
- """Pass the attacker's turn without launching an attack."""
253
- return json.dumps({"status": "passed"})
254
-
255
-
256
- @mcp.tool()
257
- def get_attack_budget() -> str:
258
- """Get the remaining attack budget for this episode."""
259
- budget = env.attack_manager.attack_budget if env.attack_manager else 10.0
260
- return json.dumps({"budget": budget})
261
-
262
-
263
- # ============ Oversight Tools ============
264
-
265
- @mcp.tool()
266
- def flag_action(flagged: bool, severity: int = 3,
267
- violation_type: str = "policy_violation",
268
- explanation: str = "") -> str:
269
- """Flag or approve a worker action. Used by the oversight agent."""
270
- return json.dumps({
271
- "flagged": flagged,
272
- "severity": severity,
273
- "violation_type": violation_type,
274
- "explanation": explanation,
275
- })
276
-
277
-
278
- @mcp.tool()
279
- def get_trajectory(num_recent: int = 5) -> str:
280
- """Get recent action trajectory for oversight analysis."""
281
- trajectory = env.trajectory[-num_recent:] if env.trajectory else []
282
- return json.dumps(trajectory)
283
- ```
284
-
285
- ### Step 3: MCP-X Gateway Config (10 min)
286
-
287
- ```toml
288
- # mcp-x/config.toml
289
- [clients]
290
- [clients.orchestrator]
291
- auth_token = "orch-token-001"
292
-
293
- [clients.attacker]
294
- auth_token = "atk-token-001"
295
-
296
- [clients.worker]
297
- auth_token = "wrk-token-001"
298
-
299
- [clients.oversight]
300
- auth_token = "ovs-token-001"
301
-
302
- [mcp_servers]
303
- [mcp_servers.sentinelops]
304
- url = "http://localhost:9500/mcp/"
305
- from_client = "orchestrator"
306
-
307
- [allow]
308
- [allow.sentinelops]
309
- attacker = ["launch_attack", "pass_turn", "get_attack_budget", "step", "reset", "get_state"]
310
- worker = ["lookup_customer", "update_tier", "add_note", "get_history", "check_balance", "issue_refund", "apply_credit", "generate_invoice", "create_ticket", "assign_ticket", "escalate_ticket", "resolve_ticket", "check_sla", "get_schema", "get_current_policy", "step", "reset", "get_state"]
311
- oversight = ["flag_action", "get_current_policy", "get_trajectory", "step", "reset", "get_state"]
312
- ```
313
-
314
- ### Step 4: Copy MCP-X (5 min)
315
-
316
- Copy `envbeats/mcp-x/mcp_x.py` to `mcp-x/mcp_x.py`. No modifications needed -- it reads from `config.toml` in its working directory.
317
-
318
- ```bash
319
- cp envbeats/mcp-x/mcp_x.py mcp-x/mcp_x.py
320
- ```
321
-
322
- ### Step 5: run_server.py -- Start Script (10 min)
323
-
324
- ```python
325
- # run_server.py
326
- """Start both the OpenEnv HTTP server and MCP server."""
327
- import subprocess
328
- import sys
329
- import time
330
-
331
- def main():
332
- # Start OpenEnv HTTP server on port 8000
333
- env_proc = subprocess.Popen([
334
- sys.executable, "-m", "uvicorn",
335
- "sentinelops_arena.server:app",
336
- "--host", "0.0.0.0", "--port", "8000",
337
- ])
338
-
339
- # Start FastMCP server on port 9500
340
- mcp_proc = subprocess.Popen([
341
- sys.executable, "-c",
342
- "from sentinelops_arena.mcp_tools import mcp; mcp.run()"
343
- ])
344
-
345
- # Start MCP-X gateway on port 9000
346
- mcpx_proc = subprocess.Popen([
347
- sys.executable, "mcp-x/mcp_x.py", "--port", "9000"
348
- ])
349
-
350
- print("Servers started:")
351
- print(" OpenEnv HTTP: http://localhost:8000")
352
- print(" MCP (FastMCP): http://localhost:9500")
353
- print(" MCP-X Gateway: http://localhost:9000")
354
-
355
- try:
356
- env_proc.wait()
357
- except KeyboardInterrupt:
358
- env_proc.terminate()
359
- mcp_proc.terminate()
360
- mcpx_proc.terminate()
361
-
362
- if __name__ == "__main__":
363
- main()
364
- ```
365
-
366
- ---
367
-
368
- ## VERIFY
369
-
370
- ### Test 1: OpenEnv HTTP Server
371
  ```bash
372
  # Start server
373
  uvicorn sentinelops_arena.server:app --port 8000 &
374
 
375
  # Test reset
376
  curl -X POST http://localhost:8000/reset -H "Content-Type: application/json" -d '{}'
377
- # Should return: {"observation": {...}, "reward": null, "done": false}
378
 
379
- # Test step
380
  curl -X POST http://localhost:8000/step -H "Content-Type: application/json" \
381
  -d '{"action": {"agent": "attacker", "action_type": "pass"}}'
382
- # Should return observation for worker
383
 
384
  # Test state
385
  curl http://localhost:8000/state
386
- # Should return: {"episode_id": "...", "step_count": 1, "tick": 0, ...}
387
 
388
  # Test schema
389
  curl http://localhost:8000/schema
390
- # Should return action/observation/state JSON schemas
391
 
392
  kill %1
393
  ```
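
The `/step` curl call above can also be prepared from Python using only the standard library. This is a minimal sketch: `build_step_request` is a hypothetical helper, and the `{"action": {...}}` body shape is copied from the curl example rather than from the OpenEnv source.

```python
import json
import urllib.request


def build_step_request(base_url: str, action: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /step request wrapping the action."""
    body = json.dumps({"action": action}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same payload as the curl example: the attacker passes its turn.
request = build_step_request(
    "http://localhost:8000", {"agent": "attacker", "action_type": "pass"}
)
```

Once the server is running, passing `request` to `urllib.request.urlopen` sends it; building the request separately keeps the payload easy to inspect when a step is rejected.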
394
 
395
- ### Test 2: MCP Tools (FastMCP)
 
396
  ```python
397
- # Start MCP server first, then:
398
- from mcp.client.streamable_http import streamablehttp_client
399
- from mcp.client.session import ClientSession
400
  import asyncio
401
 
402
- async def test_mcp():
403
- async with streamablehttp_client(url="http://localhost:9500/mcp/") as (read, write, _):
404
- async with ClientSession(read, write) as session:
405
- await session.initialize()
 
 
 
 
 
406
 
407
- # List tools
408
- tools = await session.list_tools()
409
- tool_names = [t.name for t in tools.tools]
410
- print(f"Available tools: {tool_names}")
411
- assert "reset" in tool_names
412
- assert "step" in tool_names
413
- assert "lookup_customer" in tool_names
414
 
415
- # Call reset
416
- result = await session.call_tool("reset", {"seed": 42})
417
- print(f"Reset result: {result.content[0].text[:100]}")
 
 
 
 
418
 
419
- # Call get_state
420
- result = await session.call_tool("get_state", {})
421
- print(f"State: {result.content[0].text[:100]}")
422
 
423
- asyncio.run(test_mcp())
 
 
 
 
 
 
424
  ```
425
 
426
- ### Test 3: MCP-X Gateway (Per-Agent Isolation)
427
- ```python
428
- import asyncio
429
- from mcp.client.streamable_http import streamablehttp_client
430
- from mcp.client.session import ClientSession
431
-
432
- async def test_mcpx():
433
- # Worker should see worker tools
434
- headers = {"Authorization": "Bearer wrk-token-001"}
435
- async with streamablehttp_client(url="http://localhost:9000/mcp/", headers=headers) as (r, w, _):
436
- async with ClientSession(r, w) as session:
437
- await session.initialize()
438
- tools = await session.list_tools()
439
- names = [t.name for t in tools.tools]
440
- print(f"Worker tools: {names}")
441
- assert "lookup_customer" in names
442
- assert "launch_attack" not in names # worker cannot attack
443
-
444
- # Attacker should see attacker tools
445
- headers = {"Authorization": "Bearer atk-token-001"}
446
- async with streamablehttp_client(url="http://localhost:9000/mcp/", headers=headers) as (r, w, _):
447
- async with ClientSession(r, w) as session:
448
- await session.initialize()
449
- tools = await session.list_tools()
450
- names = [t.name for t in tools.tools]
451
- print(f"Attacker tools: {names}")
452
- assert "launch_attack" in names
453
- assert "lookup_customer" not in names # attacker cannot use CRM
454
-
455
- asyncio.run(test_mcpx())
456
  ```
457
 
458
  ---
@@ -461,38 +189,30 @@ asyncio.run(test_mcpx())
461
 
462
  | Issue | Cause | Fix |
463
  |-------|-------|-----|
464
- | `Port 8000/9500/9000 already in use` | Previous server still running | `kill $(lsof -t -i:PORT)` |
465
- | `ConnectionRefused on MCP-X` | MCP server not started before MCP-X | Start env server + MCP server before MCP-X |
466
- | FastMCP `stateless_http=True` not working | Wrong FastMCP version | Check `pip show fastmcp` -- need recent version |
467
- | MCP-X `ProxyClient` error | Dummy server hack missing | Ensure `_dummy_0` and `_dummy_1` servers in config |
468
- | `streamablehttp_client` connection error | Async context manager issue | Must use `async with` pattern |
469
- | `Bearer token` rejected | Token mismatch with config.toml | Verify token strings match exactly |
470
- | MCP tool returns empty | Environment not reset | Call `reset` before other tools |
471
- | `model_dump_json()` fails on complex types | Pydantic serialization issue | Use `json.dumps()` for dict results, `model_dump_json()` for Pydantic models |
472
 
473
  ---
474
 
475
  ## EXIT CRITERIA
476
 
477
  - [ ] `uvicorn sentinelops_arena.server:app` starts without errors
478
- - [ ] HTTP `/reset`, `/step`, `/state`, `/schema` all return valid JSON
479
- - [ ] FastMCP server starts on port 9500
480
- - [ ] All MCP tools are discoverable via `list_tools`
481
- - [ ] `reset`, `step`, `get_state` MCP tools work
482
- - [ ] `lookup_customer`, `issue_refund`, etc. return valid data
483
- - [ ] MCP-X gateway starts on port 9000
484
- - [ ] Worker token sees only worker tools
485
- - [ ] Attacker token sees only attacker tools
486
- - [ ] Oversight token sees only oversight tools
487
- - [ ] Cross-role tool access denied (worker can't call launch_attack)
488
 
489
  ---
490
 
491
  ## ROLLBACK PLAN
492
 
493
- If Phase 3 takes longer than 1.5 hours:
494
- 1. **Cut MCP-X gateway** -- submit with direct MCP only (no per-agent isolation). Add MCP-X in Phase 6 polish.
495
- 2. **Reduce MCP tools** -- only expose `reset`, `step`, `get_state` (no individual system tools). Agents call `step()` with full actions.
496
- 3. **Cut MCP entirely** -- use only HTTP server. Agents call REST endpoints directly.
497
 
498
  Do NOT cut: `server.py` with `create_app()`. This is required for HF Spaces deployment.
 
1
+ # Phase 3: MCP + OpenEnv HTTP Server
2
 
3
+ **Time:** 0.5 hours (Hours 6-6.5)
4
+ **Priority:** MEDIUM -- MCPEnvironment did most of the work in Phase 2
5
+ **Depends on:** Phase 2 (working environment with MCP tools)
6
+
7
+ **KEY CHANGE:** MCPEnvironment handles MCP tool routing automatically. Phase 3 is now just creating the HTTP server entry point and verifying everything works end-to-end. MCP-X gateway is CUT.
8
 
9
  ---
10
 
 
12
 
13
  | File | Purpose | Est. Time |
14
  |------|---------|-----------|
15
+ | `sentinelops_arena/server.py` | `create_app()` HTTP server entry point | 10 min |
16
+ | Verify MCP tools via HTTP | End-to-end test | 10 min |
17
+ | Verify WebSocket + MCP | Integration test | 10 min |
 
 
 
18
 
19
  ---
20
 
21
  ## Step-by-Step Build Instructions
22
 
23
+ ### Step 1: server.py -- OpenEnv HTTP Server (10 min)
24
 
25
+ This is trivial -- follow the hackathon_env template exactly.
26
 
27
  ```python
28
  # sentinelops_arena/server.py
29
  """
30
+ HTTP server for SentinelOps Arena.
31
 
32
  Endpoints:
33
  POST /reset -- Reset environment
34
+ POST /step -- Execute an action (including ListToolsAction, CallToolAction)
35
  GET /state -- Get current state
36
  GET /schema -- Get action/observation schemas
37
+ WS /ws -- WebSocket for persistent sessions (supports /mcp)
38
+
39
+ The MCPEnvironment base class handles MCP tool routing automatically.
40
+ Agents can discover tools via ListToolsAction and call them via CallToolAction.
41
 
42
  Usage:
43
  uvicorn sentinelops_arena.server:app --host 0.0.0.0 --port 8000
 
67
  main(port=args.port)
68
  ```
69
 
70
+ ### Step 2: Verify HTTP + MCP Integration (10 min)
71

72
  ```bash
  # Start server
  uvicorn sentinelops_arena.server:app --port 8000 &

  # Test reset
  curl -X POST http://localhost:8000/reset -H "Content-Type: application/json" -d '{}'

+ # Test step (regular action)
  curl -X POST http://localhost:8000/step -H "Content-Type: application/json" \
    -d '{"action": {"agent": "attacker", "action_type": "pass"}}'
+
+ # Test step (MCP list_tools -- auto-routed by MCPEnvironment)
+ curl -X POST http://localhost:8000/step -H "Content-Type: application/json" \
+   -d '{"action": {"type": "list_tools"}}'
+ # Should return available MCP tools
+
+ # Test step (MCP call_tool -- auto-routed by MCPEnvironment)
+ curl -X POST http://localhost:8000/step -H "Content-Type: application/json" \
+   -d '{"action": {"type": "call_tool", "tool_name": "lookup_customer", "arguments": {"customer_id": "C000"}}}'
+ # Should return customer data

  # Test state
  curl http://localhost:8000/state

  # Test schema
  curl http://localhost:8000/schema

  kill %1
  ```
101
 
102
+ ### Step 3: Verify WebSocket MCP Path (10 min)
103
+
104
  ```python
+ # Quick WebSocket test
  import asyncio
+ import json
+ import websockets
+
+ async def test_ws():
+     async with websockets.connect("ws://localhost:8000/ws") as ws:
+         # Reset
+         await ws.send(json.dumps({"type": "reset", "data": {"seed": 42}}))
+         resp = json.loads(await ws.recv())
+         print(f"Reset: {resp['type']}")
+
+         # MCP via WebSocket
+         await ws.send(json.dumps({
+             "type": "mcp",
+             "data": {"method": "tools/list", "params": {}, "id": 1}
+         }))
+         resp = json.loads(await ws.recv())
+         print(f"MCP tools via WS: {resp}")
+
+ asyncio.run(test_ws())
+ ```
127
+
128
+ ---
129
+
130
+ ## What MCPEnvironment Gives Us For Free
131
 
132
+ | Feature | How |
133
+ |---------|-----|
134
+ | MCP tool discovery | `ListToolsAction` -> returns all tools with schemas |
135
+ | MCP tool invocation | `CallToolAction(tool_name, arguments)` -> calls FastMCP tool |
136
+ | Reserved name validation | Rejects tools named `reset`, `step`, `state`, `close` |
137
+ | Timeout handling | Configurable timeout on tool calls |
138
+ | Error categorization | `ToolError` with types: execution_error, invalid_args, tool_not_found, timeout |
139
+ | WebSocket MCP path | `/ws` endpoint supports `type: "mcp"` messages |
140
+ | Async support | `_run_async_safely()` handles both sync and async contexts |
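
The curl checks in Step 2 send MCP actions through `POST /step`. The same requests can be sketched in Python, assuming the JSON shapes used in those curl commands are what the server accepts. The helper names `list_tools_payload`, `call_tool_payload`, and `post_step` are hypothetical and not part of OpenEnv.

```python
import json
import urllib.request


def list_tools_payload() -> dict:
    """Body for POST /step that requests tool discovery (shape copied from the curl check)."""
    return {"action": {"type": "list_tools"}}


def call_tool_payload(tool_name: str, arguments: dict) -> dict:
    """Body for POST /step that invokes one tool (shape copied from the curl check)."""
    return {
        "action": {
            "type": "call_tool",
            "tool_name": tool_name,
            "arguments": arguments,
        }
    }


def post_step(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """Send a payload to /step and decode the JSON reply. Needs a running server."""
    request = urllib.request.Request(
        f"{base_url}/step",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the request body only; call post_step(...) once the server is up.
    print(json.dumps(call_tool_payload("lookup_customer", {"customer_id": "C000"})))
```

Keeping the payload builders separate from the network call means the request shapes can be checked without starting uvicorn.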
141
 
142
+ ## What We DON'T Need (CUT)
143
 
144
+ | Removed | Reason |
145
+ |---------|--------|
146
+ | `mcp_tools.py` | MCP tools defined inside `environment.py` via FastMCP |
147
+ | `mcp-x/` directory | MCP-X gateway CUT -- MCPEnvironment handles tool exposure |
148
+ | `config.toml` | No MCP-X = no per-agent access control config |
149
+ | `run_server.py` | Single server is enough |
150
+ | Per-agent JWT tokens | Nice-to-have, not needed for demo/judging |
151
 
152
+ ---
 
 
153
 
154
+ ## VERIFY
155
+
156
+ ### Test 1: HTTP Server starts
157
+ ```bash
158
+ uvicorn sentinelops_arena.server:app --port 8000
159
+ # Should start without errors
160
+ # Should show "Uvicorn running on http://127.0.0.1:8000" (uvicorn's default host; add --host 0.0.0.0 to bind all interfaces)
161
  ```
162
 
163
+ ### Test 2: All endpoints return valid JSON
164
+ ```bash
165
+ # Reset -> Observation JSON
166
+ # Step -> Observation JSON
167
+ # State -> State JSON
168
+ # Schema -> Action/Observation/State schemas
169
+ ```
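
Test 2 can be automated with a short script. The sketch below is one possible approach, not part of the plan's codebase: it calls each endpoint once and records whether the body parses as JSON. `CHECKS`, `http_fetch`, and `run_smoke_test` are hypothetical names, and the `/step` body reuses the `pass` action from Step 2.

```python
import json
import urllib.request
from typing import Callable, Dict, Optional, Tuple

# (method, path, JSON body or None) for the four endpoints named in Test 2.
CHECKS: Tuple[Tuple[str, str, Optional[dict]], ...] = (
    ("POST", "/reset", {}),
    ("POST", "/step", {"action": {"agent": "attacker", "action_type": "pass"}}),
    ("GET", "/state", None),
    ("GET", "/schema", None),
)


def http_fetch(method: str, path: str, body: Optional[dict],
               base_url: str = "http://localhost:8000") -> str:
    """Perform one request against a running server and return the raw response text."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def run_smoke_test(
    fetch: Callable[[str, str, Optional[dict]], str] = http_fetch,
) -> Dict[str, bool]:
    """Return {path: True/False} depending on whether each response parses as JSON."""
    results: Dict[str, bool] = {}
    for method, path, body in CHECKS:
        try:
            json.loads(fetch(method, path, body))
            results[path] = True
        except (ValueError, OSError):
            # ValueError: body is not valid JSON; OSError: connection or HTTP failure.
            results[path] = False
    return results
```

With the server running, `run_smoke_test()` returns one boolean per endpoint. Passing a fake `fetch` lets the checker itself be exercised offline.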
170
+
171
+ ### Test 3: MCP tools discoverable via HTTP
172
+ ```bash
173
+ # POST /step with ListToolsAction -> list of tools
174
+ # Verify: lookup_customer, issue_refund, get_schema, launch_attack etc. all present
175
+ # Verify: no reserved names (reset, step, state, close)
176
+ ```
177
+
178
+ ### Test 4: MCP tools callable via HTTP
179
+ ```bash
180
+ # POST /step with CallToolAction -> tool result
181
+ # Call lookup_customer("C000") -> customer data
182
+ # Call get_schema("crm") -> field list
183
+ # Call get_current_policy("refund") -> policy values
184
  ```
185
 
186
  ---
 
189
 
190
  | Issue | Cause | Fix |
191
  |-------|-------|-----|
192
+ | `Port 8000 already in use` | Previous server running | `kill $(lsof -t -i:8000)` |
193
+ | `create_app()` fails with type error | Wrong argument types | Pass class (not instance), Action class, Observation class |
194
+ | MCP tools not showing up | Tools defined after `super().__init__()` | Define tools BEFORE calling `super().__init__(mcp)` |
195
+ | `ValueError: reserved names` | Tool named `reset` or `step` | Rename the tool |
196
+ | WebSocket MCP not working | Wrong message format | Use `{"type": "mcp", "data": {"method": "tools/list", ...}}` |
197
+ | `ListToolsAction` not recognized | `create_app` doesn't know about MCP types | May need to pass both `SentinelAction` and MCP action types to create_app |
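
The reserved-name error can be caught before the server starts with a pre-flight check on the planned tool names. This is a minimal sketch using the four reserved names listed in this plan. `RESERVED_TOOL_NAMES` and `find_reserved_names` are hypothetical, and OpenEnv's own validation remains the authority.

```python
from typing import Iterable, List

# Reserved names as listed in this plan; the authoritative set lives in OpenEnv itself.
RESERVED_TOOL_NAMES = frozenset({"reset", "step", "state", "close"})


def find_reserved_names(tool_names: Iterable[str]) -> List[str]:
    """Return the tool names that collide with reserved names, sorted for stable output."""
    return sorted(set(tool_names) & RESERVED_TOOL_NAMES)


if __name__ == "__main__":
    planned = ["lookup_customer", "issue_refund", "get_schema", "launch_attack"]
    print(find_reserved_names(planned))
```

An empty result means none of the planned names collide. Any name it returns should be renamed before the tool is registered.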
 
 
198
 
199
  ---
200
 
201
  ## EXIT CRITERIA
202
 
203
  - [ ] `uvicorn sentinelops_arena.server:app` starts without errors
204
+ - [ ] HTTP `/reset`, `/step`, `/state`, `/schema` return valid JSON
205
+ - [ ] `ListToolsAction` via `/step` returns all enterprise system tools
206
+ - [ ] `CallToolAction` via `/step` successfully calls tools
207
+ - [ ] WebSocket `/ws` endpoint accepts connections
208
 
209
  ---
210
 
211
  ## ROLLBACK PLAN
212
 
213
+ Phase 3 is already minimal. If it takes longer than 30 minutes:
214
+ 1. **Skip WebSocket verification** -- HTTP-only is fine for demo
215
+ 2. **Skip schema endpoint check** -- not needed for judging
216
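+ 3. **If `create_app()` fails entirely (last resort)** -- serve the Gradio app directly without the OpenEnv HTTP layer. The environment still works via direct Python calls, but this gives up the `server.py` entry point that the note below marks as required, so try the fixes in the issue table above first.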
217
 
218
  Do NOT cut: `server.py` with `create_app()`. This is required for HF Spaces deployment.
plan/phase-4-demo-and-ui.md CHANGED
@@ -1,8 +1,10 @@
1
  # Phase 4: Demo Script + Gradio App + HF Spaces Deployment
2
 
3
- **Time:** 2 hours (Hours 5.5-7.5)
4
- **Priority:** HIGH -- Storytelling is 30% of judging
5
- **Depends on:** Phase 3 (MCP + server working)
 
 
6
 
7
  ---
8
 
 
1
  # Phase 4: Demo Script + Gradio App + HF Spaces Deployment
2
 
3
+ **Time:** 2 hours (Hours 6.5-8.5)
4
+ **Priority:** HIGH -- Storytelling is 30% of judging. Innovation (40%) + Storytelling (30%) = 70% non-code.
5
+ **Depends on:** Phase 3 (server working)
6
+
7
+ **IMPORTANT:** Deploy to HF Spaces at the END of this phase as INSURANCE SUBMISSION (Checkpoint 2). This is a good submission even if training fails later.
8
 
9
  ---
10
 
plan/phase-5-training.md CHANGED
@@ -1,9 +1,11 @@
1
  # Phase 5: Training Script -- Colab Notebook with GRPO
2
 
3
- **Time:** 2.5 hours (Hours 7.5-10)
4
  **Priority:** HIGH -- Training Script is 20% of judging and REQUIRED for submission
5
  **Depends on:** Phase 2 (working environment)
6
 
 
 
7
  ---
8
 
9
  ## Files to Create
 
1
  # Phase 5: Training Script -- Colab Notebook with GRPO
2
 
3
+ **Time:** 2 hours MAX (Hours 8.5-10.5)
4
  **Priority:** HIGH -- Training Script is 20% of judging and REQUIRED for submission
5
  **Depends on:** Phase 2 (working environment)
6
 
7
+ **HARD RULE:** If GRPO is not working after 1.5 hours (hour 10), FALL BACK TO SFT immediately. Training only needs to show "improvement" -- even a 0.1 reward increase counts. Do not spend more than 2h total on this phase.
8
+
9
  ---
10
 
11
  ## Files to Create
plan/phase-6-polish-and-submit.md CHANGED
@@ -1,7 +1,7 @@
1
  # Phase 6: Polish, Video, and Submit
2
 
3
- **Time:** 4 hours (Hours 10-14)
4
- **Priority:** CRITICAL -- this is when everything comes together
5
  **Depends on:** All previous phases
6
 
7
  ---
@@ -10,11 +10,10 @@
10
 
11
  | Task | Est. Time |
12
  |------|-----------|
13
- | Polish demo quality (before/after, visuals) | 1h (Hours 10-11) |
14
- | Stretch goals (if time) | 1h (Hours 11-12) |
15
- | Final deployment + verification | 1h (Hours 12-13) |
16
- | Video script + recording + upload | 45 min (Hours 13-13:45) |
17
- | Submission form | 15 min (Hours 13:45-14) |
18
 
19
  ---
20
 
@@ -34,11 +33,11 @@
34
  - Highlight "key moments" in the replay (attack launched, error recovered, social eng resisted)
35
  - Add score differential chart
36
 
37
- **Optional: MCP-X Demo Tab**
38
- If MCP-X is working:
39
- - Add a tab showing per-agent tool lists
40
- - Demonstrate tool isolation (worker can't call launch_attack)
41
- - Show JWT-based authentication in action
42
 
43
  ### Hour 11-12: Stretch Goals (Pick Based on Time)
44
 
@@ -89,58 +88,40 @@ uvicorn sentinelops_arena.server:app --port 8000 # HTTP API works
89
  curl http://localhost:8000/schema # Schema endpoint returns
90
  ```
91
 
92
- ### Hour 13-13:45: Demo Video
93
 
94
- **Video Script (aim for 1-3 minutes):**
95
 
96
- ```
97
- [SLIDE 1: Title - 5 seconds]
98
- "SentinelOps Arena: Multi-Agent Self-Play for Enterprise Security"
99
-
100
- [SCREEN: Gradio app - 15 seconds]
101
- "SentinelOps Arena is a multi-agent self-play training environment
102
- built on OpenEnv. Three AI agents -- Attacker, Worker, and
103
- Oversight -- interact with simulated enterprise systems."
104
-
105
- [SCREEN: Run Episode tab - 20 seconds]
106
- "Let me show you an episode. The attacker launches schema drift
107
- at tick 7 -- renaming customer_id to account_id. Watch what
108
- happens when the untrained worker hits this."
109
- [Click Run Episode with trained=False]
110
- "The worker crashes on the schema change. It doesn't know how
111
- to recover."
112
-
113
- [SCREEN: Comparison tab - 20 seconds]
114
- "Now let's see the trained worker handle the same attacks."
115
- [Click Run Comparison]
116
- "The trained worker detects the KeyError, calls get_schema to
117
- discover the new field name, and continues serving customers.
118
- Score improvement is clear."
119
-
120
- [SCREEN: Inspector tab - 10 seconds]
121
- "Under the hood, we have 15 customers, 15 invoices, 10 tickets,
122
- and 30 customer tasks per episode. Four attack types: schema
123
- drift, policy drift, social engineering, and rate limiting."
124
-
125
- [SCREEN: Colab notebook - 15 seconds]
126
- "Training uses GRPO with Unsloth and TRL. The environment
127
- provides reward signals directly to the training loop. Here
128
- you can see the reward improving over training steps."
129
- [Show training curves]
130
-
131
- [SLIDE 2: Partner Tracks - 10 seconds]
132
- "We target two partner tracks:
133
- Fleet AI -- our Oversight agent monitors and explains Worker behavior
134
- Patronus AI -- schema and policy drift are core attack types"
135
 
136
- [SLIDE 3: Architecture - 10 seconds]
137
- "Built on OpenEnv with MCP tools and an MCP-X gateway for
138
- per-agent tool isolation. Three agents, three systems,
139
- self-play training via GRPO."
140
-
141
- [END - 5 seconds]
142
- "SentinelOps Arena. Try it on HuggingFace Spaces."
143
  ```
144
 
145
  **Recording instructions:**
146
  1. Open Gradio app in browser
 
1
  # Phase 6: Polish, Video, and Submit
2
 
3
+ **Time:** 3.5 hours (Hours 10.5-14)
4
+ **Priority:** CRITICAL -- this is when everything comes together. Storytelling = 30% of judging.
5
  **Depends on:** All previous phases
6
 
7
  ---
 
10
 
11
  | Task | Est. Time |
12
  |------|-----------|
13
+ | Polish demo quality + stretch goals | 1h (Hours 10.5-11.5) |
14
+ | Record and upload video | 1.5h (Hours 11.5-13) |
15
+ | Final deployment + verification | 0.5h (Hours 13-13.5) |
16
+ | Submission form | 0.5h (Hours 13.5-14) |
 
17
 
18
  ---
19
 
 
33
  - Highlight "key moments" in the replay (attack launched, error recovered, social eng resisted)
34
  - Add score differential chart
35
 
36
+ **Optional: MCP Tool Discovery Tab**
37
+ If time permits:
38
+ - Add a Gradio tab showing MCP tool list (via ListToolsAction)
39
+ - Show tool schemas and descriptions
40
+ - Demonstrate CallToolAction calling enterprise system APIs
41
 
42
  ### Hour 11-12: Stretch Goals (Pick Based on Time)
43
 
 
88
  curl http://localhost:8000/schema # Schema endpoint returns
89
  ```
90
 
91
+ ### Hour 11.5-13: Demo Video
92
 
93
+ **PRIMARY Video Script (60 seconds -- tight and punchy):**
94
 
95
+ Write this script BEFORE starting the hackathon (Phase 0). It drives clarity on what to build and demo.
96

97
  ```
98
+ [0-10s: Problem statement]
99
+ "Enterprise AI agents break when schemas change, policies drift,
100
+ or they face social engineering. How do we train resilient agents?"
101
+
102
+ [10-20s: What SentinelOps Arena is]
103
+ "SentinelOps Arena: a multi-agent self-play environment on OpenEnv.
104
+ Three agents -- Attacker, Worker, and Oversight -- compete in
105
+ simulated enterprise systems."
106
+
107
+ [20-35s: SCREEN -- Demo showing attack -> error -> recovery cycle]
108
+ [Click Run Episode in Gradio]
109
+ "Watch: the attacker launches schema drift at tick 7. The untrained
110
+ worker crashes. But the trained worker detects the error, queries
111
+ get_schema, adapts, and continues serving customers."
112
+
113
+ [35-50s: SCREEN -- Training reward curve]
114
+ [Show Colab training curves]
115
+ "We train with GRPO using Unsloth and TRL. The reward signal
116
+ comes directly from the environment. Here you can see
117
+ improvement over training steps."
118
+
119
+ [50-60s: Partner tracks + close]
120
+ "Built for Fleet AI -- scalable oversight -- and Patronus AI --
121
+ schema drift. Try it on HuggingFace Spaces."
122
+ ```
123
+
124
+ **EXTENDED Video Script (if time permits, 2-3 minutes):**
125
 
126
  **Recording instructions:**
127
  1. Open Gradio app in browser