Phase 3: MCP + OpenEnv HTTP Server
Time: 0.5 hours (Hours 6-6.5)
Priority: MEDIUM -- MCPEnvironment did most of the work in Phase 2
Depends on: Phase 2 (working environment with MCP tools)
KEY CHANGE: MCPEnvironment handles MCP tool routing automatically. Phase 3 is now just creating the HTTP server entry point and verifying everything works end-to-end. MCP-X gateway is CUT.
Files to Create
| File | Purpose | Est. Time |
|---|---|---|
| sentinelops_arena/server.py | create_app() HTTP server entry point | 10 min |
| Verify MCP tools via HTTP | End-to-end test | 10 min |
| Verify WebSocket + MCP | Integration test | 10 min |
Step-by-Step Build Instructions
Step 1: server.py -- OpenEnv HTTP Server (10 min)
This is trivial -- follow the hackathon_env template exactly.
# sentinelops_arena/server.py
"""
HTTP server for SentinelOps Arena.

Endpoints:
    POST /reset  -- Reset environment
    POST /step   -- Execute an action (including ListToolsAction, CallToolAction)
    GET  /state  -- Get current state
    GET  /schema -- Get action/observation schemas
    WS   /ws     -- WebSocket for persistent sessions (supports /mcp)

The MCPEnvironment base class handles MCP tool routing automatically.
Agents can discover tools via ListToolsAction and call them via CallToolAction.

Usage:
    uvicorn sentinelops_arena.server:app --host 0.0.0.0 --port 8000
"""
from openenv.core.env_server.http_server import create_app

from .models import SentinelAction, SentinelObservation
from .environment import SentinelOpsArena

# Pass the environment class (not an instance) plus the Action and Observation classes.
app = create_app(
    SentinelOpsArena,
    SentinelAction,
    SentinelObservation,
    env_name="sentinelops_arena",
    max_concurrent_envs=5,
)


def main(host: str = "0.0.0.0", port: int = 8000):
    import uvicorn

    uvicorn.run(app, host=host, port=port)


# Because of the relative imports above, run this file as a module:
#     python -m sentinelops_arena.server --port 8000
if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int, default=8000)
    args = parser.parse_args()
    main(port=args.port)
Step 2: Verify HTTP + MCP Integration (10 min)
# Start server
uvicorn sentinelops_arena.server:app --port 8000 &
# Test reset
curl -X POST http://localhost:8000/reset -H "Content-Type: application/json" -d '{}'
# Test step (regular action)
curl -X POST http://localhost:8000/step -H "Content-Type: application/json" \
-d '{"action": {"agent": "attacker", "action_type": "pass"}}'
# Test step (MCP list_tools -- auto-routed by MCPEnvironment)
curl -X POST http://localhost:8000/step -H "Content-Type: application/json" \
-d '{"action": {"type": "list_tools"}}'
# Should return available MCP tools
# Test step (MCP call_tool -- auto-routed by MCPEnvironment)
curl -X POST http://localhost:8000/step -H "Content-Type: application/json" \
-d '{"action": {"type": "call_tool", "tool_name": "lookup_customer", "arguments": {"customer_id": "C000"}}}'
# Should return customer data
# Test state
curl http://localhost:8000/state
# Test schema
curl http://localhost:8000/schema
kill %1
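The curl checks above can also be scripted so they are easy to re-run after each change. The following is a minimal sketch using only the Python standard library. The helper names (`build_request`, `fetch_json`, `smoke_test`) are hypothetical and not part of OpenEnv, and the payloads simply mirror the curl commands. It assumes the server is already running on port 8000.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000"  # assumption: server started as in the curl example


def build_request(
    base_url: str, path: str, payload: Optional[Dict[str, Any]] = None
) -> urllib.request.Request:
    """Build a GET request when payload is None, otherwise a JSON POST."""
    url = base_url.rstrip("/") + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send the request; raises if the response body is not valid JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def smoke_test(base_url: str = BASE_URL) -> Dict[str, Any]:
    """Hit the endpoints in the same order as the curl commands."""
    list_tools = {"action": {"type": "list_tools"}}
    return {
        "reset": fetch_json(build_request(base_url, "/reset", {})),
        "list_tools": fetch_json(build_request(base_url, "/step", list_tools)),
        "state": fetch_json(build_request(base_url, "/state")),
        "schema": fetch_json(build_request(base_url, "/schema")),
    }
```

With the server up, `smoke_test()` returns the parsed responses keyed by endpoint. Any non-JSON body or HTTP error surfaces as an exception rather than a silent pass.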
Step 3: Verify WebSocket MCP Path (10 min)
# Quick WebSocket test
import asyncio
import json

import websockets


async def test_ws():
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        # Reset
        await ws.send(json.dumps({"type": "reset", "data": {"seed": 42}}))
        resp = json.loads(await ws.recv())
        print(f"Reset: {resp['type']}")

        # MCP via WebSocket
        await ws.send(json.dumps({
            "type": "mcp",
            "data": {"method": "tools/list", "params": {}, "id": 1}
        }))
        resp = json.loads(await ws.recv())
        print(f"MCP tools via WS: {resp}")


asyncio.run(test_ws())
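Because a wrong message format is the usual reason the WebSocket MCP path fails, it can help to build the envelopes in one place. This is a small sketch. `ws_message` and `mcp_request` are hypothetical helper names, and the envelope shape is copied from the test above rather than from the OpenEnv documentation.

```python
import json
from typing import Any, Dict, Optional


def ws_message(msg_type: str, data: Dict[str, Any]) -> str:
    """Serialize a {"type": ..., "data": ...} envelope as used in the test above."""
    return json.dumps({"type": msg_type, "data": data})


def mcp_request(
    method: str, params: Optional[Dict[str, Any]] = None, request_id: int = 1
) -> str:
    """Wrap an MCP method call in a type="mcp" envelope."""
    return ws_message(
        "mcp",
        {"method": method, "params": params or {}, "id": request_id},
    )
```

Inside `test_ws()`, the two sends would then read `await ws.send(ws_message("reset", {"seed": 42}))` and `await ws.send(mcp_request("tools/list"))`.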
What MCPEnvironment Gives Us For Free
| Feature | How |
|---|---|
| MCP tool discovery | ListToolsAction -> returns all tools with schemas |
| MCP tool invocation | CallToolAction(tool_name, arguments) -> calls FastMCP tool |
| Reserved name validation | Rejects tools named reset, step, state, close |
| Timeout handling | Configurable timeout on tool calls |
| Error categorization | ToolError with types: execution_error, invalid_args, tool_not_found, timeout |
| WebSocket MCP path | /ws endpoint supports type: "mcp" messages |
| Async support | _run_async_safely() handles both sync and async contexts |
What We DON'T Need (CUT)
| Removed | Reason |
|---|---|
| mcp_tools.py | MCP tools defined inside environment.py via FastMCP |
| mcp-x/ directory | MCP-X gateway CUT -- MCPEnvironment handles tool exposure |
| config.toml | No MCP-X = no per-agent access control config |
| run_server.py | Single server is enough |
| Per-agent JWT tokens | Nice-to-have, not needed for demo/judging |
VERIFY
Test 1: HTTP Server starts
uvicorn sentinelops_arena.server:app --port 8000
# Should start without errors
# Should show "Uvicorn running on http://0.0.0.0:8000"
Test 2: All endpoints return valid JSON
# Reset -> Observation JSON
# Step -> Observation JSON
# State -> State JSON
# Schema -> Action/Observation/State schemas
Test 3: MCP tools discoverable via HTTP
# POST /step with ListToolsAction -> list of tools
# Verify: lookup_customer, issue_refund, get_schema, launch_attack etc. all present
# Verify: no reserved names (reset, step, state, close)
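The two "Verify" checks in Test 3 can be expressed as a small function over the tool names returned by the list-tools call. This is a sketch: `check_tool_names` is a hypothetical helper, the expected set contains only the tools named above (the "etc." means it is not exhaustive), and extracting the names from the `/step` response is left out because the response shape is not specified here.

```python
from typing import Iterable, List

# Names MCPEnvironment rejects, per the feature table in this phase.
RESERVED_NAMES = frozenset({"reset", "step", "state", "close"})

# Only the tools named in Test 3; extend this as more tools are registered.
EXPECTED_TOOLS = frozenset(
    {"lookup_customer", "issue_refund", "get_schema", "launch_attack"}
)


def check_tool_names(
    tool_names: Iterable[str], expected: Iterable[str] = EXPECTED_TOOLS
) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    names = set(tool_names)
    problems = [f"missing tool: {name}" for name in sorted(set(expected) - names)]
    problems += [
        f"reserved name exposed: {name}" for name in sorted(names & RESERVED_NAMES)
    ]
    return problems
```

Returning a list of problems instead of raising on the first one shows every missing or reserved name in a single run.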
Test 4: MCP tools callable via HTTP
# POST /step with CallToolAction -> tool result
# Call lookup_customer("C000") -> customer data
# Call get_schema("crm") -> field list
# Call get_current_policy("refund") -> policy values
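The request bodies for Tests 3 and 4 follow the shape used in the Step 2 curl commands. Below is a hedged sketch of builders for them. The function names are hypothetical, and only the `customer_id` argument name is taken from this document. The argument names for `get_schema` and `get_current_policy` are not shown here and should be read from the tool schemas returned by the list-tools call.

```python
from typing import Any, Dict


def list_tools_action() -> Dict[str, Any]:
    """Action that asks MCPEnvironment for the available tools."""
    return {"type": "list_tools"}


def call_tool_action(tool_name: str, **arguments: Any) -> Dict[str, Any]:
    """Action that invokes one MCP tool with keyword arguments."""
    return {"type": "call_tool", "tool_name": tool_name, "arguments": arguments}


def step_body(action: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap an action in the body expected by POST /step."""
    return {"action": action}
```

For example, `step_body(call_tool_action("lookup_customer", customer_id="C000"))` produces the same JSON body as the `call_tool` curl command in Step 2.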
DEBUG: Common Issues
| Issue | Cause | Fix |
|---|---|---|
| Port 8000 already in use | Previous server running | kill $(lsof -t -i:8000) |
| create_app() fails with type error | Wrong argument types | Pass class (not instance), Action class, Observation class |
| MCP tools not showing up | Tools defined after super().__init__() | Define tools BEFORE calling super().__init__(mcp) |
| ValueError: reserved names | Tool named reset or step | Rename the tool |
| WebSocket MCP not working | Wrong message format | Use {"type": "mcp", "data": {"method": "tools/list", ...}} |
| ListToolsAction not recognized | create_app doesn't know about MCP types | May need to pass both SentinelAction and MCP action types to create_app |
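For the first row, a quick pre-flight check can tell whether something is already listening before uvicorn is started. This is a minimal standard-library sketch. `port_in_use` is a hypothetical helper, and it only detects a listener reachable on the given host (127.0.0.1 by default).

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(8000)` returns True, stop the previous server (for example with the `kill` command in the table) before starting a new one.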
EXIT CRITERIA
- uvicorn sentinelops_arena.server:app starts without errors
- HTTP /reset, /step, /state, /schema return valid JSON
- ListToolsAction via /step returns all enterprise system tools
- CallToolAction via /step successfully calls tools
- WebSocket /ws endpoint accepts connections
ROLLBACK PLAN
Phase 3 is already minimal. If it takes longer than 30 minutes:
- Skip WebSocket verification -- HTTP-only is fine for demo
- Skip schema endpoint check -- not needed for judging
- If create_app() fails entirely -- serve the Gradio app directly without the OpenEnv HTTP layer. The environment still works via direct Python calls.
Do NOT cut: server.py with create_app(). This is required for HF Spaces deployment.