Aaron Brown committed on
Commit fb68239 · 1 Parent(s): a24d0f2

Add NPC actions, synthetic data pipeline, CLI enhancements

Agent-generated supporting changes from architecture review:
- NPC action system (actions.py) for structured NPC behaviors
- Synthetic data generation pipeline (training/synthetic.py)
- CLI improvements and NPC traffic script updates
- NPC agent loop enhancements for topology-driven operation

README.md CHANGED
@@ -50,6 +50,9 @@ uv sync
50
  # Optional: enable the LiteLLM-backed builder pipeline
51
  uv sync --extra builder
52
 
53
  # Optional: enable background refill inside the server
54
  export OPENRANGE_ENABLE_MANAGED_REFILL=1
55
  export OPENRANGE_RUNTIME_BUILDER=llm
@@ -57,6 +60,12 @@ export OPENRANGE_RUNTIME_BUILDER=llm
57
  # End-to-end demo (no Docker, no LLM)
58
  uv run python examples/demo.py
59
 
60
  # Run the OpenEnv client against a running server
61
  uv run python examples/remote_client_demo.py --base-url http://localhost:8000
62
 
@@ -97,6 +106,8 @@ The deployed package exposes the standard OpenEnv `reset()`, `step()`, and `stat
97
 
98
  **Agents** — Structural protocol: any object with `reset(briefing, role)` and `act(observation) -> command` works. Ships with `LLMRangeAgent` (litellm, any provider), `ScriptedAgent`, and `HumanAgent`.
99
 
100
  ```python
101
  from open_range.agents.episode import run_episode
102
  from open_range.agents.llm_agent import LLMRangeAgent
@@ -136,6 +147,7 @@ Compatible with `openenv` when installed; standalone FastAPI fallback otherwise.
136
  - [Architecture](docs/architecture.md) — full pipeline, network topology, episode lifecycle
137
  - [Builder & Validator](docs/builder-validator.md) — snapshot generation and admission
138
  - [Red & Blue Agents](docs/red-blue-agents.md) — tandem training, reward coupling, curriculum
139
  - [Agent Protocols](docs/agent-protocols.md) — agent interface, episode runner, evaluation
140
  - [OpenEnv Compliance](docs/openenv-compliance.md) — API contract, models, deployment
141
 
 
50
  # Optional: enable the LiteLLM-backed builder pipeline
51
  uv sync --extra builder
52
 
53
+ # Optional: enable LiteLLM-backed synthetic teacher agents
54
+ uv sync --extra synthetic
55
+
56
  # Optional: enable background refill inside the server
57
  export OPENRANGE_ENABLE_MANAGED_REFILL=1
58
  export OPENRANGE_RUNTIME_BUILDER=llm
 
60
  # End-to-end demo (no Docker, no LLM)
61
  uv run python examples/demo.py
62
 
63
+ # Generate synthetic SFT traces from a snapshot or manifest
64
+ uv run openrange synthetic-data \
65
+ --manifest manifests/tier1_basic.yaml \
66
+ --output data/sft_red.jsonl \
67
+ --roles red
68
+
69
  # Run the OpenEnv client against a running server
70
  uv run python examples/remote_client_demo.py --base-url http://localhost:8000
71
 
 
106
 
107
  **Agents** — Structural protocol: any object with `reset(briefing, role)` and `act(observation) -> command` works. Ships with `LLMRangeAgent` (litellm, any provider), `ScriptedAgent`, and `HumanAgent`.
108
 
109
+ **Synthetic Data** — `open_range.training.synthetic` provides snapshot-grounded trajectory generation for SFT warm-start. It uses a fast simulated `RangeEnvironment`, optional LiteLLM teacher agents, per-episode flag randomization, and exports JSONL through `TrajectoryLogger`.
110
+
111
  ```python
112
  from open_range.agents.episode import run_episode
113
  from open_range.agents.llm_agent import LLMRangeAgent
 
147
  - [Architecture](docs/architecture.md) — full pipeline, network topology, episode lifecycle
148
  - [Builder & Validator](docs/builder-validator.md) — snapshot generation and admission
149
  - [Red & Blue Agents](docs/red-blue-agents.md) — tandem training, reward coupling, curriculum
150
+ - [Synthetic Data](docs/synthetic-data.md) — snapshot-backed SFT trace generation with LiteLLM teachers
151
  - [Agent Protocols](docs/agent-protocols.md) — agent interface, episode runner, evaluation
152
  - [OpenEnv Compliance](docs/openenv-compliance.md) — API contract, models, deployment
153
 
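The README describes the agent interface as structural: any object with `reset(briefing, role)` and `act(observation) -> command` qualifies. A minimal sketch of what that could look like with `typing.Protocol` follows. The names `RangeAgentLike` and `EchoAgent` are hypothetical, and the `str` argument types are an assumption, since the diff does not show the real signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class RangeAgentLike(Protocol):
    """Hypothetical name for the structural agent interface (not from the repo)."""

    def reset(self, briefing: str, role: str) -> None: ...

    def act(self, observation: str) -> str: ...


class EchoAgent:
    """Toy agent that always answers with the same shell command."""

    def __init__(self, command: str = "whoami") -> None:
        self.command = command
        self.role = ""

    def reset(self, briefing: str, role: str) -> None:
        # Remember which side we play; a real agent would also store the briefing.
        self.role = role

    def act(self, observation: str) -> str:
        return self.command


agent = EchoAgent()
agent.reset(briefing="Tier 1 range", role="red")
# isinstance() on a runtime-checkable Protocol only checks that the methods exist.
print(isinstance(agent, RangeAgentLike), agent.act("login banner"))
```

Because the check is structural, `EchoAgent` never has to inherit from anything, which is presumably why scripted, human, and LLM-backed agents can share one episode runner.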
docs/red-blue-agents.md CHANGED
@@ -500,26 +500,34 @@ Respond with a single shell command to execute. No explanation needed.
500
 
501
  ### SFT Data Generation (Implemented)
502
 
503
- Run episodes with frontier models to generate expert trajectories. The `TrajectoryLogger` (in `src/open_range/training/trajectory.py`) records turns and exports JSONL in OpenAI chat format.
504
 
505
  ```python
506
- from open_range.training.trajectory import TrajectoryLogger
507
-
508
- red = LLMRangeAgent(model="anthropic/claude-sonnet-4-20250514")
509
- blue = LLMRangeAgent(model="openai/gpt-4o")
510
-
511
- logger = TrajectoryLogger()
512
- for i in range(100):
513
- logger.start_episode(f"ep-{i:03d}", snapshot_id="snap-001", tier=1)
514
- result = run_episode(env, red, blue)
515
- # log_turn() for each step, then:
516
- logger.end_episode(outcome=result.outcome, metrics={"steps": result.steps})
517
-
518
- # Export as SFT JSONL -- each role is a separate training example
519
- # Only episodes above reward_threshold are included
520
- lines = logger.export_jsonl("sft_data.jsonl", reward_threshold=0.5)
521
  ```
522
 
523
  ### Asymmetric GRPO (Planned)
524
 
525
  Train one side via GRPO while the other plays as a fixed opponent:
 
500
 
501
  ### SFT Data Generation (Implemented)
502
 
503
+ For synthetic warm-start data, prefer `SyntheticTraceGenerator` in `src/open_range/training/synthetic.py`. It keeps the data path aligned with OpenRange snapshots and rewards, but replaces Docker execution with a fast simulator so you can cheaply collect teacher trajectories.
504
 
505
  ```python
506
+ from open_range.training import SyntheticTraceGenerator, build_teacher_agents
507
+
508
+ red, blue = build_teacher_agents(
509
+ teacher_model="azure/gpt-5.2-codex",
510
+ roles=("red",),
511
+ )
512
+
513
+ generator = SyntheticTraceGenerator.from_manifest(
514
+ manifest=tier1_manifest,
515
+ red_agent=red,
516
+ blue_agent=blue,
517
+ template_only=True,
518
+ max_steps=8,
519
+ )
520
+
521
+ logger, lines = generator.export_jsonl(
522
+ "sft_data.jsonl",
523
+ num_traces=100,
524
+ reward_threshold=0.0,
525
+ roles=("red",),
526
+ )
527
  ```
528
 
529
+ For live Docker episodes or custom rollout loops, `TrajectoryLogger` remains the low-level recorder and JSONL exporter.
530
+
531
  ### Asymmetric GRPO (Planned)
532
 
533
  Train one side via GRPO while the other plays as a fixed opponent:
docs/synthetic-data.md ADDED
@@ -0,0 +1,116 @@
1
+ # Synthetic Data
2
+
3
+ OpenRange includes a snapshot-backed synthetic trajectory generator for SFT warm-start and offline data collection. The design is influenced by Open Trajectory Gym's split between world specification, executor, and teacher model, but it is implemented in the OpenRange training layer so it stays aligned with the existing `SnapshotSpec`, `RangeEnvironment`, and `TrajectoryLogger` types.
4
+
5
+ ## Why It Lives In `training/`
6
+
7
+ Synthetic trace generation is a training concern, not a runtime concern:
8
+
9
+ - The live server still owns real `reset()` / `step()` episodes on Docker infrastructure.
10
+ - Synthetic generation reuses the same `SnapshotSpec` and reward/meta-command semantics, but swaps Docker execution for a fast simulator.
11
+ - Export still goes through `TrajectoryLogger`, so downstream SFT JSONL format does not fork.
12
+
13
+ This keeps OpenRange's real environment and synthetic data path close enough to share prompts, actions, and episode structure without turning the production server into a data-generation service.
14
+
15
+ ## Components
16
+
17
+ - `SyntheticRangeEnvironment`: a fast `RangeEnvironment` subclass that simulates common Red and Blue commands from a loaded snapshot.
18
+ - `SyntheticTraceGenerator`: drives Red and Blue agents through synthetic episodes and records them with `TrajectoryLogger`.
19
+ - `build_teacher_agents()`: constructs LiteLLM-backed teacher agents for selected roles and scripted fallbacks for the rest.
20
+ - `randomize_snapshot_flags()`: clones a snapshot and rewrites flag values per episode so traces do not memorize static flag strings.
21
+
22
+ ## LiteLLM Support
23
+
24
+ Install the optional dependency:
25
+
26
+ ```bash
27
+ uv sync --extra synthetic
28
+ ```
29
+
30
+ Any LiteLLM model string supported by `LLMRangeAgent` works. For Azure OpenAI, export the usual LiteLLM/Azure variables and pass the deployment name as the model:
31
+
32
+ ```bash
33
+ export AZURE_API_KEY=...
34
+ export AZURE_API_BASE=...
35
+ export AZURE_API_VERSION=...
36
+
37
+ uv run openrange synthetic-data \
38
+ --manifest manifests/tier1_basic.yaml \
39
+ --output data/sft_red.jsonl \
40
+ --roles red \
41
+ --teacher-model azure/gpt-5.2-codex
42
+ ```
43
+
44
+ Codex-style Azure deployments often reject `temperature`; `LLMRangeAgent` now omits it automatically for model names containing `codex`.
45
+
46
+ ## CLI
47
+
48
+ Generate traces from an existing snapshot:
49
+
50
+ ```bash
51
+ uv run openrange synthetic-data \
52
+ --snapshot snapshots/spec.json \
53
+ --output data/sft_red.jsonl \
54
+ --num-traces 25 \
55
+ --roles red
56
+ ```
57
+
58
+ Generate traces from a manifest using the deterministic builder:
59
+
60
+ ```bash
61
+ uv run openrange synthetic-data \
62
+ --manifest manifests/tier1_basic.yaml \
63
+ --output data/sft_red_blue.jsonl \
64
+ --roles red,blue \
65
+ --num-traces 50
66
+ ```
67
+
68
+ Generate traces from a manifest using both an LLM builder and LLM teachers:
69
+
70
+ ```bash
71
+ uv run openrange synthetic-data \
72
+ --manifest manifests/tier1_basic.yaml \
73
+ --llm-builder \
74
+ --builder-model azure/gpt-5.2-codex \
75
+ --teacher-model azure/gpt-5.2-codex \
76
+ --roles red \
77
+ --output data/frontier_red.jsonl
78
+ ```
79
+
80
+ ## Python API
81
+
82
+ ```python
83
+ from open_range.training import SyntheticTraceGenerator, build_teacher_agents
84
+
85
+ red, blue = build_teacher_agents(
86
+ teacher_model="azure/gpt-5.2-codex",
87
+ roles=("red",),
88
+ max_tokens=256,
89
+ )
90
+
91
+ generator = SyntheticTraceGenerator.from_manifest(
92
+ manifest=tier1_manifest,
93
+ red_agent=red,
94
+ blue_agent=blue,
95
+ template_only=True,
96
+ max_steps=8,
97
+ )
98
+
99
+ logger, lines = generator.export_jsonl(
100
+ "data/sft_red.jsonl",
101
+ num_traces=10,
102
+ roles=("red",),
103
+ )
104
+ ```
105
+
106
+ ## Testing
107
+
108
+ Unit coverage lives in `tests/test_synthetic.py`.
109
+
110
+ There is also a gated live-model smoke test that exercises the synthetic generator against a real LiteLLM model:
111
+
112
+ ```bash
113
+ uv run --extra synthetic pytest tests/test_synthetic.py -m live_model -q
114
+ ```
115
+
116
+ The live test is skipped automatically unless the required Azure environment variables are present.
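The Components list above says `randomize_snapshot_flags()` clones a snapshot and rewrites flag values per episode, but its body is not part of this excerpt. The following is a minimal sketch of the idea under stated assumptions: the snapshot is modelled as a plain dict, the `flags` key and the `FLAG{...}` format are hypothetical, and only the `files` mapping of strings mirrors what the diff shows elsewhere. The real `SnapshotSpec` handling may differ.

```python
import copy
import secrets


def randomize_flags_sketch(snapshot: dict) -> dict:
    """Return a deep copy of `snapshot` with every flag value replaced.

    Assumes a hypothetical layout: snapshot["flags"] is a list of
    {"id": ..., "value": ...} dicts and snapshot["files"] maps keys to text.
    """
    clone = copy.deepcopy(snapshot)  # never mutate the caller's snapshot
    for flag in clone.get("flags", []):
        old_value = flag["value"]
        new_value = f"FLAG{{{secrets.token_hex(8)}}}"
        flag["value"] = new_value
        # Rewrite any file content that embeds the old flag string.
        clone["files"] = {
            key: text.replace(old_value, new_value)
            for key, text in clone.get("files", {}).items()
        }
    return clone


base = {
    "flags": [{"id": "flag1", "value": "FLAG{static}"}],
    "files": {"web:/var/www/html/flag.txt": "FLAG{static}\n"},
}
episode_snapshot = randomize_flags_sketch(base)
```

Calling the helper once per episode leaves `base` untouched, so each trace sees a different flag string while the rest of the environment stays identical.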
pyproject.toml CHANGED
@@ -23,6 +23,7 @@ dependencies = [
23
  dev = ["pytest>=8.0", "pytest-asyncio>=0.23", "httpx>=0.27"]
24
  training = ["trl>=0.8", "unsloth"]
25
  builder = ["litellm>=1.30"]
26
 
27
  [project.scripts]
28
  openrange = "open_range.cli:cli"
@@ -45,3 +46,6 @@ package-data = { "open_range" = ["**/*.yaml", "**/*.yml"] }
45
 
46
  [tool.pytest.ini_options]
47
  asyncio_mode = "auto"
 
23
  dev = ["pytest>=8.0", "pytest-asyncio>=0.23", "httpx>=0.27"]
24
  training = ["trl>=0.8", "unsloth"]
25
  builder = ["litellm>=1.30"]
26
+ synthetic = ["litellm>=1.30"]
27
 
28
  [project.scripts]
29
  openrange = "open_range.cli:cli"
 
46
 
47
  [tool.pytest.ini_options]
48
  asyncio_mode = "auto"
49
+ markers = [
50
+ "live_model: runs live LiteLLM model smoke tests",
51
+ ]
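The `live_model` marker registered above pairs with the documented behaviour that the live test is skipped unless the Azure environment variables are present. A minimal sketch of such a gate is shown below. The helper name `missing_live_model_vars` is hypothetical, and treating the three variables from the docs as the complete required set is an assumption; the actual check in `tests/test_synthetic.py` is not shown in this diff.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Assumption: these three variables (taken from the docs example) are the full set.
REQUIRED_AZURE_VARS = ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION")


def missing_live_model_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED_AZURE_VARS if not env.get(name)]
```

In a test module this could feed `pytest.mark.skipif(bool(missing_live_model_vars()), reason="Azure variables not set")` alongside `pytest.mark.live_model`, so the marker selects the test and the condition decides whether it runs.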
src/open_range/agents/llm_agent.py CHANGED
@@ -32,7 +32,7 @@ class LLMRangeAgent:
32
  def __init__(
33
  self,
34
  model: str = "anthropic/claude-sonnet-4-20250514",
35
- temperature: float = 0.3,
36
  max_tokens: int = 512,
37
  **litellm_kwargs: Any,
38
  ) -> None:
@@ -67,13 +67,18 @@ class LLMRangeAgent:
67
  if self.messages and self.messages[-1]["role"] != "user":
68
  self.messages.append({"role": "user", "content": observation_text})
69
 
70
- response = litellm.completion(
71
- model=self.model,
72
- messages=self.messages,
73
- temperature=self.temperature,
74
- max_tokens=self.max_tokens,
75
  **self.litellm_kwargs,
76
- )
77
  text = response.choices[0].message.content.strip()
78
  self.messages.append({"role": "assistant", "content": text})
79
 
 
32
  def __init__(
33
  self,
34
  model: str = "anthropic/claude-sonnet-4-20250514",
35
+ temperature: float | None = 0.3,
36
  max_tokens: int = 512,
37
  **litellm_kwargs: Any,
38
  ) -> None:
 
67
  if self.messages and self.messages[-1]["role"] != "user":
68
  self.messages.append({"role": "user", "content": observation_text})
69
 
70
+ kwargs: dict[str, Any] = {
71
+ "model": self.model,
72
+ "messages": self.messages,
73
+ "max_tokens": self.max_tokens,
74
+ "drop_params": True,
75
  **self.litellm_kwargs,
76
+ }
77
+ # Codex deployments commonly reject temperature; omit it when unsupported.
78
+ if self.temperature is not None and "codex" not in self.model.lower():
79
+ kwargs["temperature"] = self.temperature
80
+
81
+ response = litellm.completion(**kwargs)
82
  text = response.choices[0].message.content.strip()
83
  self.messages.append({"role": "assistant", "content": text})
84
 
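The change above builds the completion arguments as a dict and adds `temperature` only when it is set and the model name does not contain `codex`. The same logic can be isolated in a small pure function, which makes it easy to test without calling LiteLLM. This is a sketch; `build_completion_kwargs` is a hypothetical helper name and is not part of the commit.

```python
from __future__ import annotations

from typing import Any


def build_completion_kwargs(
    model: str,
    messages: list[dict[str, str]],
    max_tokens: int = 512,
    temperature: float | None = 0.3,
    **extra: Any,
) -> dict[str, Any]:
    """Assemble keyword arguments for a chat completion call."""
    kwargs: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "drop_params": True,  # same flag the diff passes through to LiteLLM
        **extra,  # caller-supplied overrides win, mirroring the diff's ordering
    }
    # Mirror the diff: skip temperature when disabled or for codex-style models.
    if temperature is not None and "codex" not in model.lower():
        kwargs["temperature"] = temperature
    return kwargs


codex_kwargs = build_completion_kwargs("azure/gpt-5.2-codex", [])
default_kwargs = build_completion_kwargs("openai/gpt-4o", [])
```

Matching on the substring `codex` is a heuristic: it will also drop `temperature` for any other model whose name happens to contain that word, and it will not catch deployments that reject the parameter under a different name.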
src/open_range/builder/npc/actions.py ADDED
@@ -0,0 +1,306 @@
1
+ """NPC action executor -- bridges NPC decisions to container state changes.
2
+
3
+ All actions are derived from the SnapshotSpec at init time, so they adapt
4
+ to whatever environment the Builder LLM generated. No hardcoded pages,
5
+ tables, or endpoints.
6
+ """
7
+
8
+ from __future__ import annotations
9
+
10
+ import logging
11
+ import re
12
+ import time
13
+ from typing import Any
14
+
15
+ from open_range.protocols import ContainerSet, NPCAction, NPCPersona, SnapshotSpec
16
+
17
+ logger = logging.getLogger(__name__)
18
+
19
+
20
+ class NPCActionExecutor:
21
+ """Execute NPC actions inside Docker containers.
22
+
23
+ At init, extracts available pages, shares, DB tables, and users from
24
+ the snapshot so every action targets real resources in this environment.
25
+ """
26
+
27
+ def __init__(self, containers: ContainerSet, snapshot: SnapshotSpec) -> None:
28
+ self.containers = containers
29
+ # Derive available targets from the snapshot
30
+ self._pages = _extract_web_pages(snapshot)
31
+ self._shares = _extract_shares(snapshot)
32
+ self._db_tables = _extract_db_tables(snapshot)
33
+ self._users = _extract_users(snapshot)
34
+ self._domain = snapshot.topology.get("domain", "corp.local")
35
+
36
+ # ------------------------------------------------------------------
37
+ # Routine actions (autonomous workday)
38
+ # ------------------------------------------------------------------
39
+
40
+ async def execute_routine(
41
+ self,
42
+ persona: NPCPersona,
43
+ action: str,
44
+ target: str,
45
+ detail: str,
46
+ email_body: str = "",
47
+ ) -> dict[str, Any]:
48
+ """Execute an autonomous work action derived from the snapshot."""
49
+ username = _username_from_persona(persona)
50
+
51
+ handler = {
52
+ "browse": self._routine_browse,
53
+ "send_email": self._routine_email,
54
+ "lookup": self._routine_lookup,
55
+ "access_share": self._routine_share,
56
+ "login": self._routine_login,
57
+ "query_db": self._routine_query_db,
58
+ "idle": self._routine_idle,
59
+ }.get(action, self._routine_idle)
60
+
61
+ return await handler(persona, username, target, detail, email_body)
62
+
63
+ async def _routine_browse(self, persona, username, target, detail, _eb):
64
+ """Browse a page that exists in this snapshot."""
65
+ path = target if target.startswith("/") else f"/{target}" if target else "/"
66
+ # Fall back to a known page if target isn't in snapshot
67
+ if path == "/" and self._pages:
68
+ import random
69
+ path = random.choice(self._pages)
70
+ await self.containers.exec(
71
+ "web",
72
+ f'curl -s -o /dev/null -A "Mozilla/5.0 ({username})" "http://localhost{path}"',
73
+ )
74
+ return _log(persona, "browse", detail or f"Browsed {path}", f"web:{path}")
75
+
76
+ async def _routine_email(self, persona, username, target, detail, body):
77
+ """Send email to a colleague (picks a real user from topology)."""
78
+ import random
79
+ recipient = target
80
+ if not recipient and self._users:
81
+ recipient = random.choice(self._users)
82
+ elif not recipient:
83
+ recipient = "colleague"
84
+
85
+ ts_i = int(time.time())
86
+ content = body or f"Hi {recipient}, quick update: {detail or 'checking in'}."
87
+ msg = (
88
+ f"From: {username}@{self._domain}\\n"
89
+ f"To: {recipient}@{self._domain}\\n"
90
+ f"Subject: {detail or 'Update'}\\n\\n{content}"
91
+ )
92
+ await self.containers.exec(
93
+ "mail",
94
+ f"mkdir -p /var/mail/{username} "
95
+ f"&& echo '{msg}' > /var/mail/{username}/sent_{ts_i}.eml",
96
+ )
97
+ return _log(persona, "send_email", detail or f"Emailed {recipient}", f"mail:{username}")
98
+
99
+ async def _routine_lookup(self, persona, username, target, detail, _eb):
100
+ """Look up data on the web app -- uses whatever search/lookup page exists."""
101
+ # Find a page with query params in the snapshot
102
+ lookup_pages = [p for p in self._pages if "?" in p or "lookup" in p or "search" in p]
103
+ if lookup_pages:
104
+ import random
105
+ page = random.choice(lookup_pages)
106
+ elif self._pages:
107
+ import random
108
+ page = random.choice(self._pages) + "?q=" + (target or "status")
109
+ else:
110
+ page = f"/?q={target or 'data'}"
111
+
112
+ await self.containers.exec(
113
+ "web",
114
+ f'curl -s -o /dev/null -A "Mozilla/5.0 ({username})" "http://localhost{page}"',
115
+ )
116
+ return _log(persona, "lookup", detail or f"Searched: {target}", f"web:{page}")
117
+
118
+ async def _routine_share(self, persona, username, target, detail, _eb):
119
+ """Access a file share that exists in this snapshot."""
120
+ import random
121
+ share = target or (random.choice(self._shares) if self._shares else "general")
122
+ await self.containers.exec(
123
+ "files",
124
+ f"ls /srv/shares/{share}/ 2>/dev/null || true",
125
+ )
126
+ return _log(persona, "access_share", detail or f"Browsed {share} share", f"files:{share}")
127
+
128
+ async def _routine_login(self, persona, username, target, detail, _eb):
129
+ """Log into the web portal."""
130
+ # Find the login page from snapshot
131
+ login_pages = [p for p in self._pages if "login" in p or "index" in p]
132
+ page = login_pages[0] if login_pages else "/"
133
+ await self.containers.exec(
134
+ "web",
135
+ f'curl -s -o /dev/null -A "Mozilla/5.0 ({username})" '
136
+ f'-d "username={username}&password=placeholder" '
137
+ f'"http://localhost{page}"',
138
+ )
139
+ return _log(persona, "login", detail or "Portal login", "web:access_log")
140
+
141
+ async def _routine_query_db(self, persona, username, target, detail, _eb):
142
+ """Query the database -- uses tables that exist in this snapshot."""
143
+ import random
144
+ if self._db_tables:
145
+ table = random.choice(self._db_tables)
146
+ query = f"SELECT * FROM {table} LIMIT 5"
147
+ else:
148
+ query = "SHOW TABLES"
149
+ await self.containers.exec(
150
+ "db",
151
+ f'mysql -u app_user -p\'AppUs3r!2024\' -e "{query}" 2>/dev/null || true',
152
+ )
153
+ return _log(persona, "query_db", detail or f"Queried {target or 'database'}", "db:query_log")
154
+
155
+ async def _routine_idle(self, persona, username, target, detail, _eb):
156
+ return _log(persona, "idle", detail or "Away from desk", "none")
157
+
158
+ # ------------------------------------------------------------------
159
+ # Reactive actions (response to stimuli from Red)
160
+ # ------------------------------------------------------------------
161
+
162
+ async def execute(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
163
+ """Execute a reactive NPC action (response to stimulus)."""
164
+ handler = {
165
+ "click_link": self._react_click,
166
+ "open_attachment": self._react_click,
167
+ "reply": self._react_email,
168
+ "forward": self._react_email,
169
+ "share_credentials": self._react_share_creds,
170
+ "report_to_IT": self._react_report,
171
+ "ignore": self._react_ignore,
172
+ }.get(action.action, self._react_ignore)
173
+ return await handler(persona, action)
174
+
175
+ async def _react_click(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
176
+ url = "http://localhost/"
177
+ for effect in action.side_effects:
178
+ urls = re.findall(r"https?://\S+", effect)
179
+ if urls:
180
+ url = urls[0].rstrip(".")
181
+ break
182
+ username = _username_from_persona(persona)
183
+ await self.containers.exec(
184
+ "web",
185
+ f'curl -s -o /dev/null -A "Mozilla/5.0 ({username})" "{url}"',
186
+ )
187
+ return _log(persona, "click_link", f"Clicked: {url}", "web:access_log")
188
+
189
+ async def _react_email(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
190
+ username = _username_from_persona(persona)
191
+ ts_i = int(time.time())
192
+ body = (action.response_content or "acknowledged")[:500]
193
+ await self.containers.exec(
194
+ "mail",
195
+ f"mkdir -p /var/mail/{username} "
196
+ f"&& echo 'From: {username}@{self._domain}\\nSubject: Re\\n\\n{body}' "
197
+ f"> /var/mail/{username}/sent_{ts_i}.eml",
198
+ )
199
+ return _log(persona, action.action, "Replied to message", "mail:spool")
200
+
201
+ async def _react_share_creds(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
202
+ username = _username_from_persona(persona)
203
+ content = action.response_content or f"username: {username}"
204
+ ts_i = int(time.time())
205
+ # Leaked creds file
206
+ await self.containers.exec("web", f"echo '{content}' >> /tmp/leaked_{ts_i}.txt")
207
+ # Suspicious login
208
+ await self.containers.exec(
209
+ "web",
210
+ f'curl -s -o /dev/null -A "Mozilla/5.0 (external)" '
211
+ f'-d "username={username}&password=leaked" "http://localhost/"',
212
+ )
213
+ # SIEM alert
214
+ await self.containers.exec(
215
+ "siem",
216
+ f'echo "[$(date)] CRED-LEAK: {persona.name} shared credentials" '
217
+ f">> /var/log/siem/consolidated/all.log",
218
+ )
219
+ return _log(persona, "share_credentials", f"{persona.name} leaked credentials", "web+siem")
220
+
221
+ async def _react_report(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
222
+ detail = "; ".join(action.side_effects) if action.side_effects else "suspicious activity"
223
+ await self.containers.exec(
224
+ "siem",
225
+ f'echo "[$(date)] NPC-REPORT: {persona.name}: {detail}" '
226
+ f">> /var/log/siem/consolidated/all.log",
227
+ )
228
+ return _log(persona, "report_to_IT", detail, "siem:alert")
229
+
230
+ async def _react_ignore(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
231
+ return _log(persona, "ignore", "Ignored stimulus", "none")
232
+
233
+
234
+ # ---------------------------------------------------------------------------
235
+ # Snapshot introspection -- derive available targets from the generated env
236
+ # ---------------------------------------------------------------------------
237
+
238
+
239
+ def _extract_web_pages(snapshot: SnapshotSpec) -> list[str]:
240
+ """Extract URL paths from snapshot files dict (web:*.php -> /path)."""
241
+ pages: list[str] = []
242
+ for key in snapshot.files:
243
+ if not key.startswith("web:"):
244
+ continue
245
+ path = key.split(":", 1)[1]
246
+ # Convert filesystem path to URL path
247
+ if "/var/www/" in path and path.endswith(".php"):
248
+ url_path = path.replace("/var/www/portal", "").replace("/var/www/html", "")
249
+ if url_path:
250
+ pages.append(url_path)
251
+ return pages or ["/"]
252
+
253
+
254
+ def _extract_shares(snapshot: SnapshotSpec) -> list[str]:
255
+ """Extract Samba share names from snapshot files dict."""
256
+ shares: set[str] = set()
257
+ for key in snapshot.files:
258
+ if not key.startswith("files:"):
259
+ continue
260
+ path = key.split(":", 1)[1]
261
+ # /srv/shares/<share_name>/file.txt -> share_name
262
+ if "/srv/shares/" in path:
263
+ parts = path.split("/srv/shares/")[1].split("/")
264
+ if parts:
265
+ shares.add(parts[0])
266
+ return list(shares) or ["general"]
267
+
268
+
269
+ def _extract_db_tables(snapshot: SnapshotSpec) -> list[str]:
270
+ """Extract table names from SQL in the snapshot files dict."""
271
+ tables: set[str] = set()
272
+ for key, content in snapshot.files.items():
273
+ if key != "db:sql":
274
+ continue
275
+ # Find table names following INSERT INTO, FROM, or UPDATE keywords
276
+ for match in re.finditer(r"(?:INSERT INTO|FROM|UPDATE)\s+(\w+\.?\w*)", content, re.IGNORECASE):
277
+ table = match.group(1)
278
+ # Skip system tables
279
+ if table.lower() not in ("information_schema", "mysql", "performance_schema"):
280
+ tables.add(table)
281
+ return list(tables) or []
282
+
283
+
284
+ def _extract_users(snapshot: SnapshotSpec) -> list[str]:
285
+ """Extract usernames from topology."""
286
+ users = snapshot.topology.get("users", [])
287
+ return [u["username"] for u in users if isinstance(u, dict) and "username" in u]
288
+
289
+
290
+ def _username_from_persona(persona: NPCPersona) -> str:
291
+ email = persona.accounts.get("email", "")
292
+ if "@" in email:
293
+ return email.split("@")[0]
294
+ return persona.name.lower().split()[0]
295
+
296
+
297
+ def _log(persona: NPCPersona, action: str, detail: str, source: str) -> dict[str, Any]:
298
+ return {
299
+ "timestamp": time.time(),
300
+ "type": f"npc_{action}",
301
+ "persona": persona.name,
302
+ "department": persona.department,
303
+ "action": action,
304
+ "detail": detail,
305
+ "source": source,
306
+ }
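Several handlers above, such as `_routine_email` and `_react_email`, interpolate usernames and LLM-produced text directly into single-quoted shell strings, so a stray `'` in a message body would break the command. One possible hardening, which is not what the commit does, is to quote every interpolated value with the standard library's `shlex.quote`. The helper name `build_mail_write_command` below is hypothetical.

```python
import shlex


def build_mail_write_command(username: str, timestamp: int, message: str) -> str:
    """Build a shell command that writes `message` into the user's mail folder."""
    mail_dir = f"/var/mail/{username}"
    target = f"{mail_dir}/sent_{timestamp}.eml"
    # printf '%s' emits the message verbatim; shlex.quote escapes each argument
    # so quotes, semicolons, and spaces in the text cannot alter the command.
    return (
        f"mkdir -p {shlex.quote(mail_dir)} "
        f"&& printf '%s' {shlex.quote(message)} > {shlex.quote(target)}"
    )


command = build_mail_write_command("mgarcia", 1700000000, "It's done; rm -rf /")
```

The resulting string could be passed to `containers.exec("mail", command)` in place of the hand-built `echo` line. Unlike the original `echo`, `printf '%s'` does not interpret `\n` escapes, so the message would need real newline characters.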
src/open_range/builder/npc/db_traffic.sh CHANGED
@@ -15,16 +15,18 @@ RATE_LAMBDA="${RATE_LAMBDA:-20}"
15
 
16
  INTERVAL=$(awk "BEGIN {printf \"%.1f\", 60.0 / $RATE_LAMBDA}")
17
 
18
  # Application-level queries that a normal app would run
19
  QUERIES=(
20
- "SELECT id, username FROM app.users LIMIT 5"
21
- "SELECT name, price FROM app.products ORDER BY RAND() LIMIT 3"
22
- "SELECT COUNT(*) FROM app.sessions WHERE active=1"
23
- "INSERT INTO app.access_log (user_id, page, ts) VALUES (1, '/dashboard', NOW())"
24
- "SELECT * FROM app.products WHERE category='electronics'"
25
- "UPDATE app.sessions SET last_seen=NOW() WHERE user_id=1"
26
- "SELECT username, last_login FROM app.users WHERE last_login > DATE_SUB(NOW(), INTERVAL 1 HOUR)"
27
- "SELECT page, COUNT(*) AS hits FROM app.access_log GROUP BY page ORDER BY hits DESC LIMIT 5"
28
  )
29
 
30
  # App database credentials (non-privileged)
 
15
 
16
  INTERVAL=$(awk "BEGIN {printf \"%.1f\", 60.0 / $RATE_LAMBDA}")
17
 
18
+ DB_NAME="referral_db"
19
+
20
  # Application-level queries that a normal app would run
21
  QUERIES=(
22
+ "SELECT id, first_name, last_name FROM ${DB_NAME}.patients LIMIT 5"
23
+ "SELECT id, status, specialist FROM ${DB_NAME}.patient_referrals ORDER BY created_at DESC LIMIT 3"
24
+ "SELECT COUNT(*) FROM ${DB_NAME}.patient_referrals WHERE status='Pending'"
25
+ "SELECT id, amount_due, status FROM ${DB_NAME}.billing WHERE status='Open'"
26
+ "SELECT username, role, department FROM ${DB_NAME}.users LIMIT 10"
27
+ "UPDATE ${DB_NAME}.billing SET last_updated=CURDATE() WHERE id=5001"
28
+ "SELECT p.first_name, p.last_name, r.status FROM ${DB_NAME}.patients p JOIN ${DB_NAME}.patient_referrals r ON p.id=r.patient_id LIMIT 5"
29
+ "INSERT INTO ${DB_NAME}.access_log (user_id, action, ip) VALUES (3, 'view_referrals', '10.0.1.10')"
30
  )
31
 
32
  # App database credentials (non-privileged)
src/open_range/builder/npc/http_traffic.sh CHANGED
@@ -20,23 +20,22 @@ INTERVAL=$(awk "BEGIN {printf \"%.1f\", 60.0 / $RATE_LAMBDA}")
20
  PAGES=(
21
  "/"
22
  "/index.php"
23
- "/about"
24
- "/products"
25
- "/contact"
26
- "/login"
27
- "/api/users/1/profile"
28
- "/search?q=quarterly+report"
29
- "/search?q=product+catalog"
30
- "/download?file=report.pdf"
31
- "/dashboard"
32
- "/help"
33
  )
34
 
35
  # Common form submissions
36
  FORMS=(
37
- "-d 'username=jsmith&password=Welcome1' http://${WEB_HOST}/login"
38
- "-d 'q=marketing+materials' http://${WEB_HOST}/search"
39
- "-d 'name=Janet&email=jsmith@acmecorp.local&message=Hello' http://${WEB_HOST}/contact"
40
  )
41
 
42
  echo "[NPC-HTTP] Starting HTTP traffic to ${WEB_HOST} at ${RATE_LAMBDA} req/min"
 
20
  PAGES=(
21
  "/"
22
  "/index.php"
23
+ "/dashboard.php"
24
+ "/lookup.php?last=Smith"
25
+ "/lookup.php?last=Hart"
26
+ "/lookup.php?last=Bishop"
27
+ "/referral_submit.php"
28
+ "/api/referral_status.php?id=1001"
29
+ "/api/referral_status.php?id=1002"
30
+ "/admin/compliance_report.php"
31
+ "/logout.php"
32
  )
33
 
34
  # Common form submissions
35
  FORMS=(
36
+ "-d 'username=mgarcia&password=Welcome2024!' http://${WEB_HOST}/index.php"
37
+ "-d 'username=kwilliams&password=Welcome2024!' http://${WEB_HOST}/index.php"
38
+ "-d 'patient_id=1&clinic=Northside&specialist=Dr.Patel&diagnosis=Cardiology' http://${WEB_HOST}/referral_submit.php"
39
  )
40
 
41
  echo "[NPC-HTTP] Starting HTTP traffic to ${WEB_HOST} at ${RATE_LAMBDA} req/min"
src/open_range/builder/npc/npc_agent.py CHANGED
@@ -1,8 +1,10 @@
1
  """LLM-driven NPC agent (Level 1).
2
 
3
- Each NPC has a persona card and polls for incoming stimuli (emails, chat
4
- messages) on a configurable interval. The agent decides how to respond
5
- using an LLM call via LiteLLM.
6
  """
7
 
8
  from __future__ import annotations
@@ -11,40 +13,61 @@ import asyncio
11
  import json
12
  import logging
13
  import os
14
  from typing import Any
15
 
16
  import litellm
17
 
18
- from open_range.protocols import ContainerSet, NPCAction, NPCPersona, Stimulus
19
 
20
  logger = logging.getLogger(__name__)
21
 
22
- NPC_SYSTEM_PROMPT = """\
23
- You are simulating an employee in a corporate environment. You will receive \
24
- your persona card and an incoming stimulus (email, chat message, etc.).
25
 
26
- Based on your persona's security_awareness and susceptibility profile, decide \
27
- how to respond. You must stay in character.
28
 
29
- Return ONLY valid JSON:
30
  {
31
- "action": "<click_link|open_attachment|reply|share_credentials|ignore|report_to_IT|forward>",
32
- "response_content": "<your reply text if action is reply/forward, empty otherwise>",
33
- "side_effects": ["<description of side effect>"]
34
  }
35
 
36
  Guidelines:
37
- - High security_awareness (>0.7): suspicious of unusual requests, verify sender, \
38
- report phishing attempts.
39
- - Low security_awareness (<0.3): trusting, clicks links readily, may share \
40
- credentials if asked politely.
41
- - Always consider the stimulus plausibility and your susceptibility profile.
42
- - Never reveal that you are an AI or break character.
43
  """
44
 
45
 
46
  class LLMNPCAgent:
47
- """Async LLM NPC agent that responds to stimuli based on persona."""
48
 
49
  def __init__(
50
  self,
@@ -52,137 +75,248 @@ class LLMNPCAgent:
          temperature: float = 0.3,
      ) -> None:
          self.model = model or os.environ.get(
-             "OPENRANGE_NPC_MODEL", "anthropic/claude-haiku-4-5-20251001"
          )
-         self.temperature = temperature

-     async def decide(
-         self,
-         persona: NPCPersona,
-         stimulus: Stimulus,
-     ) -> NPCAction:
-         """Decide how an NPC responds to a stimulus via LLM.

-         This satisfies the NPCBehavior protocol.
-         """
          try:
-             response = await litellm.acompletion(
-                 model=self.model,
-                 messages=[
-                     {"role": "system", "content": NPC_SYSTEM_PROMPT},
-                     {
-                         "role": "user",
-                         "content": json.dumps(
-                             {
-                                 "persona": persona.model_dump(),
-                                 "stimulus": stimulus.model_dump(),
-                             }
-                         ),
-                     },
-                 ],
-                 response_format={"type": "json_object"},
-                 temperature=self.temperature,
              )

              raw = json.loads(response.choices[0].message.content)
              return NPCAction(
                  action=raw.get("action", "ignore"),
                  response_content=raw.get("response_content", ""),
                  side_effects=raw.get("side_effects", []),
              )
-
          except Exception as exc:
-             logger.warning(
-                 "NPC %s LLM decision failed, defaulting to ignore: %s",
-                 persona.name,
-                 exc,
-             )
              return NPCAction(action="ignore")

      async def run_loop(
          self,
          persona: NPCPersona,
          containers: ContainerSet,
      ) -> None:
-         """Run the NPC agent loop, polling for stimuli on the persona's schedule.

-         This loop runs as an asyncio task, checking for incoming emails
-         and processing them according to the persona's schedule.
          """
-         interval = persona.routine.get("email_check_interval_min", 15)
-         interval_s = interval * 60

          logger.info(
-             "NPC %s starting loop (check every %d min)",
-             persona.name,
-             interval,
          )

          while True:
              try:
-                 await asyncio.sleep(interval_s)

-                 # In a full implementation, this would:
-                 # 1. docker exec into mail container to check persona's mailbox
-                 # 2. Parse new emails into Stimulus objects
-                 # 3. Call self.decide() for each stimulus
-                 # 4. Execute side effects (click links, reply, etc.)
-                 #
-                 # For now, the loop just keeps the task alive.
-                 # The actual stimulus injection happens when Red sends
-                 # phishing emails via the environment's step() method.

-                 logger.debug("NPC %s checked mailbox (no new stimuli)", persona.name)

              except asyncio.CancelledError:
-                 logger.info("NPC %s loop cancelled", persona.name)
                  break
              except Exception as exc:
                  logger.warning("NPC %s loop error: %s", persona.name, exc)
-                 await asyncio.sleep(30) # back off on error


  class NullNPCBehavior:
-     """No-op NPC behavior for Level 0 (shell scripts handle everything)."""

-     async def decide(
-         self,
-         persona: NPCPersona,
-         stimulus: Stimulus,
-     ) -> NPCAction:
-         """Always ignore -- Level 0 NPCs don't process stimuli."""
          return NPCAction(action="ignore")


  class RuleBasedNPCBehavior:
-     """Heuristic NPC decisions based on susceptibility scores. No LLM calls."""

-     async def decide(
-         self,
-         persona: NPCPersona,
-         stimulus: Stimulus,
-     ) -> NPCAction:
-         """Decide based on persona susceptibility and stimulus plausibility."""
-         # Get the susceptibility score for this stimulus type
          susceptibility = persona.susceptibility.get(
-             f"{stimulus.type}", persona.susceptibility.get("phishing_email", 0.5)
          )
          score = stimulus.plausibility * susceptibility
-
          if persona.security_awareness > 0.7 and score < 0.8:
-             return NPCAction(
-                 action="report_to_IT",
-                 side_effects=["reported suspicious email to IT"],
-             )
          elif score > 0.6:
-             return NPCAction(
-                 action="click_link",
-                 side_effects=["clicked link in email"],
-             )
          elif score > 0.3:
              return NPCAction(action="ignore")
          else:
-             return NPCAction(
-                 action="report_to_IT",
-                 side_effects=["forwarded suspicious email to security team"],
-             )

  """LLM-driven NPC agent (Level 1).

+ Each NPC autonomously lives their workday -- browsing pages, emailing
+ colleagues, querying records, accessing shares. Available actions are
+ derived from the SnapshotSpec so they adapt to whatever environment the
+ Builder LLM generated. NPCs also react to incoming stimuli (phishing,
+ social engineering) based on their security_awareness profile.
  """

  from __future__ import annotations
@@ -11,40 +13,61 @@ import asyncio
  import json
  import logging
  import os
+ import random
+ import time
  from typing import Any

  import litellm

+ from open_range.protocols import ContainerSet, NPCAction, NPCPersona, SnapshotSpec, Stimulus

  logger = logging.getLogger(__name__)

+ # ---------------------------------------------------------------------------
+ # Prompts
+ # ---------------------------------------------------------------------------

+ NPC_ROUTINE_PROMPT = """\
+ You are simulating a corporate employee going about their normal workday. \
+ You will receive your persona, the available resources in this environment, \
+ and your recent activity.

+ Pick the NEXT natural work action. Stay in character for your role.
+
+ Return valid JSON:
  {
+ "action": "<browse|send_email|lookup|access_share|login|query_db|idle>",
+ "target": "<a specific target from the available resources>",
+ "detail": "<1-sentence description of what you're doing and why>",
+ "email_body": "<if send_email, the message; otherwise empty>"
  }

  Guidelines:
+ - Pick targets from the AVAILABLE RESOURCES provided (real pages, shares, users)
+ - Vary your actions -- don't repeat the same thing twice in a row
+ - Be mundane and realistic: checking records, status updates, reviewing docs
+ - Your role determines what you'd naturally do (clinical staff look up records, \
+ admin checks billing, IT reviews systems, executives read reports)
+ """
+
+ NPC_REACT_PROMPT = """\
+ You are simulating an employee who received an unexpected message. \
+ Based on your security_awareness and susceptibility, decide how to respond. \
+ Stay in character. Return valid JSON:
+
+ {
+ "action": "<click_link|open_attachment|reply|share_credentials|ignore|report_to_IT|forward>",
+ "response_content": "<your reply text if replying, otherwise empty>",
+ "side_effects": ["<what happens as a result>"]
+ }
+
+ - security_awareness > 0.7: verify sender, check URLs, report suspicious messages
+ - security_awareness < 0.3: trusting, clicks links, may share credentials if asked
  """


  class LLMNPCAgent:
+     """Async NPC agent that autonomously lives its workday via LLM."""

      def __init__(
          self,
@@ -52,137 +75,248 @@ class LLMNPCAgent:
 
          temperature: float = 0.3,
      ) -> None:
          self.model = model or os.environ.get(
+             "OPENRANGE_NPC_MODEL", "azure/gpt-5.2-codex"
          )
+         if "codex" in self.model.lower():
+             self.temperature: float | None = None
+         else:
+             self.temperature = temperature
+         self._actions: list[dict[str, Any]] = []

+     def get_actions(self) -> list[dict[str, Any]]:
+         """Return all recorded NPC actions for SIEM consumption."""
+         return list(self._actions)

+     # ------------------------------------------------------------------
+     # Reactive: respond to external stimulus
+     # ------------------------------------------------------------------
+
+     async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
+         """Decide how to respond to a stimulus (NPCBehavior protocol)."""
          try:
+             user_payload = (
+                 "Respond as this NPC employee in valid JSON.\n\n"
+                 + json.dumps({
+                     "persona": persona.model_dump(),
+                     "stimulus": stimulus.model_dump(),
+                 })
              )
+             kwargs: dict[str, Any] = {
+                 "model": self.model,
+                 "messages": [
+                     {"role": "system", "content": NPC_REACT_PROMPT},
+                     {"role": "user", "content": user_payload},
+                 ],
+                 "response_format": {"type": "json_object"},
+             }
+             if self.temperature is not None:
+                 kwargs["temperature"] = self.temperature

+             response = await litellm.acompletion(**kwargs)
              raw = json.loads(response.choices[0].message.content)
              return NPCAction(
                  action=raw.get("action", "ignore"),
                  response_content=raw.get("response_content", ""),
                  side_effects=raw.get("side_effects", []),
              )
          except Exception as exc:
+             logger.warning("NPC %s react failed: %s", persona.name, exc)
              return NPCAction(action="ignore")

+     # ------------------------------------------------------------------
+     # Proactive: what to do next at work (derived from snapshot)
+     # ------------------------------------------------------------------
+
+     async def next_routine_action(
+         self, persona: NPCPersona, env_context: dict[str, Any],
+     ) -> dict[str, str]:
+         """Ask LLM what this NPC would naturally do next.
+
+         env_context contains available_pages, available_shares, etc.
+         derived from the SnapshotSpec so the LLM picks real targets.
+         """
+         recent = [
+             f"{a.get('action','?')}: {a.get('detail','')}"
+             for a in self._actions[-5:]
+         ]
+         try:
+             user_payload = (
+                 "Pick this employee's next work action in valid JSON.\n\n"
+                 + json.dumps({
+                     "persona": {
+                         "name": persona.name,
+                         "role": persona.role,
+                         "department": persona.department,
+                     },
+                     "available_resources": env_context,
+                     "recent_actions": recent,
+                 })
+             )
+             kwargs: dict[str, Any] = {
+                 "model": self.model,
+                 "messages": [
+                     {"role": "system", "content": NPC_ROUTINE_PROMPT},
+                     {"role": "user", "content": user_payload},
+                 ],
+                 "response_format": {"type": "json_object"},
+             }
+             if self.temperature is not None:
+                 kwargs["temperature"] = self.temperature
+
+             response = await litellm.acompletion(**kwargs)
+             return json.loads(response.choices[0].message.content)
+         except Exception as exc:
+             logger.debug("NPC %s routine LLM failed: %s", persona.name, exc)
+             return _fallback_action(persona, env_context)
+
+     # ------------------------------------------------------------------
+     # Main loop
+     # ------------------------------------------------------------------
+
      async def run_loop(
          self,
          persona: NPCPersona,
          containers: ContainerSet,
+         snapshot: SnapshotSpec,
      ) -> None:
+         """Run the NPC's autonomous workday.

+         Each cycle:
+         1. Pick and execute a routine work action
+         2. Check mailbox for incoming stimuli (phishing)
+         3. React to any stimuli found
          """
+         from open_range.builder.npc.actions import NPCActionExecutor
+
+         executor = NPCActionExecutor(containers, snapshot)
+
+         # Build environment context once from snapshot
+         env_context = {
+             "pages": executor._pages,
+             "shares": executor._shares,
+             "db_tables": executor._db_tables,
+             "colleagues": executor._users,
+         }
+
+         email_acct = persona.accounts.get("email", "")
+         mail_user = (
+             email_acct.split("@")[0]
+             if "@" in email_acct
+             else persona.name.lower().split()[0]
+         )
+
+         base_interval = persona.routine.get("action_interval_min", 2)
+         interval_s = base_interval * 60

          logger.info(
+             "NPC %s (%s) starting workday (every %dm, %d pages, %d shares)",
+             persona.name, persona.role, base_interval,
+             len(env_context["pages"]), len(env_context["shares"]),
          )

          while True:
              try:
+                 # --- Phase 1: Routine work action ---
+                 routine = await self.next_routine_action(persona, env_context)
+                 log_entry = await executor.execute_routine(
+                     persona,
+                     routine.get("action", "idle"),
+                     routine.get("target", ""),
+                     routine.get("detail", ""),
+                     routine.get("email_body", ""),
+                 )
+                 self._actions.append(log_entry)
+                 logger.debug("NPC %s: %s", persona.name, log_entry.get("detail", ""))

+                 # --- Phase 2: Check mailbox ---
+                 try:
+                     mail_output = await containers.exec(
+                         "mail",
+                         f"find /var/mail/{mail_user} "
+                         f"-newer /tmp/.npc_check_{mail_user} "
+                         f"-type f 2>/dev/null | head -1",
+                     )
+                     await containers.exec("mail", f"touch /tmp/.npc_check_{mail_user}")

+                     if mail_output and mail_output.strip():
+                         email_file = mail_output.strip().split("\n")[0]
+                         content = await containers.exec(
+                             "mail", f"head -50 '{email_file}' 2>/dev/null || true",
+                         )
+                         if content and content.strip():
+                             stimulus = Stimulus(
+                                 type="email", sender="unknown",
+                                 subject="Incoming message",
+                                 content=content[:500],
+                             )
+                             react = await self.decide(persona, stimulus)
+                             react_log = await executor.execute(persona, react)
+                             react_log["stimulus_type"] = "email"
+                             react_log["reactive"] = True
+                             self._actions.append(react_log)
+                 except Exception as mail_exc:
+                     logger.debug("NPC %s mail check: %s", persona.name, mail_exc)
+
+                 # --- Sleep with jitter ---
+                 await asyncio.sleep(interval_s * random.uniform(0.7, 1.3))

              except asyncio.CancelledError:
+                 logger.info("NPC %s workday ended", persona.name)
                  break
              except Exception as exc:
                  logger.warning("NPC %s loop error: %s", persona.name, exc)
+                 await asyncio.sleep(30)
+
+
+ # ---------------------------------------------------------------------------
+ # Fallback routine (no LLM, picks from snapshot-derived resources)
+ # ---------------------------------------------------------------------------
+
+
+ def _fallback_action(persona: NPCPersona, env: dict[str, Any]) -> dict[str, str]:
+     """Pick a routine action without LLM, using available resources."""
+     pages = env.get("pages", ["/"])
+     shares = env.get("shares", ["general"])
+     colleagues = env.get("colleagues", [])
+
+     actions = [
+         {"action": "browse", "target": random.choice(pages) if pages else "/", "detail": "Checking portal"},
+         {"action": "browse", "target": random.choice(pages) if pages else "/", "detail": "Reviewing page"},
+         {"action": "idle", "target": "", "detail": "Reading documents at desk"},
+     ]
+     if shares:
+         actions.append({"action": "access_share", "target": random.choice(shares), "detail": "Checking files"})
+     if colleagues:
+         actions.append({"action": "send_email", "target": random.choice(colleagues), "detail": "Status update", "email_body": "Quick check-in on today's items."})
+
+     return random.choice(actions)
+
+
+ # ---------------------------------------------------------------------------
+ # Simpler behavior classes (Level 0, no LLM)
+ # ---------------------------------------------------------------------------


  class NullNPCBehavior:
+     """No-op NPC behavior for Level 0."""

+     async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
          return NPCAction(action="ignore")


  class RuleBasedNPCBehavior:
+     """Heuristic NPC decisions based on susceptibility scores."""

+     async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
          susceptibility = persona.susceptibility.get(
+             stimulus.type, persona.susceptibility.get("phishing_email", 0.5)
          )
          score = stimulus.plausibility * susceptibility
          if persona.security_awareness > 0.7 and score < 0.8:
+             return NPCAction(action="report_to_IT", side_effects=["reported suspicious email to IT"])
          elif score > 0.6:
+             return NPCAction(action="click_link", side_effects=["clicked link in email"])
          elif score > 0.3:
              return NPCAction(action="ignore")
          else:
+             return NPCAction(action="report_to_IT", side_effects=["forwarded to security team"])
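
The branch order in `RuleBasedNPCBehavior.decide` is easier to check in isolation. The following is a minimal sketch, assuming a hypothetical `Persona` dataclass that stands in for `NPCPersona` (only the two fields the heuristic reads) and a hypothetical helper `rule_based_action` that returns just the action name; neither exists in the repository.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical stand-in for NPCPersona; only the fields the heuristic reads."""
    security_awareness: float
    susceptibility: dict[str, float] = field(default_factory=dict)


def rule_based_action(persona: Persona, stimulus_type: str, plausibility: float) -> str:
    """Follow the same branch order as RuleBasedNPCBehavior.decide."""
    # Fall back to the "phishing_email" score, then to 0.5, as in the diff.
    susceptibility = persona.susceptibility.get(
        stimulus_type, persona.susceptibility.get("phishing_email", 0.5)
    )
    score = plausibility * susceptibility
    if persona.security_awareness > 0.7 and score < 0.8:
        return "report_to_IT"
    elif score > 0.6:
        return "click_link"
    elif score > 0.3:
        return "ignore"
    return "report_to_IT"


careful = Persona(security_awareness=0.9, susceptibility={"email": 0.9})
trusting = Persona(security_awareness=0.2, susceptibility={"email": 0.9})

print(rule_based_action(careful, "email", 0.8))   # score ~0.72, aware persona reports
print(rule_based_action(trusting, "email", 0.8))  # score ~0.72, clicks the link
print(rule_based_action(trusting, "email", 0.5))  # score ~0.45, ignores
print(rule_based_action(trusting, "email", 0.1))  # score ~0.09, falls through to report
```

One consequence visible in the sketch: a low-awareness persona still lands on `report_to_IT` when the score is 0.3 or below, because the final `else` branch reports rather than ignores.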
 
 
 
src/open_range/cli.py CHANGED
@@ -105,6 +105,23 @@ def _write_snapshot(spec: "SnapshotSpec", output_dir: Path) -> Path:
      return dest


  # ---------------------------------------------------------------------------
  # CLI group
  # ---------------------------------------------------------------------------
@@ -185,6 +202,127 @@ def build(
      click.echo(f" Elapsed: {elapsed:.1f}s")


  # ---------------------------------------------------------------------------
  # render
  # ---------------------------------------------------------------------------

      return dest


+ def _parse_roles(raw: str) -> tuple[str, ...]:
+     """Parse a comma-separated role list."""
+     roles = tuple(dict.fromkeys(part.strip().lower() for part in raw.split(",") if part.strip()))
+     valid = {"red", "blue"}
+     invalid = [role for role in roles if role not in valid]
+     if invalid:
+         click.echo(
+             f"Error: invalid roles: {', '.join(invalid)}. Expected comma-separated values from: red, blue.",
+             err=True,
+         )
+         sys.exit(1)
+     if not roles:
+         click.echo("Error: at least one role must be selected.", err=True)
+         sys.exit(1)
+     return roles
+
+
  # ---------------------------------------------------------------------------
  # CLI group
  # ---------------------------------------------------------------------------
@@ -185,6 +202,127 @@ def build(
      click.echo(f" Elapsed: {elapsed:.1f}s")


+ # ---------------------------------------------------------------------------
+ # synthetic-data
+ # ---------------------------------------------------------------------------
+
+
+ @cli.command("synthetic-data")
+ @click.option("-o", "--output", required=True, type=click.Path(), help="Output JSONL path for synthetic trajectories.")
+ @click.option("-m", "--manifest", default=None, type=click.Path(exists=True), help="Path to manifest YAML.")
+ @click.option("-s", "--snapshot", default=None, type=click.Path(exists=True), help="Path to snapshot JSON.")
+ @click.option("--num-traces", default=10, type=click.IntRange(1), help="Number of synthetic episodes to generate.")
+ @click.option("--seed", default=None, type=int, help="Base random seed for reproducibility.")
+ @click.option("--tier", default=1, type=click.IntRange(1, 5), help="Tier level 1-5 when building from a manifest.")
+ @click.option("--max-steps", default=12, type=click.IntRange(1), help="Maximum red/blue turns per episode.")
+ @click.option("--roles", default="red", help="Comma-separated teacher/export roles: red, blue.")
+ @click.option("--reward-threshold", default=0.0, type=float, help="Minimum total role reward required for export.")
+ @click.option("--teacher-model", default=None, help="LiteLLM teacher model. If omitted, selected roles use scripted agents.")
+ @click.option("--red-model", default=None, help="Override model for Red teacher.")
+ @click.option("--blue-model", default=None, help="Override model for Blue teacher.")
+ @click.option("--temperature", default=0.2, type=float, help="Teacher sampling temperature.")
+ @click.option("--max-tokens", default=512, type=int, help="Maximum completion tokens per teacher action.")
+ @click.option("--template-only/--llm-builder", default=True, help="When using --manifest, build snapshots deterministically instead of via LLM.")
+ @click.option("--builder-model", default=None, help="LLM builder model when using --llm-builder.")
+ @click.option("--randomize-flags/--static-flags", default=True, help="Randomize flag values per synthetic episode.")
+ def synthetic_data(
+     output: str,
+     manifest: str | None,
+     snapshot: str | None,
+     num_traces: int,
+     seed: int | None,
+     tier: int,
+     max_steps: int,
+     roles: str,
+     reward_threshold: float,
+     teacher_model: str | None,
+     red_model: str | None,
+     blue_model: str | None,
+     temperature: float,
+     max_tokens: int,
+     template_only: bool,
+     builder_model: str | None,
+     randomize_flags: bool,
+ ) -> None:
+     """Generate snapshot-grounded synthetic SFT trajectories."""
+     from open_range.training.synthetic import (
+         SyntheticTraceGenerator,
+         build_teacher_agents,
+     )
+
+     if bool(manifest) == bool(snapshot):
+         click.echo("Error: provide exactly one of --manifest or --snapshot.", err=True)
+         sys.exit(1)
+
+     selected_roles = _parse_roles(roles)
+     resolved_teacher_model = (
+         teacher_model
+         or os.environ.get("OPENRANGE_SYNTH_MODEL")
+     )
+     red_agent, blue_agent = build_teacher_agents(
+         teacher_model=resolved_teacher_model,
+         roles=selected_roles,
+         red_model=red_model,
+         blue_model=blue_model,
+         temperature=temperature,
+         max_tokens=max_tokens,
+     )
+
+     if snapshot:
+         source_label = f"snapshot={snapshot}"
+         generator = SyntheticTraceGenerator(
+             snapshot=_load_snapshot(snapshot),
+             red_agent=red_agent,
+             blue_agent=blue_agent,
+             tier=tier,
+             max_steps=max_steps,
+             randomize_flags=randomize_flags,
+         )
+     else:
+         source_label = f"manifest={manifest}"
+         generator = SyntheticTraceGenerator.from_manifest(
+             _load_manifest(str(manifest)),
+             red_agent=red_agent,
+             blue_agent=blue_agent,
+             template_only=template_only,
+             builder_model=builder_model,
+             tier=tier,
+             max_steps=max_steps,
+             randomize_flags=randomize_flags,
+         )
+
+     teacher_roles = []
+     if selected_roles:
+         if red_model or resolved_teacher_model:
+             if "red" in selected_roles:
+                 teacher_roles.append("red")
+         if blue_model or resolved_teacher_model:
+             if "blue" in selected_roles:
+                 teacher_roles.append("blue")
+
+     click.echo(f"Generating synthetic traces from {source_label} ...")
+     click.echo(f" Roles: {', '.join(selected_roles)}")
+     click.echo(
+         " Teacher roles: "
+         + (", ".join(teacher_roles) if teacher_roles else "none (scripted fallbacks)")
+     )
+     try:
+         logger, count = generator.export_jsonl(
+             output,
+             num_traces=num_traces,
+             seed=seed,
+             reward_threshold=reward_threshold,
+             roles=selected_roles,
+         )
+     except Exception as exc:
+         click.echo(f"Error: synthetic data generation failed: {exc}", err=True)
+         sys.exit(1)
+
+     click.echo(f"Wrote {count} JSONL records to {output}")
+     click.echo(f" Episodes: {len(logger.episodes)}")
+     click.echo(f" Randomized flags: {'yes' if randomize_flags else 'no'}")
+
+
  # ---------------------------------------------------------------------------
  # render
  # ---------------------------------------------------------------------------
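
The `--roles` parsing in `_parse_roles` normalizes case and whitespace and drops duplicates while keeping first-seen order. A minimal sketch of the same logic, assuming a hypothetical `parse_roles` that raises `ValueError` instead of calling `click.echo` and `sys.exit`, so it can be run without the CLI:

```python
def parse_roles(raw: str, valid: frozenset[str] = frozenset({"red", "blue"})) -> tuple[str, ...]:
    """Hypothetical, exception-raising variant of the CLI's _parse_roles."""
    # dict.fromkeys drops duplicates but preserves insertion order.
    roles = tuple(
        dict.fromkeys(part.strip().lower() for part in raw.split(",") if part.strip())
    )
    invalid = [role for role in roles if role not in valid]
    if invalid:
        raise ValueError(f"invalid roles: {', '.join(invalid)}")
    if not roles:
        raise ValueError("at least one role must be selected")
    return roles


print(parse_roles(" Red, blue ,red,"))  # ('red', 'blue')
```

Using `dict.fromkeys` rather than `set` keeps the order the user typed, so `--roles blue,red` and `--roles red,blue` stay distinguishable downstream.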
src/open_range/training/__init__.py CHANGED
@@ -0,0 +1,15 @@
+ """Training utilities for OpenRange."""
+
+ from open_range.training.synthetic import (
+     SyntheticRangeEnvironment,
+     SyntheticTraceGenerator,
+     build_teacher_agents,
+     randomize_snapshot_flags,
+ )
+
+ __all__ = [
+     "SyntheticRangeEnvironment",
+     "SyntheticTraceGenerator",
+     "build_teacher_agents",
+     "randomize_snapshot_flags",
+ ]
src/open_range/training/synthetic.py ADDED
@@ -0,0 +1,717 @@
+ """Synthetic trajectory generation for OpenRange.
+
+ This module provides a fast, snapshot-backed simulator for collecting
+ teacher-model trajectories without booting Docker containers. It is meant
+ for SFT warm-start data generation, not reward-faithful evaluation.
+ """
+
+ from __future__ import annotations
+
+ import asyncio
+ import logging
+ import random
+ import re
+ import shlex
+ from pathlib import Path
+ from typing import Any
+
+ from open_range.agents.llm_agent import LLMRangeAgent
+ from open_range.agents.protocol import RangeAgent
+ from open_range.agents.scripted_agent import ScriptedBlueAgent, ScriptedRedAgent
+ from open_range.builder.builder import LLMSnapshotBuilder, TemplateOnlyBuilder
+ from open_range.protocols import BuildContext, SnapshotBuilder, SnapshotSpec, Vulnerability
+ from open_range.server.environment import RangeEnvironment
+ from open_range.server.models import RangeAction, RangeObservation
+ from open_range.training.trajectory import TrajectoryLogger
+
+ logger = logging.getLogger(__name__)
+
+ _TOKEN_RE = re.compile(r"[a-z0-9_./:-]+")
+
+
+ def _run_async(coro: Any) -> Any:
+     """Run an async coroutine from synchronous code."""
+     try:
+         loop = asyncio.get_running_loop()
+     except RuntimeError:
+         loop = None
+
+     if loop and loop.is_running():
+         import concurrent.futures
+
+         with concurrent.futures.ThreadPoolExecutor() as pool:
+             return pool.submit(asyncio.run, coro).result()
+     return asyncio.run(coro)
+
+
+ def _iter_hosts(snapshot: SnapshotSpec) -> list[str]:
+     raw_hosts = snapshot.topology.get("hosts", [])
+     hosts: list[str] = []
+     for host in raw_hosts:
+         if isinstance(host, dict):
+             name = str(host.get("name", "")).strip()
+         else:
+             name = str(host).strip()
+         if name:
+             hosts.append(name)
+     return hosts
+
+
+ def _deep_replace(value: Any, replacements: dict[str, str]) -> Any:
+     if isinstance(value, str):
+         result = value
+         for old, new in replacements.items():
+             result = result.replace(old, new)
+         return result
+     if isinstance(value, list):
+         return [_deep_replace(item, replacements) for item in value]
+     if isinstance(value, dict):
+         return {key: _deep_replace(item, replacements) for key, item in value.items()}
+     return value
+
+
+ def randomize_snapshot_flags(snapshot: SnapshotSpec, seed: int | None = None) -> SnapshotSpec:
+     """Clone *snapshot* with unique flag values substituted throughout."""
+     if not snapshot.flags:
+         return snapshot.model_copy(deep=True)
+
+     rng = random.Random(seed)
+     replacements: dict[str, str] = {}
+     for flag in snapshot.flags:
+         inner = "".join(rng.choice("abcdef0123456789") for _ in range(16))
+         replacements[flag.value] = f"FLAG{{{inner}}}"
+
+     payload = snapshot.model_dump(mode="python")
+     payload = _deep_replace(payload, replacements)
+     return SnapshotSpec.model_validate(payload)
+
+
+ def _observation_text(observation: str | RangeObservation) -> str:
+     """Convert an observation into training text without reward leakage."""
+     if isinstance(observation, str):
+         return observation
+
+     parts: list[str] = []
+     if observation.stdout:
+         parts.append(observation.stdout)
+     if observation.stderr:
+         parts.append(f"STDERR:\n{observation.stderr}")
+     if observation.alerts:
+         parts.append("ALERTS:\n" + "\n".join(f"- {alert}" for alert in observation.alerts))
+     if observation.flags_captured:
+         parts.append(
+             "FLAGS CAPTURED:\n"
+             + "\n".join(f"- {flag}" for flag in observation.flags_captured)
+         )
+     return "\n\n".join(parts)
+
+
+ class SyntheticRangeEnvironment(RangeEnvironment):
+     """Fast, deterministic simulator built from a ``SnapshotSpec``."""
+
+     def __init__(
+         self,
+         *,
+         randomize_flags: bool = True,
+         max_steps: int = 30,
+     ) -> None:
+         super().__init__(docker_available=False, max_steps=max_steps)
+         self._randomize_flags = randomize_flags
+         self._synthetic_seed: int | None = None
+         self._ephemeral_files: dict[str, str] = {}
+
+     def reset(
+         self,
+         seed: int | None = None,
+         episode_id: str | None = None,
+         **kwargs: Any,
+     ) -> RangeObservation:
+         self._synthetic_seed = seed
+         self._ephemeral_files = {}
+         return super().reset(seed=seed, episode_id=episode_id, **kwargs)
+
+     def _select_snapshot(self, **kwargs: Any) -> SnapshotSpec:
+         snapshot = super()._select_snapshot(**kwargs)
+         if not self._randomize_flags:
+             return snapshot.model_copy(deep=True)
+         return randomize_snapshot_flags(snapshot, seed=self._synthetic_seed)
+
+     def _exec_in_container(
+         self,
+         container_name: str,
+         command: str,
+         timeout_s: float | None = None,
+     ) -> tuple[str, str]:
+         del container_name, timeout_s # unused in the synthetic executor
+         if self._snapshot is None:
+             return "", "No snapshot loaded"
+         if self._state.mode == "blue":
+             return self._simulate_blue_command(command)
+         return self._simulate_red_command(command)
+
+     def _simulate_red_command(self, command: str) -> tuple[str, str]:
+         normalized = command.strip().lower()
+         if not normalized:
+             return "", "Empty command"
+
+         exact_step = self._match_golden_step(command)
+         if exact_step is not None:
+             return self._render_golden_output(command, exact_step), ""
+
+         if normalized == "whoami":
+             return "kali\n", ""
+         if normalized == "pwd":
+             return "/root\n", ""
+         if normalized.startswith("ls"):
+             return self._render_ls(command), ""
+         if normalized.startswith("cat "):
+             return self._render_cat(command)
+         if "nmap" in normalized:
+             return self._render_nmap(command), ""
+         if "curl" in normalized:
+             return self._render_curl(command), ""
+         if "mysql" in normalized:
+             return self._render_mysql(command), ""
+         if "smbclient" in normalized:
+             return self._render_smb(command), ""
+         if "ldapsearch" in normalized:
+             return self._render_ldap(command), ""
+         if re.search(r"\bssh\b|\bsshpass\b", normalized):
+             return self._render_ssh(command), ""
+         if "grep" in normalized and "flag" in normalized:
+             return self._render_flag_search(), ""
+
+         return "Command completed successfully.\n", ""
+
+     def _simulate_blue_command(self, command: str) -> tuple[str, str]:
+         normalized = command.strip().lower()
+         if not normalized:
+             return "", "Empty command"
+
+         if any(token in normalized for token in ("grep", "tail", "cat", "awk", "sed")):
+             return self._render_siem_query(command), ""
+         if "check_services" in normalized:
+             return self._render_service_status(), ""
+         if "iptables" in normalized or "ufw" in normalized or "firewall" in normalized:
+             return "Firewall policy updated.\n", ""
+         if normalized.startswith("patch "):
+             return "Patch applied in synthetic environment.\n", ""
+         if "restart" in normalized:
+             return "Service restarted.\n", ""
+         return "Investigation command completed.\n", ""
+
203
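+ def _match_golden_step(self, command: str) -> Any | None:
+ """Return the exact golden-path step for ``command``, else the best same-binary fuzzy match."""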
204
+ if self._snapshot is None:
205
+ return None
206
+
207
+ normalized = self._normalize_command(command)
208
+ best_step = None
209
+ best_score = 0.0
210
+ cmd_name = self._command_name(command)
211
+
212
+ for step in self._snapshot.golden_path:
213
+ step_normalized = self._normalize_command(step.command)
214
+ if normalized == step_normalized:
215
+ return step
216
+ if cmd_name != self._command_name(step.command):
217
+ continue
218
+ score = self._token_overlap(normalized, step_normalized)
219
+ if score > best_score:
220
+ best_score = score
221
+ best_step = step
222
+
223
+ if best_score >= 0.66:
224
+ return best_step
225
+ return None
226
+
227
+ @staticmethod
228
+ def _command_name(command: str) -> str:
229
+ stripped = command.strip()
230
+ if not stripped:
231
+ return ""
232
+ return stripped.split()[0].rsplit("/", 1)[-1].lower()
233
+
234
+ @staticmethod
235
+ def _normalize_command(command: str) -> str:
236
+ lowered = command.lower()
237
+ return " ".join(_TOKEN_RE.findall(lowered))
238
+
239
+ @staticmethod
240
+ def _token_overlap(left: str, right: str) -> float:
241
+ left_tokens = set(left.split())
242
+ right_tokens = set(right.split())
243
+ if not left_tokens or not right_tokens:
244
+ return 0.0
245
+ intersection = left_tokens & right_tokens
246
+ union = left_tokens | right_tokens
247
+ return len(intersection) / len(union)
248
+
249
+ def _render_golden_output(self, command: str, step: Any) -> str:
250
+ expected = step.expect_in_stdout or "Command completed."
251
+ lowered = command.lower()
252
+ if "nmap" in lowered:
253
+ return f"Starting Nmap 7.94\n{expected}\nNmap done.\n"
254
+ if "curl" in lowered and "search" in lowered and ("union" in lowered or "flag" in lowered):
255
+ return f"Search results:\n{expected}\n"
+ # curl, mysql, and every other command echo the expected output unchanged.
+ return f"{expected}\n"
261
+
262
+ def _render_nmap(self, command: str) -> str:
263
+ lines = ["Starting Nmap 7.94"]
264
+ lowered = command.lower()
265
+ if "10.0.1" in lowered or "web" in lowered:
266
+ lines.extend(
267
+ [
268
+ "80/tcp open http nginx 1.24",
269
+ "25/tcp open smtp postfix",
270
+ ]
271
+ )
272
+ if "10.0.2" in lowered or "db" in lowered:
273
+ lines.extend(
274
+ [
275
+ "3306/tcp open mysql MySQL 8.0",
276
+ "445/tcp open smb samba 4.17",
277
+ ]
278
+ )
279
+ if "10.0.3" in lowered or "ldap" in lowered or "siem" in lowered:
280
+ lines.extend(
281
+ [
282
+ "389/tcp open ldap OpenLDAP 2.6",
283
+ "514/tcp open syslog rsyslogd",
284
+ ]
285
+ )
286
+ if len(lines) == 1:
287
+ lines.append("Host seems down.")
288
+ lines.append("Nmap done.")
289
+ return "\n".join(lines) + "\n"
290
+
291
+ def _render_curl(self, command: str) -> str:
292
+ lowered = command.lower()
293
+ flag = self._flag_value()
294
+ if "http://web/" in lowered and "login" not in lowered and "search" not in lowered:
295
+ hints = ["Welcome to the customer portal."]
296
+ if self._has_vuln_type("sqli"):
297
+ hints.append("Try /search?q=test")
298
+ if self._has_vuln_type("idor"):
299
+ hints.append("API available at /api/users/1/profile")
300
+ if self._has_vuln_type("path_traversal"):
301
+ hints.append("Downloads available at /download?file=report.pdf")
302
+ return "\n".join(hints) + "\n"
303
+ if "/login" in lowered:
304
+ return "Login\n<form action=\"/login\">\n"
305
+ if "search" in lowered:
306
+ if any(token in lowered for token in ("union", "select", "or+1=1", "or 1=1", "%27", "'")):
307
+ return f"Search results:\n{flag}\n"
308
+ return "products\nmonitor\nlaptop\nrouter\n"
309
+ if "/api/users/" in lowered and self._has_vuln_type("idor"):
310
+ if "/1/" in lowered:
311
+ return '{"id":1,"username":"admin","role":"admin"}\n'
312
+ return '{"id":2,"username":"svc_backup","password":"backup123"}\n'
313
+ if "download?file=" in lowered and self._has_vuln_type("path_traversal"):
314
+ if "passwd" in lowered:
315
+ return "root:x:0:0:root:/root:/bin/bash\nwww-data:x:33:33:www-data:/var/www:/usr/sbin/nologin\n"
316
+ if "flag" in lowered or "/var/flags" in lowered:
317
+ return f"{flag}\n"
318
+ return "Quarterly report.pdf\n"
319
+ if "http://mail" in lowered:
320
+ return "220 mail ESMTP Postfix\n"
321
+ return "HTTP/1.1 200 OK\n"
322
+
323
+ def _render_mysql(self, command: str) -> str:
324
+ lowered = command.lower()
325
+ flag = self._flag_value()
326
+ if "show databases" in lowered:
327
+ return "information_schema\nreferral_db\nflags\n"
328
+ if "select" in lowered and "flag" in lowered:
329
+ return f"{flag}\n"
330
+ if "show tables" in lowered:
331
+ return "users\nproducts\nsecrets\n"
332
+ return "Query OK\n"
333
+
334
+ def _render_smb(self, command: str) -> str:
335
+ lowered = command.lower()
336
+ if " -l " in f" {lowered} ":
337
+ return "Sharename Type Comment\ngeneral Disk General share\ncontracts Disk Contracts\n"
338
+ if "get " in lowered:
339
+ creds = "svc_backup:backup123\n"
340
+ target = self._extract_copy_destination(command) or "/tmp/creds.txt"
341
+ self._ephemeral_files[target] = creds
342
+ return f"getting file to {target}\n"
343
+ return " .\n ..\n admin_creds.txt\n"
344
+
345
+ def _render_ldap(self, command: str) -> str:
346
+ del command
347
+ return (
348
+ "dn: uid=alice,dc=corp,dc=local\nuid: alice\n\n"
349
+ "dn: uid=bob,dc=corp,dc=local\nuid: bob\n"
350
+ )
351
+
352
+ def _render_ssh(self, command: str) -> str:
353
+ lowered = command.lower()
354
+ flag = self._flag_value()
355
+ if "cat /var/flags" in lowered or "grep flag" in lowered:
356
+ return f"{flag}\n"
357
+ return "uid=1001(svc_webapp) gid=1001(svc_webapp)\n"
358
+
359
+ def _render_flag_search(self) -> str:
360
+ return f"{self._flag_value()}\n"
361
+
362
+ def _render_cat(self, command: str) -> tuple[str, str]:
363
+ path = self._extract_first_path(command)
364
+ if not path:
365
+ return "", "cat: missing operand"
366
+ if path in self._ephemeral_files:
367
+ return self._ephemeral_files[path], ""
368
+ for flag in self._snapshot.flags if self._snapshot else []:
369
+ if path == flag.path or path.endswith(Path(flag.path).name):
370
+ return f"{flag.value}\n", ""
371
+ if self._snapshot and path in self._snapshot.files:
372
+ return self._snapshot.files[path], ""
373
+ if path.endswith("config.php"):
374
+ return (
375
+ "<?php\n"
376
+ "$DB_HOST='db';\n$DB_USER='app_user';\n$DB_PASS='AppUs3r!2024';\n"
377
+ "?>\n",
378
+ "",
379
+ )
380
+ if path.endswith("/etc/passwd"):
381
+ return "root:x:0:0:root:/root:/bin/bash\nwww-data:x:33:33:www-data:/var/www:/usr/sbin/nologin\n", ""
382
+ return "", f"cat: {path}: No such file or directory"
383
+
384
+ def _render_ls(self, command: str) -> str:
385
+ path = self._extract_first_path(command) or "."
386
+ if path in (".", "/root"):
387
+ entries = ["notes.txt"]
388
+ entries.extend(sorted(Path(p).name for p in self._ephemeral_files))
389
+ return "\n".join(sorted(set(entries))) + "\n"
390
+ if path == "/var/log/siem":
391
+ return "consolidated\nalerts.log\nweb_access.log\n"
392
+ if self._snapshot and self._snapshot.files:
393
+ matching = [
394
+ Path(file_path).name
395
+ for file_path in self._snapshot.files
396
+ if str(Path(file_path).parent) == path
397
+ ]
398
+ if matching:
399
+ return "\n".join(sorted(set(matching))) + "\n"
400
+ return ".\n..\n"
401
+
402
+ def _render_siem_query(self, command: str) -> str:
403
+ lines = self._siem_lines()
404
+ if not lines:
405
+ return "No alerts matched.\n"
406
+
407
+ lowered = command.lower()
408
+ filtered = lines
409
+ if "union" in lowered or "select" in lowered:
410
+ filtered = [line for line in lines if "SQLi" in line or "UNION" in line]
411
+ elif "nmap" in lowered or "scan" in lowered:
412
+ filtered = [line for line in lines if "scan" in line.lower()]
413
+ elif "ldap" in lowered:
414
+ filtered = [line for line in lines if "LDAP" in line]
415
+
416
+ if not filtered:
417
+ filtered = lines
418
+ return "\n".join(filtered[-20:]) + "\n"
419
+
420
+ def _render_service_status(self) -> str:
421
+ hosts = _iter_hosts(self._snapshot) if self._snapshot else []
422
+ if not hosts:
423
+ return "No services registered.\n"
424
+ return "\n".join(f"{host}: healthy" for host in hosts if host != "attacker") + "\n"
425
+
426
+ def _siem_lines(self) -> list[str]:
427
+ lines: list[str] = []
428
+ for record in self._red_history:
429
+ command = str(record.get("command", ""))
430
+ lowered = command.lower()
431
+ if "nmap" in lowered:
432
+ lines.append(
433
+ f"[ALERT] Port scan detected from attacker host at step {record['step']}: {command}"
434
+ )
435
+ if "union" in lowered or ("select" in lowered and "search" in lowered):
436
+ lines.append(
437
+ f"[ALERT] SQLi pattern detected in web access log at step {record['step']}: {command}"
438
+ )
439
+ if "curl http://web/" in lowered:
440
+ lines.append(
441
+ f"[INFO] Web portal accessed from attacker at step {record['step']}"
442
+ )
443
+ if "ldapsearch" in lowered:
444
+ lines.append(
445
+ f"[ALERT] LDAP enumeration detected at step {record['step']}: {command}"
446
+ )
447
+ if not lines:
448
+ lines.append("[INFO] No suspicious activity observed.")
449
+ return lines
450
+
451
+ def _has_vuln_type(self, vuln_type: str) -> bool:
452
+ if self._snapshot is None:
453
+ return False
454
+ return any(v.type == vuln_type for v in self._snapshot.truth_graph.vulns)
455
+
456
+ def _flag_value(self) -> str:
457
+ if self._snapshot and self._snapshot.flags:
458
+ return self._snapshot.flags[0].value
459
+ return "FLAG{synthetic_missing_flag}"
460
+
461
+ @staticmethod
462
+ def _extract_copy_destination(command: str) -> str | None:
463
+ try:
464
+ parts = shlex.split(command)
465
+ except ValueError:
466
+ return None
467
+ if len(parts) >= 2:
468
+ candidate = parts[-1]
469
+ if candidate.startswith("/"):
470
+ return candidate
471
+ return None
472
+
473
+ @staticmethod
474
+ def _extract_first_path(command: str) -> str | None:
475
+ try:
476
+ parts = shlex.split(command)
477
+ except ValueError:
478
+ return None
479
+ for token in parts[1:]:
480
+ if token.startswith("/"):
481
+ return token
482
+ if "/" in token and not token.startswith("http"):
483
+ return token
484
+ return None
485
+
486
+
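The fuzzy golden-path lookup in `_match_golden_step` can be read in isolation. What follows is a minimal sketch, not the module's code: `_TOKEN_RE` is defined outside this hunk, so `TOKEN_RE` below is a hypothetical stand-in, and plain strings replace the golden-path step objects. It shows the same three stages: an exact match on the normalized command, a same-binary filter, and a Jaccard score that must reach 0.66.

```python
from __future__ import annotations

import re

# Hypothetical tokenizer; the diff's _TOKEN_RE is not visible in this hunk.
TOKEN_RE = re.compile(r"[a-z0-9_./:-]+")


def normalize(command: str) -> str:
    """Lowercase the command and collapse it to space-separated tokens."""
    return " ".join(TOKEN_RE.findall(command.lower()))


def command_name(command: str) -> str:
    """Return the executable name without any leading directory."""
    stripped = command.strip()
    return stripped.split()[0].rsplit("/", 1)[-1].lower() if stripped else ""


def token_overlap(left: str, right: str) -> float:
    """Jaccard similarity over the two token sets."""
    a, b = set(left.split()), set(right.split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_step(command: str, golden: list[str], threshold: float = 0.66) -> str | None:
    """Prefer an exact match, else the best same-binary match above threshold."""
    normalized = normalize(command)
    best, best_score = None, 0.0
    for step in golden:
        step_normalized = normalize(step)
        if normalized == step_normalized:
            return step
        if command_name(command) != command_name(step):
            continue  # never match across different binaries
        score = token_overlap(normalized, step_normalized)
        if score > best_score:
            best, best_score = step, score
    return best if best_score >= threshold else None


golden_path = ["nmap -sV 10.0.1.0/24", "curl http://web/search?q=test"]
close_match = match_step("nmap -sV -Pn 10.0.1.0/24", golden_path)  # 3 of 4 tokens shared
no_match = match_step("nmap 192.168.0.1", golden_path)  # 1 of 4 tokens shared
```

In this sketch one extra flag on a three-token command still clears the threshold (3/4 = 0.75), while a scan of an unrelated target does not (1/4 = 0.25), so the latter would fall through to the generic renderers.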
487
+ class SyntheticTraceGenerator:
488
+ """Generate OpenRange training traces from a simulated snapshot source."""
489
+
490
+ def __init__(
491
+ self,
492
+ *,
493
+ snapshot: SnapshotSpec | None = None,
494
+ manifest: dict[str, Any] | None = None,
495
+ builder: SnapshotBuilder | None = None,
496
+ red_agent: RangeAgent | None = None,
497
+ blue_agent: RangeAgent | None = None,
498
+ tier: int = 1,
499
+ max_steps: int = 30,
500
+ randomize_flags: bool = True,
501
+ ) -> None:
502
+ if snapshot is None and manifest is None:
503
+ raise ValueError("SyntheticTraceGenerator requires a snapshot or manifest")
504
+ self._snapshot = snapshot.model_copy(deep=True) if snapshot is not None else None
505
+ self._manifest = manifest
506
+ self._builder = builder
507
+ self._tier = tier
508
+ self._max_steps = max_steps
509
+ self._randomize_flags = randomize_flags
510
+ self.red_agent = red_agent or ScriptedRedAgent()
511
+ self.blue_agent = blue_agent or ScriptedBlueAgent()
512
+
513
+ @classmethod
514
+ def from_manifest(
515
+ cls,
516
+ manifest: dict[str, Any],
517
+ *,
518
+ red_agent: RangeAgent | None = None,
519
+ blue_agent: RangeAgent | None = None,
520
+ builder: SnapshotBuilder | None = None,
521
+ template_only: bool = True,
522
+ builder_model: str | None = None,
523
+ tier: int = 1,
524
+ max_steps: int = 30,
525
+ randomize_flags: bool = True,
526
+ ) -> "SyntheticTraceGenerator":
527
+ resolved_builder = builder
528
+ if resolved_builder is None:
529
+ if template_only:
530
+ resolved_builder = TemplateOnlyBuilder()
531
+ else:
532
+ resolved_builder = LLMSnapshotBuilder(
533
+ model=builder_model or "azure/gpt-5.2-codex"
534
+ )
535
+ return cls(
536
+ manifest=manifest,
537
+ builder=resolved_builder,
538
+ red_agent=red_agent,
539
+ blue_agent=blue_agent,
540
+ tier=tier,
541
+ max_steps=max_steps,
542
+ randomize_flags=randomize_flags,
543
+ )
544
+
545
+ def generate(
546
+ self,
547
+ *,
548
+ num_traces: int = 10,
549
+ seed: int | None = None,
550
+ ) -> TrajectoryLogger:
551
+ logger = TrajectoryLogger()
552
+ for index in range(num_traces):
553
+ episode_seed = None if seed is None else seed + index
554
+ snapshot = self._materialize_snapshot(episode_seed)
555
+ self._run_episode(
556
+ snapshot=snapshot,
557
+ logger=logger,
558
+ episode_index=index,
559
+ seed=episode_seed,
560
+ )
561
+ return logger
562
+
563
+ def export_jsonl(
564
+ self,
565
+ path: str | Path,
566
+ *,
567
+ num_traces: int = 10,
568
+ seed: int | None = None,
569
+ reward_threshold: float = 0.0,
570
+ roles: tuple[str, ...] = ("red", "blue"),
571
+ ) -> tuple[TrajectoryLogger, int]:
572
+ logger = self.generate(num_traces=num_traces, seed=seed)
573
+ count = logger.export_jsonl(path, reward_threshold=reward_threshold, roles=roles)
574
+ return logger, count
575
+
576
+ def _materialize_snapshot(self, seed: int | None) -> SnapshotSpec:
577
+ if self._snapshot is not None:
578
+ return self._snapshot.model_copy(deep=True)
579
+ if self._manifest is None or self._builder is None:
580
+ raise RuntimeError("Synthetic trace generator is missing its manifest builder")
581
+
582
+ context = BuildContext(seed=seed, tier=self._tier)
583
+ snapshot = _run_async(self._builder.build(self._manifest, context))
584
+ return snapshot
585
+
586
+ def _run_episode(
587
+ self,
588
+ *,
589
+ snapshot: SnapshotSpec,
590
+ logger: TrajectoryLogger,
591
+ episode_index: int,
592
+ seed: int | None,
593
+ ) -> None:
594
+ env = SyntheticRangeEnvironment(
595
+ randomize_flags=self._randomize_flags,
596
+ max_steps=self._max_steps,
597
+ )
598
+ try:
599
+ env.reset(
600
+ snapshot=snapshot,
601
+ episode_id=f"synth-{episode_index:04d}",
602
+ seed=seed,
603
+ )
604
+ active_snapshot = env.snapshot
605
+ if active_snapshot is None:
606
+ raise RuntimeError("Synthetic environment failed to load a snapshot")
607
+
608
+ task = active_snapshot.task
609
+ red_briefing = getattr(task, "red_briefing", "") or "Begin the assessment."
610
+ blue_briefing = getattr(task, "blue_briefing", "") or "Monitor the range."
611
+
612
+ self.red_agent.reset(briefing=red_briefing, role="red")
613
+ self.blue_agent.reset(briefing=blue_briefing, role="blue")
614
+
615
+ snapshot_id = active_snapshot.topology.get("snapshot_id", f"synth-{episode_index:04d}")
616
+ logger.start_episode(
617
+ episode_id=f"synth-{episode_index:04d}",
618
+ snapshot_id=snapshot_id,
619
+ tier=env.state.tier,
620
+ )
621
+
622
+ current_red_observation: str | RangeObservation = red_briefing
623
+ current_blue_observation: str | RangeObservation = blue_briefing
624
+ step = 0
625
+ done = False
626
+ last_obs: RangeObservation = RangeObservation(stdout=red_briefing)
627
+
628
+ while step < self._max_steps and not done:
629
+ red_cmd = self.red_agent.act(current_red_observation)
630
+ red_view = _observation_text(current_red_observation)
631
+ red_obs = env.step(RangeAction(command=red_cmd, mode="red"))
632
+ logger.log_turn(
633
+ role="red",
634
+ observation=red_view,
635
+ action=red_cmd,
636
+ reward=float(red_obs.reward or 0.0),
637
+ )
638
+ step += 1
639
+ last_obs = red_obs
640
+ done = bool(red_obs.done)
641
+ current_blue_observation = red_obs
642
+ if done or step >= self._max_steps:
643
+ break
644
+
645
+ blue_cmd = self.blue_agent.act(current_blue_observation)
646
+ blue_view = _observation_text(current_blue_observation)
647
+ blue_obs = env.step(RangeAction(command=blue_cmd, mode="blue"))
648
+ logger.log_turn(
649
+ role="blue",
650
+ observation=blue_view,
651
+ action=blue_cmd,
652
+ reward=float(blue_obs.reward or 0.0),
653
+ )
654
+ step += 1
655
+ last_obs = blue_obs
656
+ done = bool(blue_obs.done)
657
+ current_red_observation = blue_obs
658
+
659
+ state = env.state
660
+ outcome = self._episode_outcome(env)
661
+ logger.end_episode(
662
+ outcome=outcome,
663
+ metrics={
664
+ "steps": state.step_count,
665
+ "flags_found": len(state.flags_found),
666
+ "red_actions": len(env.red_history),
667
+ "blue_actions": len(env.blue_history),
668
+ "done": bool(last_obs.done),
669
+ },
670
+ )
671
+ finally:
672
+ env.close()
673
+
674
+ @staticmethod
675
+ def _episode_outcome(env: SyntheticRangeEnvironment) -> str:
676
+ if env.state.flags_found:
677
+ return "flag_captured"
678
+ if any(
679
+ record.get("type") == "finding" or record.get("cmd_name") == "submit_finding"
680
+ for record in env.blue_history
681
+ ):
682
+ return "blue_defended"
683
+ return "timeout"
684
+
685
+
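The loop in `_run_episode` alternates red and blue against a single shared step budget, and each role acts on the observation produced by the other role's last command. A minimal sketch of that control flow, using hypothetical stand-ins (`EchoAgent`, `run_turns`, `fake_step`) in place of the real agents and `SyntheticRangeEnvironment`:

```python
from __future__ import annotations

from typing import Callable


class EchoAgent:
    """Hypothetical agent that replays a fixed command list, then idles."""

    def __init__(self, commands: list[str]) -> None:
        self._commands = list(commands)

    def act(self, observation: str) -> str:
        return self._commands.pop(0) if self._commands else "true"


def run_turns(
    red: EchoAgent,
    blue: EchoAgent,
    step_fn: Callable[[str, str], tuple[str, bool]],
    max_steps: int,
) -> list[tuple[str, str]]:
    """Alternate red and blue, counting both roles against one step budget."""
    log: list[tuple[str, str]] = []
    red_obs, blue_obs = "red briefing", "blue briefing"
    step, done = 0, False
    while step < max_steps and not done:
        red_cmd = red.act(red_obs)
        blue_obs, done = step_fn("red", red_cmd)  # blue acts on red's result
        log.append(("red", red_cmd))
        step += 1
        if done or step >= max_steps:
            break
        blue_cmd = blue.act(blue_obs)
        red_obs, done = step_fn("blue", blue_cmd)  # red acts on blue's result
        log.append(("blue", blue_cmd))
        step += 1
    return log


def fake_step(role: str, command: str) -> tuple[str, bool]:
    """Hypothetical environment: the episode ends when a flag is submitted."""
    return f"{role} ran {command}", command.startswith("submit_flag")


turns = run_turns(
    EchoAgent(["nmap -sV 10.0.1.0/24", "submit_flag FLAG{demo}"]),
    EchoAgent(["grep scan all.log"]),
    fake_step,
    max_steps=3,
)
```

With `max_steps=3` the sketch records red, blue, red, which is the same shape the `test_export_jsonl_records_selected_roles` test sets up with its two scripted agents.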
686
+ def build_teacher_agents(
687
+ *,
688
+ teacher_model: str | None = None,
689
+ roles: tuple[str, ...] = ("red",),
690
+ red_model: str | None = None,
691
+ blue_model: str | None = None,
692
+ temperature: float | None = 0.2,
693
+ max_tokens: int = 512,
694
+ **litellm_kwargs: Any,
695
+ ) -> tuple[RangeAgent, RangeAgent]:
696
+ """Construct LLM teacher agents for the selected roles; other roles fall back to scripted agents."""
697
+ if "red" in roles and (red_model or teacher_model):
698
+ red_agent: RangeAgent = LLMRangeAgent(
699
+ model=red_model or str(teacher_model),
700
+ temperature=temperature,
701
+ max_tokens=max_tokens,
702
+ **litellm_kwargs,
703
+ )
704
+ else:
705
+ red_agent = ScriptedRedAgent()
706
+
707
+ if "blue" in roles and (blue_model or teacher_model):
708
+ blue_agent: RangeAgent = LLMRangeAgent(
709
+ model=blue_model or str(teacher_model),
710
+ temperature=temperature,
711
+ max_tokens=max_tokens,
712
+ **litellm_kwargs,
713
+ )
714
+ else:
715
+ blue_agent = ScriptedBlueAgent()
716
+
717
+ return red_agent, blue_agent
src/open_range/training/trajectory.py CHANGED
@@ -130,7 +130,18 @@ class Episode:
130
  def to_jsonl_record(self, role: str) -> dict[str, Any]:
131
  """Build a single JSONL record for the given role."""
132
  reward = self.total_red_reward if role == "red" else self.total_blue_reward
133
- return {
134
  "episode_id": self.episode_id,
135
  "snapshot_id": self.snapshot_id,
136
  "tier": self.tier,
@@ -138,7 +149,15 @@ class Episode:
138
  "messages": self.to_chat_messages(role),
139
  "reward": round(reward, 4),
140
  "outcome": self.outcome,
 
141
  }
142
 
143
 
144
  # ---------------------------------------------------------------------------
@@ -285,28 +304,35 @@ class TrajectoryLogger:
285
  Returns:
286
  Number of JSONL lines written.
287
  """
 
288
  path = Path(path)
289
  path.parent.mkdir(parents=True, exist_ok=True)
290
 
291
- count = 0
292
  with open(path, "w") as f:
293
- for episode in self._episodes:
294
- for role in roles:
295
- # Check if this role had any turns
296
- role_turns = [t for t in episode.turns if t.role == role]
297
- if not role_turns:
298
- continue
299
-
300
- # Filter by reward threshold
301
- total_reward = sum(t.reward for t in role_turns)
302
- if total_reward < reward_threshold:
303
- continue
304
-
305
- record = episode.to_jsonl_record(role)
306
- f.write(json.dumps(record) + "\n")
307
- count += 1
308
-
309
- return count
310
 
311
  def clear(self) -> None:
312
  """Remove all recorded episodes."""
 
130
  def to_jsonl_record(self, role: str) -> dict[str, Any]:
131
  """Build a single JSONL record for the given role."""
132
  reward = self.total_red_reward if role == "red" else self.total_blue_reward
133
+ metadata = {
134
+ "source": self.metrics.get("source", "open_range.synthetic"),
135
+ "success": self.outcome == ("flag_captured" if role == "red" else "blue_defended"),
136
+ "snapshot_id": self.snapshot_id,
137
+ "tier": self.tier,
138
+ "role": role,
139
+ }
140
+ extra_metadata = self.metrics.get("metadata")
141
+ if isinstance(extra_metadata, dict):
142
+ metadata.update(extra_metadata)
143
+
144
+ record = {
145
  "episode_id": self.episode_id,
146
  "snapshot_id": self.snapshot_id,
147
  "tier": self.tier,
 
149
  "messages": self.to_chat_messages(role),
150
  "reward": round(reward, 4),
151
  "outcome": self.outcome,
152
+ "metadata": metadata,
153
  }
154
+ ground_truth_flags = self.metrics.get("ground_truth_flags")
155
+ if isinstance(ground_truth_flags, list):
156
+ record["ground_truth_flag"] = ground_truth_flags[0] if ground_truth_flags else None
157
+ else:
158
+ record["ground_truth_flag"] = None
159
+ record["optimal_steps"] = self.metrics.get("optimal_steps")
160
+ return record
161
 
162
 
163
  # ---------------------------------------------------------------------------
 
304
  Returns:
305
  Number of JSONL lines written.
306
  """
307
+ records = self.to_records(reward_threshold=reward_threshold, roles=roles)
308
  path = Path(path)
309
  path.parent.mkdir(parents=True, exist_ok=True)
310
 
 
311
  with open(path, "w") as f:
312
+ for record in records:
313
+ f.write(json.dumps(record) + "\n")
314
+
315
+ return len(records)
316
+
317
+ def to_records(
318
+ self,
319
+ reward_threshold: float = 0.0,
320
+ roles: tuple[str, ...] = ("red", "blue"),
321
+ ) -> list[dict[str, Any]]:
322
+ """Return JSONL-ready records without writing them to disk."""
323
+ records: list[dict[str, Any]] = []
324
+ for episode in self._episodes:
325
+ for role in roles:
326
+ role_turns = [t for t in episode.turns if t.role == role]
327
+ if not role_turns:
328
+ continue
329
+
330
+ total_reward = sum(t.reward for t in role_turns)
331
+ if total_reward < reward_threshold:
332
+ continue
333
+
334
+ records.append(episode.to_jsonl_record(role))
335
+ return records
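The filtering rule in `to_records` is per role: a role's record is kept only if that role took at least one turn and its summed reward is not below `reward_threshold`. A minimal sketch of that rule, with hypothetical `Turn` and `Episode` dataclasses standing in for the real trajectory types and a reduced record shape:

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class Turn:
    role: str
    reward: float


@dataclass
class Episode:
    episode_id: str
    turns: list[Turn] = field(default_factory=list)


def to_records(
    episodes: list[Episode],
    reward_threshold: float = 0.0,
    roles: tuple[str, ...] = ("red", "blue"),
) -> list[dict[str, object]]:
    """Build one record per (episode, role) that passes the reward filter."""
    records: list[dict[str, object]] = []
    for episode in episodes:
        for role in roles:
            role_turns = [t for t in episode.turns if t.role == role]
            if not role_turns:
                continue  # this role never acted in the episode
            total = sum(t.reward for t in role_turns)
            if total < reward_threshold:
                continue  # summed reward falls below the cut-off
            records.append(
                {"episode_id": episode.episode_id, "role": role, "reward": round(total, 4)}
            )
    return records


episodes = [
    Episode("synth-0000", [Turn("red", 0.5), Turn("blue", -0.2), Turn("red", 0.5)]),
    Episode("synth-0001", [Turn("red", -1.0)]),
]
kept = to_records(episodes, reward_threshold=0.0)
jsonl = "\n".join(json.dumps(record) for record in kept)
```

Because the default threshold of 0.0 drops any role whose rewards sum below zero, a caller that wants every trace needs a negative threshold; the CLI test in this commit passes `--reward-threshold -1`, presumably for that reason.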
336
 
337
  def clear(self) -> None:
338
  """Remove all recorded episodes."""
tests/test_synthetic.py ADDED
@@ -0,0 +1,232 @@
1
+ """Tests for synthetic trajectory generation."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import json
6
+ import os
7
+ import sys
8
+ import types
9
+
10
+ import pytest
11
+ from click.testing import CliRunner
12
+
13
+ from open_range.agents.llm_agent import LLMRangeAgent
14
+ from open_range.agents.scripted_agent import ScriptedAgent, ScriptedBlueAgent, ScriptedRedAgent
15
+ from open_range.cli import cli
16
+ from open_range.server.models import RangeAction
17
+ from open_range.training.synthetic import (
18
+ SyntheticRangeEnvironment,
19
+ SyntheticTraceGenerator,
20
+ build_teacher_agents,
21
+ randomize_snapshot_flags,
22
+ )
23
+
24
+
25
+ class TestFlagRandomization:
26
+ def test_randomize_snapshot_flags_rewrites_all_string_references(self, sample_snapshot_spec):
27
+ snapshot = sample_snapshot_spec.model_copy(deep=True)
28
+ original_flag = snapshot.flags[0].value
29
+ snapshot.files["web:/tmp/flag.txt"] = f"echo {original_flag}"
30
+
31
+ randomized = randomize_snapshot_flags(snapshot, seed=7)
32
+ replaced_flag = randomized.flags[0].value
33
+
34
+ assert replaced_flag != original_flag
35
+ dumped = json.dumps(randomized.model_dump(mode="python"))
36
+ assert original_flag not in dumped
37
+ assert replaced_flag in dumped
38
+
39
+
40
+ class TestSyntheticEnvironment:
41
+ def test_synthetic_environment_simulates_red_and_blue_flow(self, sample_snapshot_spec):
42
+ env = SyntheticRangeEnvironment(randomize_flags=False, max_steps=5)
43
+
44
+ try:
45
+ reset_obs = env.reset(snapshot=sample_snapshot_spec, episode_id="synthetic-test")
46
+ assert "RED BRIEFING" in reset_obs.stdout
47
+
48
+ red_obs = env.step(
49
+ RangeAction(
50
+ command="curl 'http://web/search?q=test%27+UNION+SELECT+flag+FROM+flags--'",
51
+ mode="red",
52
+ )
53
+ )
54
+ assert "FLAG{test_sqli_123}" in red_obs.stdout
55
+
56
+ blue_obs = env.step(
57
+ RangeAction(
58
+ command="grep UNION /var/log/siem/consolidated/all.log",
59
+ mode="blue",
60
+ )
61
+ )
62
+ assert "SQLi pattern detected" in blue_obs.stdout
63
+
64
+ flag_obs = env.step(
65
+ RangeAction(command="submit_flag FLAG{test_sqli_123}", mode="red")
66
+ )
67
+ assert flag_obs.done is True
68
+ assert sample_snapshot_spec.flags[0].value in env.state.flags_found
69
+ finally:
70
+ env.close()
71
+
72
+
73
+ class TestSyntheticTraceGenerator:
74
+ def test_export_jsonl_records_selected_roles(self, tmp_path, sample_snapshot_spec):
75
+ red = ScriptedAgent(
76
+ commands=[
77
+ "nmap -sV 10.0.1.0/24",
78
+ "submit_flag FLAG{test_sqli_123}",
79
+ ]
80
+ )
81
+ blue = ScriptedAgent(commands=["grep scan /var/log/siem/consolidated/all.log"])
82
+ generator = SyntheticTraceGenerator(
83
+ snapshot=sample_snapshot_spec,
84
+ red_agent=red,
85
+ blue_agent=blue,
86
+ max_steps=3,
87
+ randomize_flags=False,
88
+ )
89
+
90
+ output_path = tmp_path / "synthetic.jsonl"
91
+ logger, count = generator.export_jsonl(
92
+ output_path,
93
+ num_traces=1,
94
+ roles=("red", "blue"),
95
+ )
96
+
97
+ assert count == 2
98
+ assert len(logger.episodes) == 1
99
+ assert logger.episodes[0].outcome == "flag_captured"
100
+
101
+ records = [json.loads(line) for line in output_path.read_text().splitlines()]
102
+ assert {record["role"] for record in records} == {"red", "blue"}
103
+ assert all(record["messages"][0]["role"] == "system" for record in records)
104
+
105
+ def test_build_teacher_agents_falls_back_to_scripted_when_no_model(self):
106
+ red, blue = build_teacher_agents(teacher_model=None, roles=("red", "blue"))
107
+ assert isinstance(red, ScriptedRedAgent)
108
+ assert isinstance(blue, ScriptedBlueAgent)
109
+
110
+
111
+ class TestLiteLLMSupport:
112
+ def test_codex_models_omit_temperature_and_enable_drop_params(self, monkeypatch):
113
+ captured: dict[str, object] = {}
114
+
115
+ def fake_completion(**kwargs):
116
+ captured.update(kwargs)
117
+ return types.SimpleNamespace(
118
+ choices=[
119
+ types.SimpleNamespace(
120
+ message=types.SimpleNamespace(content="```bash\nwhoami\n```")
121
+ )
122
+ ]
123
+ )
124
+
125
+ monkeypatch.setitem(sys.modules, "litellm", types.SimpleNamespace(completion=fake_completion))
126
+
127
+ agent = LLMRangeAgent(model="azure/gpt-5.2-codex", temperature=0.6, max_tokens=64)
128
+ agent.reset("Return exactly one command: whoami", "red")
129
+
130
+ assert agent.act("Return exactly one command: whoami") == "whoami"
131
+ assert captured["drop_params"] is True
132
+ assert "temperature" not in captured
133
+
134
+ def test_non_codex_models_keep_temperature(self, monkeypatch):
135
+ captured: dict[str, object] = {}
136
+
137
+ def fake_completion(**kwargs):
138
+ captured.update(kwargs)
139
+ return types.SimpleNamespace(
140
+ choices=[
141
+ types.SimpleNamespace(
142
+ message=types.SimpleNamespace(content="echo ok")
143
+ )
144
+ ]
145
+ )
146
+
147
+ monkeypatch.setitem(sys.modules, "litellm", types.SimpleNamespace(completion=fake_completion))
148
+
149
+ agent = LLMRangeAgent(model="openai/gpt-4o", temperature=0.4, max_tokens=32)
150
+ agent.reset("Return exactly one command: echo ok", "blue")
151
+
152
+ assert agent.act("Return exactly one command: echo ok") == "echo ok"
153
+ assert captured["temperature"] == 0.4
154
+ assert captured["drop_params"] is True
155
+
156
+
157
+ class TestSyntheticCLI:
158
+ def test_cli_generates_jsonl_from_snapshot(self, tmp_path, sample_snapshot_spec):
159
+ runner = CliRunner()
160
+ snapshot_path = tmp_path / "spec.json"
161
+ snapshot_path.write_text(sample_snapshot_spec.model_dump_json())
162
+ output_path = tmp_path / "synthetic.jsonl"
163
+
164
+ result = runner.invoke(
165
+ cli,
166
+ [
167
+ "synthetic-data",
168
+ "--snapshot",
169
+ str(snapshot_path),
170
+ "--output",
171
+ str(output_path),
172
+ "--num-traces",
173
+ "1",
174
+ "--max-steps",
175
+ "3",
176
+ "--roles",
177
+ "red",
178
+ "--reward-threshold",
179
+ "-1",
180
+ "--static-flags",
181
+ ],
182
+ )
183
+
184
+ assert result.exit_code == 0, result.output
185
+ assert output_path.exists()
186
+
187
+ records = [json.loads(line) for line in output_path.read_text().splitlines()]
188
+ assert len(records) == 1
189
+ assert records[0]["role"] == "red"
190
+
191
+
192
+ @pytest.mark.live_model
193
+ def test_live_model_smoke_generates_a_synthetic_trace(tmp_path, sample_snapshot_spec):
194
+ if not all(
195
+ os.environ.get(name)
196
+ for name in ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION")
197
+ ):
198
+ pytest.skip("Azure LiteLLM environment variables not configured")
199
+ pytest.importorskip("litellm")
200
+
201
+ snapshot = sample_snapshot_spec.model_copy(deep=True)
202
+ snapshot.task.red_briefing = (
203
+ "Synthetic smoke test. Return exactly one shell command. First command: whoami"
204
+ )
205
+ snapshot.task.blue_briefing = "Monitor the range."
206
+
207
+ red, blue = build_teacher_agents(
208
+ teacher_model=os.environ.get("OPENRANGE_SYNTH_MODEL", "azure/gpt-5.2-codex"),
209
+ roles=("red",),
210
+ temperature=None,
211
+ max_tokens=96,
212
+ )
213
+ generator = SyntheticTraceGenerator(
214
+ snapshot=snapshot,
215
+ red_agent=red,
216
+ blue_agent=blue,
217
+ max_steps=1,
218
+ randomize_flags=False,
219
+ )
220
+
221
+ output_path = tmp_path / "live_synthetic.jsonl"
222
+ logger, count = generator.export_jsonl(
223
+ output_path,
224
+ num_traces=1,
225
+ roles=("red",),
226
+ )
227
+
228
+ assert count == 1
229
+ assert len(logger.episodes) == 1
230
+ assert logger.episodes[0].red_turns
231
+ assert logger.episodes[0].red_turns[0].action.strip()
232
+ assert output_path.exists()
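The `TestLiteLLMSupport` cases avoid a real LiteLLM install by placing a stub object in `sys.modules` before the agent imports it. The same technique can be sketched without pytest, using `unittest.mock.patch.dict` in place of `monkeypatch.setitem`. The module name `fake_llm_backend` and the `ask_model` helper are hypothetical and exist only for this illustration:

```python
from __future__ import annotations

import sys
import types
from unittest import mock


def ask_model(prompt: str, **params: object) -> str:
    """Hypothetical client that imports its backend lazily, at call time."""
    import fake_llm_backend  # resolved from sys.modules while the stub is installed

    response = fake_llm_backend.completion(prompt=prompt, **params)
    return response.choices[0].message.content


captured: dict[str, object] = {}


def fake_completion(**kwargs: object) -> types.SimpleNamespace:
    """Record the call arguments and return a response-shaped object."""
    captured.update(kwargs)
    message = types.SimpleNamespace(content="whoami")
    return types.SimpleNamespace(choices=[types.SimpleNamespace(message=message)])


stub = types.SimpleNamespace(completion=fake_completion)

# patch.dict removes the stub again when the block exits.
with mock.patch.dict(sys.modules, {"fake_llm_backend": stub}):
    reply = ask_model("Return exactly one command", temperature=0.4)
```

Capturing the keyword arguments is what lets the tests above assert on `temperature` and `drop_params` without inspecting the agent's internals.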
uv.lock CHANGED
@@ -1950,6 +1950,9 @@ dev = [
1950
  { name = "pytest" },
1951
  { name = "pytest-asyncio" },
1952
  ]
 
 
 
1953
  training = [
1954
  { name = "trl" },
1955
  { name = "unsloth" },
@@ -1963,6 +1966,7 @@ requires-dist = [
1963
  { name = "httpx", marker = "extra == 'dev'", specifier = ">=0.27" },
1964
  { name = "jinja2", specifier = ">=3.1" },
1965
  { name = "litellm", marker = "extra == 'builder'", specifier = ">=1.30" },
 
1966
  { name = "openenv-core", extras = ["core"], specifier = ">=0.2.1" },
1967
  { name = "pydantic", specifier = ">=2.0.0" },
1968
  { name = "pytest", marker = "extra == 'dev'", specifier = ">=8.0" },
@@ -1972,7 +1976,7 @@ requires-dist = [
1972
  { name = "unsloth", marker = "extra == 'training'" },
1973
  { name = "uvicorn", specifier = ">=0.24.0" },
1974
  ]
1975
- provides-extras = ["dev", "training", "builder"]
1976
 
1977
  [[package]]
1978
  name = "opentelemetry-api"
 
1950
  { name = "pytest" },
1951
  { name = "pytest-asyncio" },
1952
  ]
1953
+ synthetic = [
1954
+ { name = "litellm" },
1955
+ ]
1956
  training = [
1957
  { name = "trl" },
1958
  { name = "unsloth" },
 
1966
  { name = "httpx", marker = "extra == 'dev'", specifier = ">=0.27" },
1967
  { name = "jinja2", specifier = ">=3.1" },
1968
  { name = "litellm", marker = "extra == 'builder'", specifier = ">=1.30" },
1969
+ { name = "litellm", marker = "extra == 'synthetic'", specifier = ">=1.30" },
1970
  { name = "openenv-core", extras = ["core"], specifier = ">=0.2.1" },
1971
  { name = "pydantic", specifier = ">=2.0.0" },
1972
  { name = "pytest", marker = "extra == 'dev'", specifier = ">=8.0" },
 
1976
  { name = "unsloth", marker = "extra == 'training'" },
1977
  { name = "uvicorn", specifier = ">=0.24.0" },
1978
  ]
1979
+ provides-extras = ["dev", "training", "builder", "synthetic"]
1980
 
1981
  [[package]]
1982
  name = "opentelemetry-api"