Spaces: abrown31/open-range
Aaron Brown committed on Mar 8

Commit

fb68239

1 Parent(s): a24d0f2

Add NPC actions, synthetic data pipeline, CLI enhancements

Agent-generated supporting changes from architecture review:
- NPC action system (actions.py) for structured NPC behaviors
- Synthetic data generation pipeline (training/synthetic.py)
- CLI improvements and NPC traffic script updates
- NPC agent loop enhancements for topology-driven operation

Files changed (15)

README.md +12 -0
docs/red-blue-agents.md +24 -16
docs/synthetic-data.md +116 -0
pyproject.toml +4 -0
src/open_range/agents/llm_agent.py +12 -7
src/open_range/builder/npc/actions.py +306 -0
src/open_range/builder/npc/db_traffic.sh +10 -8
src/open_range/builder/npc/http_traffic.sh +12 -13
src/open_range/builder/npc/npc_agent.py +236 -102
src/open_range/cli.py +138 -0
src/open_range/training/__init__.py +15 -0
src/open_range/training/synthetic.py +717 -0
src/open_range/training/trajectory.py +45 -19
tests/test_synthetic.py +232 -0
uv.lock +5 -1

README.md CHANGED

@@ -50,6 +50,9 @@ uv sync
 # Optional: enable the LiteLLM-backed builder pipeline
 uv sync --extra builder
 # Optional: enable background refill inside the server
 export OPENRANGE_ENABLE_MANAGED_REFILL=1
 export OPENRANGE_RUNTIME_BUILDER=llm
@@ -57,6 +60,12 @@ export OPENRANGE_RUNTIME_BUILDER=llm
 # End-to-end demo (no Docker, no LLM)
 uv run python examples/demo.py
 # Run the OpenEnv client against a running server
 uv run python examples/remote_client_demo.py --base-url http://localhost:8000
@@ -97,6 +106,8 @@ The deployed package exposes the standard OpenEnv `reset()`, `step()`, and `stat
 **Agents** — Structural protocol: any object with `reset(briefing, role)` and `act(observation) -> command` works. Ships with `LLMRangeAgent` (litellm, any provider), `ScriptedAgent`, and `HumanAgent`.
 ```python
 from open_range.agents.episode import run_episode
 from open_range.agents.llm_agent import LLMRangeAgent
@@ -136,6 +147,7 @@ Compatible with `openenv` when installed; standalone FastAPI fallback otherwise.
 - [Architecture](docs/architecture.md) — full pipeline, network topology, episode lifecycle
 - [Builder & Validator](docs/builder-validator.md) — snapshot generation and admission
 - [Red & Blue Agents](docs/red-blue-agents.md) — tandem training, reward coupling, curriculum
 - [Agent Protocols](docs/agent-protocols.md) — agent interface, episode runner, evaluation
 - [OpenEnv Compliance](docs/openenv-compliance.md) — API contract, models, deployment

 # Optional: enable the LiteLLM-backed builder pipeline
 uv sync --extra builder
+# Optional: enable LiteLLM-backed synthetic teacher agents
+uv sync --extra synthetic
 # Optional: enable background refill inside the server
 export OPENRANGE_ENABLE_MANAGED_REFILL=1
 export OPENRANGE_RUNTIME_BUILDER=llm
 # End-to-end demo (no Docker, no LLM)
 uv run python examples/demo.py
+# Generate synthetic SFT traces from a snapshot or manifest
+uv run openrange synthetic-data \
+  --manifest manifests/tier1_basic.yaml \
+  --output data/sft_red.jsonl \
+  --roles red
 # Run the OpenEnv client against a running server
 uv run python examples/remote_client_demo.py --base-url http://localhost:8000
 **Agents** — Structural protocol: any object with `reset(briefing, role)` and `act(observation) -> command` works. Ships with `LLMRangeAgent` (litellm, any provider), `ScriptedAgent`, and `HumanAgent`.
+**Synthetic Data** — `open_range.training.synthetic` provides snapshot-grounded trajectory generation for SFT warm-start. It uses a fast simulated `RangeEnvironment`, optional LiteLLM teacher agents, per-episode flag randomization, and exports JSONL through `TrajectoryLogger`.
 ```python
 from open_range.agents.episode import run_episode
 from open_range.agents.llm_agent import LLMRangeAgent
 - [Architecture](docs/architecture.md) — full pipeline, network topology, episode lifecycle
 - [Builder & Validator](docs/builder-validator.md) — snapshot generation and admission
 - [Red & Blue Agents](docs/red-blue-agents.md) — tandem training, reward coupling, curriculum
+- [Synthetic Data](docs/synthetic-data.md) — snapshot-backed SFT trace generation with LiteLLM teachers
 - [Agent Protocols](docs/agent-protocols.md) — agent interface, episode runner, evaluation
 - [OpenEnv Compliance](docs/openenv-compliance.md) — API contract, models, deployment

docs/red-blue-agents.md CHANGED

@@ -500,26 +500,34 @@ Respond with a single shell command to execute. No explanation needed.
 ### SFT Data Generation (Implemented)
-Run episodes with frontier models to generate expert trajectories. The `TrajectoryLogger` (in `src/open_range/training/trajectory.py`) records turns and exports JSONL in OpenAI chat format.
 ```python
-from open_range.training.trajectory import TrajectoryLogger
-red = LLMRangeAgent(model="anthropic/claude-sonnet-4-20250514")
-blue = LLMRangeAgent(model="openai/gpt-4o")
-logger = TrajectoryLogger()
-for i in range(100):
-    logger.start_episode(f"ep-{i:03d}", snapshot_id="snap-001", tier=1)
-    result = run_episode(env, red, blue)
-    # log_turn() for each step, then:
-    logger.end_episode(outcome=result.outcome, metrics={"steps": result.steps})
-# Export as SFT JSONL -- each role is a separate training example
-# Only episodes above reward_threshold are included
-lines = logger.export_jsonl("sft_data.jsonl", reward_threshold=0.5)
 ```
 ### Asymmetric GRPO (Planned)
 Train one side via GRPO while the other plays as a fixed opponent:

 ### SFT Data Generation (Implemented)
+For synthetic warm-start data, prefer `SyntheticTraceGenerator` in `src/open_range/training/synthetic.py`. It keeps the data path aligned with OpenRange snapshots and rewards, but replaces Docker execution with a fast simulator so you can cheaply collect teacher trajectories.
 ```python
+from open_range.training import SyntheticTraceGenerator, build_teacher_agents
+red, blue = build_teacher_agents(
+    teacher_model="azure/gpt-5.2-codex",
+    roles=("red",),
+)
+generator = SyntheticTraceGenerator.from_manifest(
+    manifest=tier1_manifest,
+    red_agent=red,
+    blue_agent=blue,
+    template_only=True,
+    max_steps=8,
+)
+logger, lines = generator.export_jsonl(
+    "sft_data.jsonl",
+    num_traces=100,
+    reward_threshold=0.0,
+    roles=("red",),
+)
 ```
+For live Docker episodes or custom rollout loops, `TrajectoryLogger` remains the low-level recorder and JSONL exporter.
 ### Asymmetric GRPO (Planned)
 Train one side via GRPO while the other plays as a fixed opponent:

docs/synthetic-data.md ADDED

	@@ -0,0 +1,116 @@

+# Synthetic Data
+OpenRange includes a snapshot-backed synthetic trajectory generator for SFT warm-start and offline data collection. The design is influenced by Open Trajectory Gym's split between world specification, executor, and teacher model, but it is implemented in the OpenRange training layer so it stays aligned with the existing `SnapshotSpec`, `RangeEnvironment`, and `TrajectoryLogger` types.
+## Why It Lives In `training/`
+Synthetic trace generation is a training concern, not a runtime concern:
+- The live server still owns real `reset()` / `step()` episodes on Docker infrastructure.
+- Synthetic generation reuses the same `SnapshotSpec` and reward/meta-command semantics, but swaps Docker execution for a fast simulator.
+- Export still goes through `TrajectoryLogger`, so downstream SFT JSONL format does not fork.
+This keeps OpenRange's real environment and synthetic data path close enough to share prompts, actions, and episode structure without turning the production server into a data-generation service.
+## Components
+- `SyntheticRangeEnvironment`: a fast `RangeEnvironment` subclass that simulates common Red and Blue commands from a loaded snapshot.
+- `SyntheticTraceGenerator`: drives Red and Blue agents through synthetic episodes and records them with `TrajectoryLogger`.
+- `build_teacher_agents()`: constructs LiteLLM-backed teacher agents for selected roles and scripted fallbacks for the rest.
+- `randomize_snapshot_flags()`: clones a snapshot and rewrites flag values per episode so traces do not memorize static flag strings.
+## LiteLLM Support
+Install the optional dependency:
+```bash
+uv sync --extra synthetic
+```
+Any LiteLLM model string supported by `LLMRangeAgent` works. For Azure OpenAI, export the usual LiteLLM/Azure variables and pass the deployment name as the model:
+```bash
+export AZURE_API_KEY=...
+export AZURE_API_BASE=...
+export AZURE_API_VERSION=...
+uv run openrange synthetic-data \
+  --manifest manifests/tier1_basic.yaml \
+  --output data/sft_red.jsonl \
+  --roles red \
+  --teacher-model azure/gpt-5.2-codex
+```
+Codex-style Azure deployments often reject `temperature`; `LLMRangeAgent` now omits it automatically for model names containing `codex`.
+## CLI
+Generate traces from an existing snapshot:
+```bash
+uv run openrange synthetic-data \
+  --snapshot snapshots/spec.json \
+  --output data/sft_red.jsonl \
+  --num-traces 25 \
+  --roles red
+```
+Generate traces from a manifest using the deterministic builder:
+```bash
+uv run openrange synthetic-data \
+  --manifest manifests/tier1_basic.yaml \
+  --output data/sft_red_blue.jsonl \
+  --roles red,blue \
+  --num-traces 50
+```
+Generate traces from a manifest using both an LLM builder and LLM teachers:
+```bash
+uv run openrange synthetic-data \
+  --manifest manifests/tier1_basic.yaml \
+  --llm-builder \
+  --builder-model azure/gpt-5.2-codex \
+  --teacher-model azure/gpt-5.2-codex \
+  --roles red \
+  --output data/frontier_red.jsonl
+```
+## Python API
+```python
+from open_range.training import SyntheticTraceGenerator, build_teacher_agents
+red, blue = build_teacher_agents(
+    teacher_model="azure/gpt-5.2-codex",
+    roles=("red",),
+    max_tokens=256,
+)
+generator = SyntheticTraceGenerator.from_manifest(
+    manifest=tier1_manifest,
+    red_agent=red,
+    blue_agent=blue,
+    template_only=True,
+    max_steps=8,
+)
+logger, lines = generator.export_jsonl(
+    "data/sft_red.jsonl",
+    num_traces=10,
+    roles=("red",),
+)
+```
+## Testing
+Unit coverage lives in `tests/test_synthetic.py`.
+There is also a gated live-model smoke test that exercises the synthetic generator against a real LiteLLM model:
+```bash
+uv run --extra synthetic pytest tests/test_synthetic.py -m live_model -q
+```
+The live test is skipped automatically unless the required Azure environment variables are present.
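
Reviewer sketch for the `randomize_snapshot_flags()` bullet above: a minimal illustration of "clone a snapshot and rewrite flag values per episode". It assumes a plain-dict snapshot with hypothetical `flags` and `files` keys and a hypothetical function name; the real `SnapshotSpec` type and the actual implementation in `training/synthetic.py` are not shown in this diff and may differ.

```python
import copy
import secrets


def randomize_flags_sketch(snapshot: dict) -> dict:
    """Return a deep copy of `snapshot` with every flag value replaced.

    Hypothetical stand-in for randomize_snapshot_flags(); the dict layout
    ("flags" list, "files" mapping) is an assumption made for illustration.
    """
    clone = copy.deepcopy(snapshot)  # never mutate the caller's snapshot
    for flag in clone.get("flags", []):
        old_value = flag["value"]
        new_value = f"FLAG{{{secrets.token_hex(8)}}}"
        flag["value"] = new_value
        # Rewrite any file content that embeds the old flag string.
        clone["files"] = {
            path: content.replace(old_value, new_value)
            for path, content in clone.get("files", {}).items()
        }
    return clone


snapshot = {
    "flags": [{"id": "flag1", "value": "FLAG{static}"}],
    "files": {"web:/var/www/html/flag.txt": "FLAG{static}\n"},
}
episode_snapshot = randomize_flags_sketch(snapshot)
```

Calling the function once per episode gives each trace a fresh flag string while the source snapshot stays untouched, which is the property the doc describes.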

pyproject.toml CHANGED

@@ -23,6 +23,7 @@ dependencies = [
 dev = ["pytest>=8.0", "pytest-asyncio>=0.23", "httpx>=0.27"]
 training = ["trl>=0.8", "unsloth"]
 builder = ["litellm>=1.30"]
 [project.scripts]
 openrange = "open_range.cli:cli"
@@ -45,3 +46,6 @@ package-data = { "open_range" = ["**/*.yaml", "**/*.yml"] }
 [tool.pytest.ini_options]
 asyncio_mode = "auto"

 dev = ["pytest>=8.0", "pytest-asyncio>=0.23", "httpx>=0.27"]
 training = ["trl>=0.8", "unsloth"]
 builder = ["litellm>=1.30"]
+synthetic = ["litellm>=1.30"]
 [project.scripts]
 openrange = "open_range.cli:cli"
 [tool.pytest.ini_options]
 asyncio_mode = "auto"
+markers = [
+    "live_model: runs live LiteLLM model smoke tests",
+]

src/open_range/agents/llm_agent.py CHANGED

@@ -32,7 +32,7 @@ class LLMRangeAgent:
     def __init__(
         self,
         model: str = "anthropic/claude-sonnet-4-20250514",
-        temperature: float = 0.3,
         max_tokens: int = 512,
         **litellm_kwargs: Any,
     ) -> None:
@@ -67,13 +67,18 @@ class LLMRangeAgent:
         if self.messages and self.messages[-1]["role"] != "user":
             self.messages.append({"role": "user", "content": observation_text})
-        response = litellm.completion(
-            model=self.model,
-            messages=self.messages,
-            temperature=self.temperature,
-            max_tokens=self.max_tokens,
             **self.litellm_kwargs,
-        )
         text = response.choices[0].message.content.strip()
         self.messages.append({"role": "assistant", "content": text})

     def __init__(
         self,
         model: str = "anthropic/claude-sonnet-4-20250514",
+        temperature: float | None = 0.3,
         max_tokens: int = 512,
         **litellm_kwargs: Any,
     ) -> None:
         if self.messages and self.messages[-1]["role"] != "user":
             self.messages.append({"role": "user", "content": observation_text})
+        kwargs: dict[str, Any] = {
+            "model": self.model,
+            "messages": self.messages,
+            "max_tokens": self.max_tokens,
+            "drop_params": True,
             **self.litellm_kwargs,
+        }
+        # Codex deployments commonly reject temperature; omit it when unsupported.
+        if self.temperature is not None and "codex" not in self.model.lower():
+            kwargs["temperature"] = self.temperature
+        response = litellm.completion(**kwargs)
         text = response.choices[0].message.content.strip()
         self.messages.append({"role": "assistant", "content": text})
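
Reviewer sketch of the keyword-building logic added in this hunk, pulled out into a standalone function so it can be exercised without calling LiteLLM. `build_completion_kwargs` is a hypothetical name; the body mirrors the added lines, but nothing here is claimed to exist in the repository.

```python
from __future__ import annotations

from typing import Any


def build_completion_kwargs(
    model: str,
    messages: list[dict[str, str]],
    temperature: float | None = 0.3,
    max_tokens: int = 512,
    **extra: Any,
) -> dict[str, Any]:
    """Assemble keyword arguments the way the new LLMRangeAgent code does."""
    kwargs: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "drop_params": True,
        **extra,  # caller-supplied options come last, so they can override
    }
    # Omit temperature when it is disabled or the model name contains "codex".
    if temperature is not None and "codex" not in model.lower():
        kwargs["temperature"] = temperature
    return kwargs


codex_kwargs = build_completion_kwargs("azure/gpt-5.2-codex", [])
claude_kwargs = build_completion_kwargs("anthropic/claude-sonnet-4-20250514", [])
```

Here `codex_kwargs` carries no `temperature` key, while `claude_kwargs` keeps the default of 0.3. Note that the check is a substring match on the model name, so any deployment whose name happens to contain `codex` is treated the same way.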

src/open_range/builder/npc/actions.py ADDED

	@@ -0,0 +1,306 @@

+"""NPC action executor -- bridges NPC decisions to container state changes.
+All actions are derived from the SnapshotSpec at init time, so they adapt
+to whatever environment the Builder LLM generated.  No hardcoded pages,
+tables, or endpoints.
+"""
+from __future__ import annotations
+import logging
+import re
+import time
+from typing import Any
+from open_range.protocols import ContainerSet, NPCAction, NPCPersona, SnapshotSpec
+logger = logging.getLogger(__name__)
+class NPCActionExecutor:
+    """Execute NPC actions inside Docker containers.
+    At init, extracts available pages, shares, DB tables, and users from
+    the snapshot so every action targets real resources in this environment.
+    """
+    def __init__(self, containers: ContainerSet, snapshot: SnapshotSpec) -> None:
+        self.containers = containers
+        # Derive available targets from the snapshot
+        self._pages = _extract_web_pages(snapshot)
+        self._shares = _extract_shares(snapshot)
+        self._db_tables = _extract_db_tables(snapshot)
+        self._users = _extract_users(snapshot)
+        self._domain = snapshot.topology.get("domain", "corp.local")
+    # ------------------------------------------------------------------
+    # Routine actions (autonomous workday)
+    # ------------------------------------------------------------------
+    async def execute_routine(
+        self,
+        persona: NPCPersona,
+        action: str,
+        target: str,
+        detail: str,
+        email_body: str = "",
+    ) -> dict[str, Any]:
+        """Execute an autonomous work action derived from the snapshot."""
+        username = _username_from_persona(persona)
+        handler = {
+            "browse": self._routine_browse,
+            "send_email": self._routine_email,
+            "lookup": self._routine_lookup,
+            "access_share": self._routine_share,
+            "login": self._routine_login,
+            "query_db": self._routine_query_db,
+            "idle": self._routine_idle,
+        }.get(action, self._routine_idle)
+        return await handler(persona, username, target, detail, email_body)
+    async def _routine_browse(self, persona, username, target, detail, _eb):
+        """Browse a page that exists in this snapshot."""
+        path = target if target.startswith("/") else f"/{target}" if target else "/"
+        # With no explicit target, pick a random page known to this snapshot
+        if path == "/" and self._pages:
+            import random
+            path = random.choice(self._pages)
+        await self.containers.exec(
+            "web",
+            f'curl -s -o /dev/null -A "Mozilla/5.0 ({username})" "http://localhost{path}"',
+        )
+        return _log(persona, "browse", detail or f"Browsed {path}", f"web:{path}")
+    async def _routine_email(self, persona, username, target, detail, body):
+        """Send email to a colleague (picks a real user from topology)."""
+        import random
+        recipient = target
+        if not recipient and self._users:
+            recipient = random.choice(self._users)
+        elif not recipient:
+            recipient = "colleague"
+        ts_i = int(time.time())
+        content = body or f"Hi {recipient}, quick update: {detail or 'checking in'}."
+        msg = (
+            f"From: {username}@{self._domain}\\n"
+            f"To: {recipient}@{self._domain}\\n"
+            f"Subject: {detail or 'Update'}\\n\\n{content}"
+        )
+        await self.containers.exec(
+            "mail",
+            f"mkdir -p /var/mail/{username} "
+            f"&& echo '{msg}' > /var/mail/{username}/sent_{ts_i}.eml",
+        )
+        return _log(persona, "send_email", detail or f"Emailed {recipient}", f"mail:{username}")
+    async def _routine_lookup(self, persona, username, target, detail, _eb):
+        """Look up data on the web app -- uses whatever search/lookup page exists."""
+        # Find a page with query params in the snapshot
+        lookup_pages = [p for p in self._pages if "?" in p or "lookup" in p or "search" in p]
+        if lookup_pages:
+            import random
+            page = random.choice(lookup_pages)
+        elif self._pages:
+            import random
+            page = random.choice(self._pages) + "?q=" + (target or "status")
+        else:
+            page = f"/?q={target or 'data'}"
+        await self.containers.exec(
+            "web",
+            f'curl -s -o /dev/null -A "Mozilla/5.0 ({username})" "http://localhost{page}"',
+        )
+        return _log(persona, "lookup", detail or f"Searched: {target}", f"web:{page}")
+    async def _routine_share(self, persona, username, target, detail, _eb):
+        """Access a file share that exists in this snapshot."""
+        import random
+        share = target or (random.choice(self._shares) if self._shares else "general")
+        await self.containers.exec(
+            "files",
+            f"ls /srv/shares/{share}/ 2>/dev/null || true",
+        )
+        return _log(persona, "access_share", detail or f"Browsed {share} share", f"files:{share}")
+    async def _routine_login(self, persona, username, target, detail, _eb):
+        """Log into the web portal."""
+        # Find the login page from snapshot
+        login_pages = [p for p in self._pages if "login" in p or "index" in p]
+        page = login_pages[0] if login_pages else "/"
+        await self.containers.exec(
+            "web",
+            f'curl -s -o /dev/null -A "Mozilla/5.0 ({username})" '
+            f'-d "username={username}&password=placeholder" '
+            f'"http://localhost{page}"',
+        )
+        return _log(persona, "login", detail or "Portal login", "web:access_log")
+    async def _routine_query_db(self, persona, username, target, detail, _eb):
+        """Query the database -- uses tables that exist in this snapshot."""
+        import random
+        if self._db_tables:
+            table = random.choice(self._db_tables)
+            query = f"SELECT * FROM {table} LIMIT 5"
+        else:
+            query = "SHOW TABLES"
+        await self.containers.exec(
+            "db",
+            f'mysql -u app_user -p\'AppUs3r!2024\' -e "{query}" 2>/dev/null || true',
+        )
+        return _log(persona, "query_db", detail or f"Queried {target or 'database'}", "db:query_log")
+    async def _routine_idle(self, persona, username, target, detail, _eb):
+        return _log(persona, "idle", detail or "Away from desk", "none")
+    # ------------------------------------------------------------------
+    # Reactive actions (response to stimuli from Red)
+    # ------------------------------------------------------------------
+    async def execute(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
+        """Execute a reactive NPC action (response to stimulus)."""
+        handler = {
+            "click_link": self._react_click,
+            "open_attachment": self._react_click,
+            "reply": self._react_email,
+            "forward": self._react_email,
+            "share_credentials": self._react_share_creds,
+            "report_to_IT": self._react_report,
+            "ignore": self._react_ignore,
+        }.get(action.action, self._react_ignore)
+        return await handler(persona, action)
+    async def _react_click(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
+        url = "http://localhost/"
+        for effect in action.side_effects:
+            urls = re.findall(r"https?://\S+", effect)
+            if urls:
+                url = urls[0].rstrip(".")
+                break
+        username = _username_from_persona(persona)
+        await self.containers.exec(
+            "web",
+            f'curl -s -o /dev/null -A "Mozilla/5.0 ({username})" "{url}"',
+        )
+        return _log(persona, "click_link", f"Clicked: {url}", "web:access_log")
+    async def _react_email(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
+        username = _username_from_persona(persona)
+        ts_i = int(time.time())
+        body = (action.response_content or "acknowledged")[:500]
+        await self.containers.exec(
+            "mail",
+            f"mkdir -p /var/mail/{username} "
+            f"&& echo 'From: {username}@{self._domain}\\nSubject: Re\\n\\n{body}' "
+            f"> /var/mail/{username}/sent_{ts_i}.eml",
+        )
+        return _log(persona, action.action, "Replied to message", "mail:spool")
+    async def _react_share_creds(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
+        username = _username_from_persona(persona)
+        content = action.response_content or f"username: {username}"
+        ts_i = int(time.time())
+        # Leaked creds file
+        await self.containers.exec("web", f"echo '{content}' >> /tmp/leaked_{ts_i}.txt")
+        # Suspicious login
+        await self.containers.exec(
+            "web",
+            f'curl -s -o /dev/null -A "Mozilla/5.0 (external)" '
+            f'-d "username={username}&password=leaked" "http://localhost/"',
+        )
+        # SIEM alert
+        await self.containers.exec(
+            "siem",
+            f'echo "[$(date)] CRED-LEAK: {persona.name} shared credentials" '
+            f">> /var/log/siem/consolidated/all.log",
+        )
+        return _log(persona, "share_credentials", f"{persona.name} leaked credentials", "web+siem")
+    async def _react_report(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
+        detail = "; ".join(action.side_effects) if action.side_effects else "suspicious activity"
+        await self.containers.exec(
+            "siem",
+            f'echo "[$(date)] NPC-REPORT: {persona.name}: {detail}" '
+            f">> /var/log/siem/consolidated/all.log",
+        )
+        return _log(persona, "report_to_IT", detail, "siem:alert")
+    async def _react_ignore(self, persona: NPCPersona, action: NPCAction) -> dict[str, Any]:
+        return _log(persona, "ignore", "Ignored stimulus", "none")
+# ---------------------------------------------------------------------------
+# Snapshot introspection -- derive available targets from the generated env
+# ---------------------------------------------------------------------------
+def _extract_web_pages(snapshot: SnapshotSpec) -> list[str]:
+    """Extract URL paths from snapshot files dict (web:*.php -> /path)."""
+    pages: list[str] = []
+    for key in snapshot.files:
+        if not key.startswith("web:"):
+            continue
+        path = key.split(":", 1)[1]
+        # Convert filesystem path to URL path
+        if "/var/www/" in path and path.endswith(".php"):
+            url_path = path.replace("/var/www/portal", "").replace("/var/www/html", "")
+            if url_path:
+                pages.append(url_path)
+    return pages or ["/"]
+def _extract_shares(snapshot: SnapshotSpec) -> list[str]:
+    """Extract Samba share names from snapshot files dict."""
+    shares: set[str] = set()
+    for key in snapshot.files:
+        if not key.startswith("files:"):
+            continue
+        path = key.split(":", 1)[1]
+        # /srv/shares/<share_name>/file.txt -> share_name
+        if "/srv/shares/" in path:
+            parts = path.split("/srv/shares/")[1].split("/")
+            if parts:
+                shares.add(parts[0])
+    return list(shares) or ["general"]
+def _extract_db_tables(snapshot: SnapshotSpec) -> list[str]:
+    """Extract table names from SQL in the snapshot files dict."""
+    tables: set[str] = set()
+    for key, content in snapshot.files.items():
+        if key != "db:sql":
+            continue
+        # Find table names following INSERT INTO, FROM, or UPDATE
+        for match in re.finditer(r"(?:INSERT INTO|FROM|UPDATE)\s+(\w+\.?\w*)", content, re.IGNORECASE):
+            table = match.group(1)
+            # Skip system schemas, including qualified names such as mysql.user
+            if table.lower().split(".")[0] not in ("information_schema", "mysql", "performance_schema"):
+                tables.add(table)
+    return list(tables)
+def _extract_users(snapshot: SnapshotSpec) -> list[str]:
+    """Extract usernames from topology."""
+    users = snapshot.topology.get("users", [])
+    return [u["username"] for u in users if isinstance(u, dict) and "username" in u]
+def _username_from_persona(persona: NPCPersona) -> str:
+    email = persona.accounts.get("email", "")
+    if "@" in email:
+        return email.split("@")[0]
+    return persona.name.lower().split()[0]
+def _log(persona: NPCPersona, action: str, detail: str, source: str) -> dict[str, Any]:
+    return {
+        "timestamp": time.time(),
+        "type": f"npc_{action}",
+        "persona": persona.name,
+        "department": persona.department,
+        "action": action,
+        "detail": detail,
+        "source": source,
+    }
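
Reviewer sketch showing what the snapshot introspection helpers above derive from a `files` mapping. The two functions are simplified standalone copies that take a plain dict instead of a `SnapshotSpec` (an assumption for illustration), and the table helper omits the system-schema filter. The sample keys reuse page and table names that appear in the traffic scripts in this commit.

```python
from __future__ import annotations

import re


def extract_web_pages(files: dict[str, str]) -> list[str]:
    """Map web:*.php file keys to URL paths, as _extract_web_pages does."""
    pages: list[str] = []
    for key in files:
        if not key.startswith("web:"):
            continue
        path = key.split(":", 1)[1]
        if "/var/www/" in path and path.endswith(".php"):
            url_path = path.replace("/var/www/portal", "").replace("/var/www/html", "")
            if url_path:
                pages.append(url_path)
    return pages or ["/"]


def extract_db_tables(files: dict[str, str]) -> set[str]:
    """Collect names following INSERT INTO, FROM, or UPDATE in the db:sql entry."""
    sql = files.get("db:sql", "")
    pattern = r"(?:INSERT INTO|FROM|UPDATE)\s+(\w+\.?\w*)"
    return {m.group(1) for m in re.finditer(pattern, sql, re.IGNORECASE)}


files = {
    "web:/var/www/portal/index.php": "<?php /* login */ ?>",
    "web:/var/www/portal/api/referral_status.php": "<?php /* api */ ?>",
    "web:/var/www/portal/style.css": "body {}",
    "db:sql": "INSERT INTO patients (id) VALUES (1); SELECT * FROM referral_db.billing;",
}
pages = extract_web_pages(files)
tables = extract_db_tables(files)
```

With this input, `pages` contains only the two `.php` paths (the stylesheet is skipped) and `tables` contains `patients` and `referral_db.billing`. Because the regex matches any word after `FROM`, SQL that uses `FROM` in other positions could yield names that are not tables.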

src/open_range/builder/npc/db_traffic.sh CHANGED

@@ -15,16 +15,18 @@ RATE_LAMBDA="${RATE_LAMBDA:-20}"
 INTERVAL=$(awk "BEGIN {printf \"%.1f\", 60.0 / $RATE_LAMBDA}")
 # Application-level queries that a normal app would run
 QUERIES=(
-    "SELECT id, username FROM app.users LIMIT 5"
-    "SELECT name, price FROM app.products ORDER BY RAND() LIMIT 3"
-    "SELECT COUNT(*) FROM app.sessions WHERE active=1"
-    "INSERT INTO app.access_log (user_id, page, ts) VALUES (1, '/dashboard', NOW())"
-    "SELECT * FROM app.products WHERE category='electronics'"
-    "UPDATE app.sessions SET last_seen=NOW() WHERE user_id=1"
-    "SELECT username, last_login FROM app.users WHERE last_login > DATE_SUB(NOW(), INTERVAL 1 HOUR)"
-    "SELECT page, COUNT(*) AS hits FROM app.access_log GROUP BY page ORDER BY hits DESC LIMIT 5"
 )
 # App database credentials (non-privileged)

 INTERVAL=$(awk "BEGIN {printf \"%.1f\", 60.0 / $RATE_LAMBDA}")
+DB_NAME="referral_db"
 # Application-level queries that a normal app would run
 QUERIES=(
+    "SELECT id, first_name, last_name FROM ${DB_NAME}.patients LIMIT 5"
+    "SELECT id, status, specialist FROM ${DB_NAME}.patient_referrals ORDER BY created_at DESC LIMIT 3"
+    "SELECT COUNT(*) FROM ${DB_NAME}.patient_referrals WHERE status='Pending'"
+    "SELECT id, amount_due, status FROM ${DB_NAME}.billing WHERE status='Open'"
+    "SELECT username, role, department FROM ${DB_NAME}.users LIMIT 10"
+    "UPDATE ${DB_NAME}.billing SET last_updated=CURDATE() WHERE id=5001"
+    "SELECT p.first_name, p.last_name, r.status FROM ${DB_NAME}.patients p JOIN ${DB_NAME}.patient_referrals r ON p.id=r.patient_id LIMIT 5"
+    "INSERT INTO ${DB_NAME}.access_log (user_id, action, ip) VALUES (3, 'view_referrals', '10.0.1.10')"
 )
 # App database credentials (non-privileged)
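
Reviewer sketch of the `INTERVAL` arithmetic used by both traffic scripts: the per-minute rate in `RATE_LAMBDA` is converted into a sleep interval in seconds, formatted to one decimal place. `request_interval_seconds` is a hypothetical name, and the default of 20 is taken from the hunk header above.

```python
def request_interval_seconds(rate_per_minute: float) -> str:
    """Mirror the awk expression: printf "%.1f", 60.0 / RATE_LAMBDA."""
    return f"{60.0 / rate_per_minute:.1f}"


# Default RATE_LAMBDA of 20 requests per minute.
interval = request_interval_seconds(20)
```

With the default rate the interval is `3.0` seconds. As shown in this hunk the interval is a fixed value; whether the scripts add jitter elsewhere is not visible in the diff.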

src/open_range/builder/npc/http_traffic.sh CHANGED

@@ -20,23 +20,22 @@ INTERVAL=$(awk "BEGIN {printf \"%.1f\", 60.0 / $RATE_LAMBDA}")
 PAGES=(
     "/"
     "/index.php"
-    "/about"
-    "/products"
-    "/contact"
-    "/login"
-    "/api/users/1/profile"
-    "/search?q=quarterly+report"
-    "/search?q=product+catalog"
-    "/download?file=report.pdf"
-    "/dashboard"
-    "/help"
 )
 # Common form submissions
 FORMS=(
-    "-d 'username=jsmith&password=Welcome1' http://${WEB_HOST}/login"
-    "-d 'q=marketing+materials' http://${WEB_HOST}/search"
-    "-d 'name=Janet&email=jsmith@acmecorp.local&message=Hello' http://${WEB_HOST}/contact"
 )
 echo "[NPC-HTTP] Starting HTTP traffic to ${WEB_HOST} at ${RATE_LAMBDA} req/min"

 PAGES=(
     "/"
     "/index.php"
+    "/dashboard.php"
+    "/lookup.php?last=Smith"
+    "/lookup.php?last=Hart"
+    "/lookup.php?last=Bishop"
+    "/referral_submit.php"
+    "/api/referral_status.php?id=1001"
+    "/api/referral_status.php?id=1002"
+    "/admin/compliance_report.php"
+    "/logout.php"
 )
 # Common form submissions
 FORMS=(
+    "-d 'username=mgarcia&password=Welcome2024!' http://${WEB_HOST}/index.php"
+    "-d 'username=kwilliams&password=Welcome2024!' http://${WEB_HOST}/index.php"
+    "-d 'patient_id=1&clinic=Northside&specialist=Dr.Patel&diagnosis=Cardiology' http://${WEB_HOST}/referral_submit.php"
 )
 echo "[NPC-HTTP] Starting HTTP traffic to ${WEB_HOST} at ${RATE_LAMBDA} req/min"

src/open_range/builder/npc/npc_agent.py CHANGED

@@ -1,8 +1,10 @@
 """LLM-driven NPC agent (Level 1).
-Each NPC has a persona card and polls for incoming stimuli (emails, chat
-messages) on a configurable interval. The agent decides how to respond
-using an LLM call via LiteLLM.
 """
 from __future__ import annotations
@@ -11,40 +13,61 @@ import asyncio
 import json
 import logging
 import os
 from typing import Any
 import litellm
-from open_range.protocols import ContainerSet, NPCAction, NPCPersona, Stimulus
 logger = logging.getLogger(__name__)
-NPC_SYSTEM_PROMPT = """\
-You are simulating an employee in a corporate environment. You will receive \
-your persona card and an incoming stimulus (email, chat message, etc.).
-Based on your persona's security_awareness and susceptibility profile, decide \
-how to respond. You must stay in character.
-Return ONLY valid JSON:
 {
-  "action": "<click_link|open_attachment|reply|share_credentials|ignore|report_to_IT|forward>",
-  "response_content": "<your reply text if action is reply/forward, empty otherwise>",
-  "side_effects": ["<description of side effect>"]
 }
 Guidelines:
-- High security_awareness (>0.7): suspicious of unusual requests, verify sender, \
-  report phishing attempts.
-- Low security_awareness (<0.3): trusting, clicks links readily, may share \
-  credentials if asked politely.
-- Always consider the stimulus plausibility and your susceptibility profile.
-- Never reveal that you are an AI or break character.
 """
 class LLMNPCAgent:
-    """Async LLM NPC agent that responds to stimuli based on persona."""
     def __init__(
         self,
@@ -52,137 +75,248 @@ class LLMNPCAgent:
         temperature: float = 0.3,
     ) -> None:
         self.model = model or os.environ.get(
-            "OPENRANGE_NPC_MODEL", "anthropic/claude-haiku-4-5-20251001"
         )
-        self.temperature = temperature
-    async def decide(
-        self,
-        persona: NPCPersona,
-        stimulus: Stimulus,
-    ) -> NPCAction:
-        """Decide how an NPC responds to a stimulus via LLM.
-        This satisfies the NPCBehavior protocol.
-        """
         try:
-            response = await litellm.acompletion(
-                model=self.model,
-                messages=[
-                    {"role": "system", "content": NPC_SYSTEM_PROMPT},
-                    {
-                        "role": "user",
-                        "content": json.dumps(
-                            {
-                                "persona": persona.model_dump(),
-                                "stimulus": stimulus.model_dump(),
-                            }
-                        ),
-                    },
-                ],
-                response_format={"type": "json_object"},
-                temperature=self.temperature,
             )
             raw = json.loads(response.choices[0].message.content)
             return NPCAction(
                 action=raw.get("action", "ignore"),
                 response_content=raw.get("response_content", ""),
                 side_effects=raw.get("side_effects", []),
             )
         except Exception as exc:
-            logger.warning(
-                "NPC %s LLM decision failed, defaulting to ignore: %s",
-                persona.name,
-                exc,
-            )
             return NPCAction(action="ignore")
     async def run_loop(
         self,
         persona: NPCPersona,
         containers: ContainerSet,
     ) -> None:
-        """Run the NPC agent loop, polling for stimuli on the persona's schedule.
-        This loop runs as an asyncio task, checking for incoming emails
-        and processing them according to the persona's schedule.
         """
-        interval = persona.routine.get("email_check_interval_min", 15)
-        interval_s = interval * 60
         logger.info(
-            "NPC %s starting loop (check every %d min)",
-            persona.name,
-            interval,
         )
         while True:
             try:
-                await asyncio.sleep(interval_s)
-                # In a full implementation, this would:
-                # 1. docker exec into mail container to check persona's mailbox
-                # 2. Parse new emails into Stimulus objects
-                # 3. Call self.decide() for each stimulus
-                # 4. Execute side effects (click links, reply, etc.)
-                #
-                # For now, the loop just keeps the task alive.
-                # The actual stimulus injection happens when Red sends
-                # phishing emails via the environment's step() method.
-                logger.debug("NPC %s checked mailbox (no new stimuli)", persona.name)
             except asyncio.CancelledError:
-                logger.info("NPC %s loop cancelled", persona.name)
                 break
             except Exception as exc:
                 logger.warning("NPC %s loop error: %s", persona.name, exc)
-                await asyncio.sleep(30)  # back off on error
 class NullNPCBehavior:
-    """No-op NPC behavior for Level 0 (shell scripts handle everything)."""
-    async def decide(
-        self,
-        persona: NPCPersona,
-        stimulus: Stimulus,
-    ) -> NPCAction:
-        """Always ignore -- Level 0 NPCs don't process stimuli."""
         return NPCAction(action="ignore")
 class RuleBasedNPCBehavior:
-    """Heuristic NPC decisions based on susceptibility scores. No LLM calls."""
-    async def decide(
-        self,
-        persona: NPCPersona,
-        stimulus: Stimulus,
-    ) -> NPCAction:
-        """Decide based on persona susceptibility and stimulus plausibility."""
-        # Get the susceptibility score for this stimulus type
         susceptibility = persona.susceptibility.get(
-            f"{stimulus.type}", persona.susceptibility.get("phishing_email", 0.5)
         )
         score = stimulus.plausibility * susceptibility
         if persona.security_awareness > 0.7 and score < 0.8:
-            return NPCAction(
-                action="report_to_IT",
-                side_effects=["reported suspicious email to IT"],
-            )
         elif score > 0.6:
-            return NPCAction(
-                action="click_link",
-                side_effects=["clicked link in email"],
-            )
         elif score > 0.3:
             return NPCAction(action="ignore")
         else:
-            return NPCAction(
-                action="report_to_IT",
-                side_effects=["forwarded suspicious email to security team"],
-            )

 """LLM-driven NPC agent (Level 1).
+Each NPC autonomously lives their workday -- browsing pages, emailing
+colleagues, querying records, accessing shares.  Available actions are
+derived from the SnapshotSpec so they adapt to whatever environment the
+Builder LLM generated.  NPCs also react to incoming stimuli (phishing,
+social engineering) based on their security_awareness profile.
 """
 from __future__ import annotations
 import asyncio
 import json
 import logging
 import os
+import random
+import time
 from typing import Any
 import litellm
+from open_range.protocols import ContainerSet, NPCAction, NPCPersona, SnapshotSpec, Stimulus
 logger = logging.getLogger(__name__)
+# ---------------------------------------------------------------------------
+# Prompts
+# ---------------------------------------------------------------------------
+NPC_ROUTINE_PROMPT = """\
+You are simulating a corporate employee going about their normal workday. \
+You will receive your persona, the available resources in this environment, \
+and your recent activity.
+Pick the NEXT natural work action. Stay in character for your role.
+Return valid JSON:
 {
+  "action": "<browse|send_email|lookup|access_share|login|query_db|idle>",
+  "target": "<a specific target from the available resources>",
+  "detail": "<1-sentence description of what you're doing and why>",
+  "email_body": "<if send_email, the message; otherwise empty>"
 }
 Guidelines:
+- Pick targets from the AVAILABLE RESOURCES provided (real pages, shares, users)
+- Vary your actions -- don't repeat the same thing twice in a row
+- Be mundane and realistic: checking records, status updates, reviewing docs
+- Your role determines what you'd naturally do (clinical staff look up records, \
+  admin checks billing, IT reviews systems, executives read reports)
+"""
+NPC_REACT_PROMPT = """\
+You are simulating an employee who received an unexpected message. \
+Based on your security_awareness and susceptibility, decide how to respond. \
+Stay in character. Return valid JSON:
+{
+  "action": "<click_link|open_attachment|reply|share_credentials|ignore|report_to_IT|forward>",
+  "response_content": "<your reply text if replying, otherwise empty>",
+  "side_effects": ["<what happens as a result>"]
+}
+- security_awareness > 0.7: verify sender, check URLs, report suspicious messages
+- security_awareness < 0.3: trusting, clicks links, may share credentials if asked
 """
 class LLMNPCAgent:
+    """Async NPC agent that autonomously lives its workday via LLM."""
     def __init__(
         self,
         temperature: float = 0.3,
     ) -> None:
         self.model = model or os.environ.get(
+            "OPENRANGE_NPC_MODEL", "azure/gpt-5.2-codex"
         )
+        if "codex" in self.model.lower():
+            self.temperature: float | None = None
+        else:
+            self.temperature = temperature
+        self._actions: list[dict[str, Any]] = []
+    def get_actions(self) -> list[dict[str, Any]]:
+        """Return all recorded NPC actions for SIEM consumption."""
+        return list(self._actions)
+    # ------------------------------------------------------------------
+    # Reactive: respond to external stimulus
+    # ------------------------------------------------------------------
+    async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
+        """Decide how to respond to a stimulus (NPCBehavior protocol)."""
         try:
+            user_payload = (
+                "Respond as this NPC employee in valid JSON.\n\n"
+                + json.dumps({
+                    "persona": persona.model_dump(),
+                    "stimulus": stimulus.model_dump(),
+                })
             )
+            kwargs: dict[str, Any] = {
+                "model": self.model,
+                "messages": [
+                    {"role": "system", "content": NPC_REACT_PROMPT},
+                    {"role": "user", "content": user_payload},
+                ],
+                "response_format": {"type": "json_object"},
+            }
+            if self.temperature is not None:
+                kwargs["temperature"] = self.temperature
+            response = await litellm.acompletion(**kwargs)
             raw = json.loads(response.choices[0].message.content)
             return NPCAction(
                 action=raw.get("action", "ignore"),
                 response_content=raw.get("response_content", ""),
                 side_effects=raw.get("side_effects", []),
             )
         except Exception as exc:
+            logger.warning("NPC %s react failed: %s", persona.name, exc)
             return NPCAction(action="ignore")
+    # ------------------------------------------------------------------
+    # Proactive: what to do next at work (derived from snapshot)
+    # ------------------------------------------------------------------
+    async def next_routine_action(
+        self, persona: NPCPersona, env_context: dict[str, Any],
+    ) -> dict[str, str]:
+        """Ask LLM what this NPC would naturally do next.
+        env_context contains available_pages, available_shares, etc.
+        derived from the SnapshotSpec so the LLM picks real targets.
+        """
+        recent = [
+            f"{a.get('action','?')}: {a.get('detail','')}"
+            for a in self._actions[-5:]
+        ]
+        try:
+            user_payload = (
+                "Pick this employee's next work action in valid JSON.\n\n"
+                + json.dumps({
+                    "persona": {
+                        "name": persona.name,
+                        "role": persona.role,
+                        "department": persona.department,
+                    },
+                    "available_resources": env_context,
+                    "recent_actions": recent,
+                })
+            )
+            kwargs: dict[str, Any] = {
+                "model": self.model,
+                "messages": [
+                    {"role": "system", "content": NPC_ROUTINE_PROMPT},
+                    {"role": "user", "content": user_payload},
+                ],
+                "response_format": {"type": "json_object"},
+            }
+            if self.temperature is not None:
+                kwargs["temperature"] = self.temperature
+            response = await litellm.acompletion(**kwargs)
+            return json.loads(response.choices[0].message.content)
+        except Exception as exc:
+            logger.debug("NPC %s routine LLM failed: %s", persona.name, exc)
+            return _fallback_action(persona, env_context)
+    # ------------------------------------------------------------------
+    # Main loop
+    # ------------------------------------------------------------------
     async def run_loop(
         self,
         persona: NPCPersona,
         containers: ContainerSet,
+        snapshot: SnapshotSpec,
     ) -> None:
+        """Run the NPC's autonomous workday.
+        Each cycle:
+        1. Pick and execute a routine work action
+        2. Check mailbox for incoming stimuli (phishing)
+        3. React to any stimuli found
         """
+        from open_range.builder.npc.actions import NPCActionExecutor
+        executor = NPCActionExecutor(containers, snapshot)
+        # Build environment context once from snapshot
+        env_context = {
+            "pages": executor._pages,
+            "shares": executor._shares,
+            "db_tables": executor._db_tables,
+            "colleagues": executor._users,
+        }
+        email_acct = persona.accounts.get("email", "")
+        mail_user = (
+            email_acct.split("@")[0]
+            if "@" in email_acct
+            else persona.name.lower().split()[0]
+        )
+        base_interval = persona.routine.get("action_interval_min", 2)
+        interval_s = base_interval * 60
         logger.info(
+            "NPC %s (%s) starting workday (every %dm, %d pages, %d shares)",
+            persona.name, persona.role, base_interval,
+            len(env_context["pages"]), len(env_context["shares"]),
         )
         while True:
             try:
+                # --- Phase 1: Routine work action ---
+                routine = await self.next_routine_action(persona, env_context)
+                log_entry = await executor.execute_routine(
+                    persona,
+                    routine.get("action", "idle"),
+                    routine.get("target", ""),
+                    routine.get("detail", ""),
+                    routine.get("email_body", ""),
+                )
+                self._actions.append(log_entry)
+                logger.debug("NPC %s: %s", persona.name, log_entry.get("detail", ""))
+                # --- Phase 2: Check mailbox ---
+                try:
+                    mail_output = await containers.exec(
+                        "mail",
+                        f"find /var/mail/{mail_user} "
+                        f"-newer /tmp/.npc_check_{mail_user} "
+                        f"-type f 2>/dev/null | head -1",
+                    )
+                    await containers.exec("mail", f"touch /tmp/.npc_check_{mail_user}")
+                    if mail_output and mail_output.strip():
+                        email_file = mail_output.strip().split("\n")[0]
+                        content = await containers.exec(
+                            "mail", f"head -50 '{email_file}' 2>/dev/null || true",
+                        )
+                        if content and content.strip():
+                            stimulus = Stimulus(
+                                type="email", sender="unknown",
+                                subject="Incoming message",
+                                content=content[:500],
+                            )
+                            react = await self.decide(persona, stimulus)
+                            react_log = await executor.execute(persona, react)
+                            react_log["stimulus_type"] = "email"
+                            react_log["reactive"] = True
+                            self._actions.append(react_log)
+                except Exception as mail_exc:
+                    logger.debug("NPC %s mail check: %s", persona.name, mail_exc)
+                # --- Sleep with jitter ---
+                await asyncio.sleep(interval_s * random.uniform(0.7, 1.3))
             except asyncio.CancelledError:
+                logger.info("NPC %s workday ended", persona.name)
                 break
             except Exception as exc:
                 logger.warning("NPC %s loop error: %s", persona.name, exc)
+                await asyncio.sleep(30)
+# ---------------------------------------------------------------------------
+# Fallback routine (no LLM, picks from snapshot-derived resources)
+# ---------------------------------------------------------------------------
+def _fallback_action(persona: NPCPersona, env: dict[str, Any]) -> dict[str, str]:
+    """Pick a routine action without LLM, using available resources."""
+    pages = env.get("pages", ["/"])
+    shares = env.get("shares", ["general"])
+    colleagues = env.get("colleagues", [])
+    actions = [
+        {"action": "browse", "target": random.choice(pages) if pages else "/", "detail": "Checking portal"},
+        {"action": "browse", "target": random.choice(pages) if pages else "/", "detail": "Reviewing page"},
+        {"action": "idle", "target": "", "detail": "Reading documents at desk"},
+    ]
+    if shares:
+        actions.append({"action": "access_share", "target": random.choice(shares), "detail": "Checking files"})
+    if colleagues:
+        actions.append({"action": "send_email", "target": random.choice(colleagues), "detail": "Status update", "email_body": "Quick check-in on today's items."})
+    return random.choice(actions)
+# ---------------------------------------------------------------------------
+# Simpler behavior classes (Level 0, no LLM)
+# ---------------------------------------------------------------------------
 class NullNPCBehavior:
+    """No-op NPC behavior for Level 0."""
+    async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
         return NPCAction(action="ignore")
 class RuleBasedNPCBehavior:
+    """Heuristic NPC decisions based on susceptibility scores."""
+    async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
         susceptibility = persona.susceptibility.get(
+            stimulus.type, persona.susceptibility.get("phishing_email", 0.5)
         )
         score = stimulus.plausibility * susceptibility
         if persona.security_awareness > 0.7 and score < 0.8:
+            return NPCAction(action="report_to_IT", side_effects=["reported suspicious email to IT"])
         elif score > 0.6:
+            return NPCAction(action="click_link", side_effects=["clicked link in email"])
         elif score > 0.3:
             return NPCAction(action="ignore")
         else:
+            return NPCAction(action="report_to_IT", side_effects=["forwarded to security team"])
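
The branch order in `RuleBasedNPCBehavior.decide` is easier to check with concrete numbers. The sketch below mirrors the thresholds shown in the diff, but assumes plain floats in place of `NPCPersona` and `Stimulus`; the `rule_based_action` name and its parameters are hypothetical and not part of the commit.

```python
from __future__ import annotations


def rule_based_action(
    security_awareness: float,
    susceptibility: float,
    plausibility: float,
) -> str:
    """Mirror the branch order of RuleBasedNPCBehavior.decide on plain floats."""
    score = plausibility * susceptibility
    # A security-aware persona reports anything below a very convincing lure.
    if security_awareness > 0.7 and score < 0.8:
        return "report_to_IT"
    if score > 0.6:
        return "click_link"
    if score > 0.3:
        return "ignore"
    # Scores of 0.3 or below fall through to a report, as in the diff.
    return "report_to_IT"


if __name__ == "__main__":
    print(rule_based_action(0.9, 0.5, 0.9))  # aware persona, score 0.45
    print(rule_based_action(0.2, 0.9, 0.9))  # trusting persona, score 0.81
    print(rule_based_action(0.2, 0.5, 0.8))  # trusting persona, score 0.40
    print(rule_based_action(0.2, 0.2, 0.5))  # trusting persona, score 0.10
```

Under these thresholds only mid-range scores (above 0.3, up to 0.6) are ignored by a low-awareness persona, and a highly aware persona can still click when the score reaches 0.8.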

src/open_range/cli.py CHANGED

@@ -105,6 +105,23 @@ def _write_snapshot(spec: "SnapshotSpec", output_dir: Path) -> Path:
     return dest
 # ---------------------------------------------------------------------------
 # CLI group
 # ---------------------------------------------------------------------------
@@ -185,6 +202,127 @@ def build(
     click.echo(f"  Elapsed: {elapsed:.1f}s")
 # ---------------------------------------------------------------------------
 # render
 # ---------------------------------------------------------------------------

     return dest
+def _parse_roles(raw: str) -> tuple[str, ...]:
+    """Parse a comma-separated role list."""
+    roles = tuple(dict.fromkeys(part.strip().lower() for part in raw.split(",") if part.strip()))
+    valid = {"red", "blue"}
+    invalid = [role for role in roles if role not in valid]
+    if invalid:
+        click.echo(
+            f"Error: invalid roles: {', '.join(invalid)}. Expected comma-separated values from: red, blue.",
+            err=True,
+        )
+        sys.exit(1)
+    if not roles:
+        click.echo("Error: at least one role must be selected.", err=True)
+        sys.exit(1)
+    return roles
 # ---------------------------------------------------------------------------
 # CLI group
 # ---------------------------------------------------------------------------
     click.echo(f"  Elapsed: {elapsed:.1f}s")
+# ---------------------------------------------------------------------------
+# synthetic-data
+# ---------------------------------------------------------------------------
+@cli.command("synthetic-data")
+@click.option("-o", "--output", required=True, type=click.Path(), help="Output JSONL path for synthetic trajectories.")
+@click.option("-m", "--manifest", default=None, type=click.Path(exists=True), help="Path to manifest YAML.")
+@click.option("-s", "--snapshot", default=None, type=click.Path(exists=True), help="Path to snapshot JSON.")
+@click.option("--num-traces", default=10, type=click.IntRange(1), help="Number of synthetic episodes to generate.")
+@click.option("--seed", default=None, type=int, help="Base random seed for reproducibility.")
+@click.option("--tier", default=1, type=click.IntRange(1, 5), help="Tier level 1-5 when building from a manifest.")
+@click.option("--max-steps", default=12, type=click.IntRange(1), help="Maximum red/blue turns per episode.")
+@click.option("--roles", default="red", help="Comma-separated teacher/export roles: red, blue.")
+@click.option("--reward-threshold", default=0.0, type=float, help="Minimum total role reward required for export.")
+@click.option("--teacher-model", default=None, help="LiteLLM teacher model. If omitted, selected roles use scripted agents.")
+@click.option("--red-model", default=None, help="Override model for Red teacher.")
+@click.option("--blue-model", default=None, help="Override model for Blue teacher.")
+@click.option("--temperature", default=0.2, type=float, help="Teacher sampling temperature.")
+@click.option("--max-tokens", default=512, type=int, help="Maximum completion tokens per teacher action.")
+@click.option("--template-only/--llm-builder", default=True, help="When using --manifest, build snapshots deterministically instead of via LLM.")
+@click.option("--builder-model", default=None, help="LLM builder model when using --llm-builder.")
+@click.option("--randomize-flags/--static-flags", default=True, help="Randomize flag values per synthetic episode.")
+def synthetic_data(
+    output: str,
+    manifest: str | None,
+    snapshot: str | None,
+    num_traces: int,
+    seed: int | None,
+    tier: int,
+    max_steps: int,
+    roles: str,
+    reward_threshold: float,
+    teacher_model: str | None,
+    red_model: str | None,
+    blue_model: str | None,
+    temperature: float,
+    max_tokens: int,
+    template_only: bool,
+    builder_model: str | None,
+    randomize_flags: bool,
+) -> None:
+    """Generate snapshot-grounded synthetic SFT trajectories."""
+    from open_range.training.synthetic import (
+        SyntheticTraceGenerator,
+        build_teacher_agents,
+    )
+    if bool(manifest) == bool(snapshot):
+        click.echo("Error: provide exactly one of --manifest or --snapshot.", err=True)
+        sys.exit(1)
+    selected_roles = _parse_roles(roles)
+    resolved_teacher_model = (
+        teacher_model
+        or os.environ.get("OPENRANGE_SYNTH_MODEL")
+    )
+    red_agent, blue_agent = build_teacher_agents(
+        teacher_model=resolved_teacher_model,
+        roles=selected_roles,
+        red_model=red_model,
+        blue_model=blue_model,
+        temperature=temperature,
+        max_tokens=max_tokens,
+    )
+    if snapshot:
+        source_label = f"snapshot={snapshot}"
+        generator = SyntheticTraceGenerator(
+            snapshot=_load_snapshot(snapshot),
+            red_agent=red_agent,
+            blue_agent=blue_agent,
+            tier=tier,
+            max_steps=max_steps,
+            randomize_flags=randomize_flags,
+        )
+    else:
+        source_label = f"manifest={manifest}"
+        generator = SyntheticTraceGenerator.from_manifest(
+            _load_manifest(str(manifest)),
+            red_agent=red_agent,
+            blue_agent=blue_agent,
+            template_only=template_only,
+            builder_model=builder_model,
+            tier=tier,
+            max_steps=max_steps,
+            randomize_flags=randomize_flags,
+        )
+    teacher_roles = []
+    if selected_roles:
+        if red_model or resolved_teacher_model:
+            if "red" in selected_roles:
+                teacher_roles.append("red")
+        if blue_model or resolved_teacher_model:
+            if "blue" in selected_roles:
+                teacher_roles.append("blue")
+    click.echo(f"Generating synthetic traces from {source_label} ...")
+    click.echo(f"  Roles: {', '.join(selected_roles)}")
+    click.echo(
+        "  Teacher roles: "
+        + (", ".join(teacher_roles) if teacher_roles else "none (scripted fallbacks)")
+    )
+    try:
+        logger, count = generator.export_jsonl(
+            output,
+            num_traces=num_traces,
+            seed=seed,
+            reward_threshold=reward_threshold,
+            roles=selected_roles,
+        )
+    except Exception as exc:
+        click.echo(f"Error: synthetic data generation failed: {exc}", err=True)
+        sys.exit(1)
+    click.echo(f"Wrote {count} JSONL records to {output}")
+    click.echo(f"  Episodes: {len(logger.episodes)}")
+    click.echo(f"  Randomized flags: {'yes' if randomize_flags else 'no'}")
 # ---------------------------------------------------------------------------
 # render
 # ---------------------------------------------------------------------------
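
The de-duplication in `_parse_roles` relies on `dict.fromkeys`, which keeps the first occurrence of each key in insertion order. The sketch below reproduces that parsing outside the CLI. It assumes a `ValueError` in place of the `click.echo` plus `sys.exit(1)` error path, and the `parse_roles` and `VALID_ROLES` names are hypothetical.

```python
from __future__ import annotations

VALID_ROLES = {"red", "blue"}


def parse_roles(raw: str) -> tuple[str, ...]:
    """Parse a comma-separated role list, dropping blanks and duplicates."""
    # dict.fromkeys preserves first-seen order while removing repeats.
    roles = tuple(
        dict.fromkeys(part.strip().lower() for part in raw.split(",") if part.strip())
    )
    invalid = [role for role in roles if role not in VALID_ROLES]
    if invalid:
        raise ValueError(f"invalid roles: {', '.join(invalid)}")
    if not roles:
        raise ValueError("at least one role must be selected")
    return roles


if __name__ == "__main__":
    print(parse_roles("Red, blue,red"))  # ('red', 'blue')
```

Raising an exception instead of exiting keeps the helper testable; a CLI wrapper could catch the `ValueError` and exit with status 1.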

src/open_range/training/__init__.py CHANGED

	@@ -0,0 +1,15 @@

+"""Training utilities for OpenRange."""
+from open_range.training.synthetic import (
+    SyntheticRangeEnvironment,
+    SyntheticTraceGenerator,
+    build_teacher_agents,
+    randomize_snapshot_flags,
+)
+__all__ = [
+    "SyntheticRangeEnvironment",
+    "SyntheticTraceGenerator",
+    "build_teacher_agents",
+    "randomize_snapshot_flags",
+]
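
`randomize_snapshot_flags`, re-exported here from `training/synthetic.py`, clones a snapshot with fresh flag values substituted throughout. The sketch below shows the same idea on a plain nested dict rather than a `SnapshotSpec`. The `deep_replace` and `randomize_flags` names and the payload shape are assumptions made for illustration only.

```python
from __future__ import annotations

import random
from typing import Any

HEX = "abcdef0123456789"


def deep_replace(value: Any, replacements: dict[str, str]) -> Any:
    """Recursively substitute strings inside nested lists and dict values."""
    if isinstance(value, str):
        for old, new in replacements.items():
            value = value.replace(old, new)
        return value
    if isinstance(value, list):
        return [deep_replace(item, replacements) for item in value]
    if isinstance(value, dict):
        return {key: deep_replace(item, replacements) for key, item in value.items()}
    return value


def randomize_flags(
    payload: dict[str, Any], flags: list[str], seed: int | None = None
) -> dict[str, Any]:
    """Return a rewritten copy of payload with each flag swapped for a seeded one."""
    rng = random.Random(seed)  # a private RNG keeps results reproducible per seed
    replacements = {
        flag: "FLAG{" + "".join(rng.choice(HEX) for _ in range(16)) + "}"
        for flag in flags
    }
    return deep_replace(payload, replacements)


if __name__ == "__main__":
    # Hypothetical payload: the flag appears both in the flag list and in a file body.
    payload = {
        "flags": [{"value": "FLAG{static}"}],
        "files": {"/root/flag.txt": "FLAG{static}\n"},
    }
    print(randomize_flags(payload, ["FLAG{static}"], seed=7))
```

Because every string value is rewritten with the same mapping, a flag embedded in file contents stays consistent with the flag list. Only values are rewritten; dictionary keys are left untouched.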

src/open_range/training/synthetic.py ADDED

	@@ -0,0 +1,717 @@

+"""Synthetic trajectory generation for OpenRange.
+This module provides a fast, snapshot-backed simulator for collecting
+teacher-model trajectories without booting Docker containers. It is meant
+for SFT warm-start data generation, not reward-faithful evaluation.
+"""
+from __future__ import annotations
+import asyncio
+import logging
+import random
+import re
+import shlex
+from pathlib import Path
+from typing import Any
+from open_range.agents.llm_agent import LLMRangeAgent
+from open_range.agents.protocol import RangeAgent
+from open_range.agents.scripted_agent import ScriptedBlueAgent, ScriptedRedAgent
+from open_range.builder.builder import LLMSnapshotBuilder, TemplateOnlyBuilder
+from open_range.protocols import BuildContext, SnapshotBuilder, SnapshotSpec, Vulnerability
+from open_range.server.environment import RangeEnvironment
+from open_range.server.models import RangeAction, RangeObservation
+from open_range.training.trajectory import TrajectoryLogger
+logger = logging.getLogger(__name__)
+_TOKEN_RE = re.compile(r"[a-z0-9_./:-]+")
+def _run_async(coro: Any) -> Any:
+    """Run an async coroutine from synchronous code."""
+    try:
+        loop = asyncio.get_running_loop()
+    except RuntimeError:
+        loop = None
+    if loop and loop.is_running():
+        import concurrent.futures
+        with concurrent.futures.ThreadPoolExecutor() as pool:
+            return pool.submit(asyncio.run, coro).result()
+    return asyncio.run(coro)
+def _iter_hosts(snapshot: SnapshotSpec) -> list[str]:
+    raw_hosts = snapshot.topology.get("hosts", [])
+    hosts: list[str] = []
+    for host in raw_hosts:
+        if isinstance(host, dict):
+            name = str(host.get("name", "")).strip()
+        else:
+            name = str(host).strip()
+        if name:
+            hosts.append(name)
+    return hosts
+def _deep_replace(value: Any, replacements: dict[str, str]) -> Any:
+    if isinstance(value, str):
+        result = value
+        for old, new in replacements.items():
+            result = result.replace(old, new)
+        return result
+    if isinstance(value, list):
+        return [_deep_replace(item, replacements) for item in value]
+    if isinstance(value, dict):
+        return {key: _deep_replace(item, replacements) for key, item in value.items()}
+    return value
+def randomize_snapshot_flags(snapshot: SnapshotSpec, seed: int | None = None) -> SnapshotSpec:
+    """Clone *snapshot* with unique flag values substituted throughout."""
+    if not snapshot.flags:
+        return snapshot.model_copy(deep=True)
+    rng = random.Random(seed)
+    replacements: dict[str, str] = {}
+    for flag in snapshot.flags:
+        inner = "".join(rng.choice("abcdef0123456789") for _ in range(16))
+        replacements[flag.value] = f"FLAG{{{inner}}}"
+    payload = snapshot.model_dump(mode="python")
+    payload = _deep_replace(payload, replacements)
+    return SnapshotSpec.model_validate(payload)
+def _observation_text(observation: str | RangeObservation) -> str:
+    """Convert an observation into training text without reward leakage."""
+    if isinstance(observation, str):
+        return observation
+    parts: list[str] = []
+    if observation.stdout:
+        parts.append(observation.stdout)
+    if observation.stderr:
+        parts.append(f"STDERR:\n{observation.stderr}")
+    if observation.alerts:
+        parts.append("ALERTS:\n" + "\n".join(f"- {alert}" for alert in observation.alerts))
+    if observation.flags_captured:
+        parts.append(
+            "FLAGS CAPTURED:\n"
+            + "\n".join(f"- {flag}" for flag in observation.flags_captured)
+        )
+    return "\n\n".join(parts)
+class SyntheticRangeEnvironment(RangeEnvironment):
+    """Fast, deterministic simulator built from a ``SnapshotSpec``."""
+    def __init__(
+        self,
+        *,
+        randomize_flags: bool = True,
+        max_steps: int = 30,
+    ) -> None:
+        super().__init__(docker_available=False, max_steps=max_steps)
+        self._randomize_flags = randomize_flags
+        self._synthetic_seed: int | None = None
+        self._ephemeral_files: dict[str, str] = {}
+    def reset(
+        self,
+        seed: int | None = None,
+        episode_id: str | None = None,
+        **kwargs: Any,
+    ) -> RangeObservation:
+        self._synthetic_seed = seed
+        self._ephemeral_files = {}
+        return super().reset(seed=seed, episode_id=episode_id, **kwargs)
+    def _select_snapshot(self, **kwargs: Any) -> SnapshotSpec:
+        snapshot = super()._select_snapshot(**kwargs)
+        if not self._randomize_flags:
+            return snapshot.model_copy(deep=True)
+        return randomize_snapshot_flags(snapshot, seed=self._synthetic_seed)
+    def _exec_in_container(
+        self,
+        container_name: str,
+        command: str,
+        timeout_s: float | None = None,
+    ) -> tuple[str, str]:
+        del container_name, timeout_s  # unused in the synthetic executor
+        if self._snapshot is None:
+            return "", "No snapshot loaded"
+        if self._state.mode == "blue":
+            return self._simulate_blue_command(command)
+        return self._simulate_red_command(command)
+    def _simulate_red_command(self, command: str) -> tuple[str, str]:
+        normalized = command.strip().lower()
+        if not normalized:
+            return "", "Empty command"
+        exact_step = self._match_golden_step(command)
+        if exact_step is not None:
+            return self._render_golden_output(command, exact_step), ""
+        if normalized == "whoami":
+            return "kali\n", ""
+        if normalized == "pwd":
+            return "/root\n", ""
+        if normalized.startswith("ls"):
+            return self._render_ls(command), ""
+        if normalized.startswith("cat "):
+            return self._render_cat(command)
+        if "nmap" in normalized:
+            return self._render_nmap(command), ""
+        if "curl" in normalized:
+            return self._render_curl(command), ""
+        if "mysql" in normalized:
+            return self._render_mysql(command), ""
+        if "smbclient" in normalized:
+            return self._render_smb(command), ""
+        if "ldapsearch" in normalized:
+            return self._render_ldap(command), ""
+        if re.search(r"\bssh\b|\bsshpass\b", normalized):
+            return self._render_ssh(command), ""
+        if "grep" in normalized and "flag" in normalized:
+            return self._render_flag_search(), ""
+        return "Command completed successfully.\n", ""
+    def _simulate_blue_command(self, command: str) -> tuple[str, str]:
+        normalized = command.strip().lower()
+        if not normalized:
+            return "", "Empty command"
+        if any(token in normalized for token in ("grep", "tail", "cat", "awk", "sed")):
+            return self._render_siem_query(command), ""
+        if "check_services" in normalized:
+            return self._render_service_status(), ""
+        if "iptables" in normalized or "ufw" in normalized or "firewall" in normalized:
+            return "Firewall policy updated.\n", ""
+        if normalized.startswith("patch "):
+            return "Patch applied in synthetic environment.\n", ""
+        if "restart" in normalized:
+            return "Service restarted.\n", ""
+        return "Investigation command completed.\n", ""
+    def _match_golden_step(self, command: str):
+        if self._snapshot is None:
+            return None
+        normalized = self._normalize_command(command)
+        best_step = None
+        best_score = 0.0
+        cmd_name = self._command_name(command)
+        for step in self._snapshot.golden_path:
+            step_normalized = self._normalize_command(step.command)
+            if normalized == step_normalized:
+                return step
+            if cmd_name != self._command_name(step.command):
+                continue
+            score = self._token_overlap(normalized, step_normalized)
+            if score > best_score:
+                best_score = score
+                best_step = step
+        if best_score >= 0.66:
+            return best_step
+        return None
+    @staticmethod
+    def _command_name(command: str) -> str:
+        stripped = command.strip()
+        if not stripped:
+            return ""
+        return stripped.split()[0].rsplit("/", 1)[-1].lower()
+    @staticmethod
+    def _normalize_command(command: str) -> str:
+        lowered = command.lower()
+        return " ".join(_TOKEN_RE.findall(lowered))
+    @staticmethod
+    def _token_overlap(left: str, right: str) -> float:
+        """Return the Jaccard similarity of the two whitespace-separated token sets."""
+        left_tokens = set(left.split())
+        right_tokens = set(right.split())
+        if not left_tokens or not right_tokens:
+            return 0.0
+        intersection = left_tokens & right_tokens
+        union = left_tokens | right_tokens
+        return len(intersection) / len(union)
+    def _render_golden_output(self, command: str, step: Any) -> str:
+        expected = step.expect_in_stdout or "Command completed."
+        lowered = command.lower()
+        if "nmap" in lowered:
+            return f"Starting Nmap 7.94\n{expected}\nNmap done.\n"
+        if "curl" in lowered and "search" in lowered and ("union" in lowered or "flag" in lowered):
+            return f"Search results:\n{expected}\n"
+        if "curl" in lowered:
+            return f"{expected}\n"
+        if "mysql" in lowered:
+            return f"{expected}\n"
+        return f"{expected}\n"
+    def _render_nmap(self, command: str) -> str:
+        lines = ["Starting Nmap 7.94"]
+        lowered = command.lower()
+        if "10.0.1" in lowered or "web" in lowered:
+            lines.extend(
+                [
+                    "80/tcp open http nginx 1.24",
+                    "25/tcp open smtp postfix",
+                ]
+            )
+        if "10.0.2" in lowered or "db" in lowered:
+            lines.extend(
+                [
+                    "3306/tcp open mysql MySQL 8.0",
+                    "445/tcp open smb samba 4.17",
+                ]
+            )
+        if "10.0.3" in lowered or "ldap" in lowered or "siem" in lowered:
+            lines.extend(
+                [
+                    "389/tcp open ldap OpenLDAP 2.6",
+                    "514/tcp open syslog rsyslogd",
+                ]
+            )
+        if len(lines) == 1:
+            lines.append("Host seems down.")
+        lines.append("Nmap done.")
+        return "\n".join(lines) + "\n"
+    def _render_curl(self, command: str) -> str:
+        lowered = command.lower()
+        flag = self._flag_value()
+        if "http://web/" in lowered and "login" not in lowered and "search" not in lowered:
+            hints = ["Welcome to the customer portal."]
+            if self._has_vuln_type("sqli"):
+                hints.append("Try /search?q=test")
+            if self._has_vuln_type("idor"):
+                hints.append("API available at /api/users/1/profile")
+            if self._has_vuln_type("path_traversal"):
+                hints.append("Downloads available at /download?file=report.pdf")
+            return "\n".join(hints) + "\n"
+        if "/login" in lowered:
+            return "Login\n<form action=\"/login\">\n"
+        if "search" in lowered:
+            if any(token in lowered for token in ("union", "select", "or+1=1", "or 1=1", "%27", "'")):
+                return f"Search results:\n{flag}\n"
+            return "products\nmonitor\nlaptop\nrouter\n"
+        if "/api/users/" in lowered and self._has_vuln_type("idor"):
+            if "/1/" in lowered:
+                return '{"id":1,"username":"admin","role":"admin"}\n'
+            return '{"id":2,"username":"svc_backup","password":"backup123"}\n'
+        if "download?file=" in lowered and self._has_vuln_type("path_traversal"):
+            if "passwd" in lowered:
+                return "root:x:0:0:root:/root:/bin/bash\nwww-data:x:33:33:www-data:/var/www:/usr/sbin/nologin\n"
+            if "flag" in lowered or "/var/flags" in lowered:
+                return f"{flag}\n"
+            return "Quarterly report.pdf\n"
+        if "http://mail" in lowered:
+            return "220 mail ESMTP Postfix\n"
+        return "HTTP/1.1 200 OK\n"
+    def _render_mysql(self, command: str) -> str:
+        lowered = command.lower()
+        flag = self._flag_value()
+        if "show databases" in lowered:
+            return "information_schema\nreferral_db\nflags\n"
+        if "select" in lowered and "flag" in lowered:
+            return f"{flag}\n"
+        if "show tables" in lowered:
+            return "users\nproducts\nsecrets\n"
+        return "Query OK\n"
+    def _render_smb(self, command: str) -> str:
+        lowered = command.lower()
+        if " -l " in f" {lowered} ":
+            return "Sharename       Type      Comment\ngeneral         Disk      General share\ncontracts       Disk      Contracts\n"
+        if "get " in lowered:
+            creds = "svc_backup:backup123\n"
+            target = self._extract_copy_destination(command) or "/tmp/creds.txt"
+            self._ephemeral_files[target] = creds
+            return f"getting file to {target}\n"
+        return "  .\n  ..\n  admin_creds.txt\n"
+    def _render_ldap(self, command: str) -> str:
+        del command
+        return (
+            "dn: uid=alice,dc=corp,dc=local\nuid: alice\n\n"
+            "dn: uid=bob,dc=corp,dc=local\nuid: bob\n"
+        )
+    def _render_ssh(self, command: str) -> str:
+        lowered = command.lower()
+        flag = self._flag_value()
+        if "cat /var/flags" in lowered or "grep flag" in lowered:
+            return f"{flag}\n"
+        return "uid=1001(svc_webapp) gid=1001(svc_webapp)\n"
+    def _render_flag_search(self) -> str:
+        return f"{self._flag_value()}\n"
+    def _render_cat(self, command: str) -> tuple[str, str]:
+        path = self._extract_first_path(command)
+        if not path:
+            return "", "cat: missing operand"
+        if path in self._ephemeral_files:
+            return self._ephemeral_files[path], ""
+        for flag in self._snapshot.flags if self._snapshot else []:
+            if path == flag.path or path.endswith(Path(flag.path).name):
+                return f"{flag.value}\n", ""
+        if self._snapshot and path in self._snapshot.files:
+            return self._snapshot.files[path], ""
+        if path.endswith("config.php"):
+            return (
+                "<?php\n"
+                "$DB_HOST='db';\n$DB_USER='app_user';\n$DB_PASS='AppUs3r!2024';\n"
+                "?>\n",
+                "",
+            )
+        if path.endswith("/etc/passwd") or path == "/etc/passwd":
+            return "root:x:0:0:root:/root:/bin/bash\nwww-data:x:33:33:www-data:/var/www:/usr/sbin/nologin\n", ""
+        return "", f"cat: {path}: No such file or directory"
+    def _render_ls(self, command: str) -> str:
+        path = self._extract_first_path(command) or "."
+        if path in (".", "/root"):
+            entries = ["notes.txt"]
+            entries.extend(sorted(Path(p).name for p in self._ephemeral_files))
+            return "\n".join(sorted(set(entries))) + "\n"
+        if path == "/var/log/siem":
+            return "consolidated\nalerts.log\nweb_access.log\n"
+        if self._snapshot and self._snapshot.files:
+            matching = [
+                Path(file_path).name
+                for file_path in self._snapshot.files
+                if str(Path(file_path).parent) == path
+            ]
+            if matching:
+                return "\n".join(sorted(set(matching))) + "\n"
+        return ".\n..\n"
+    def _render_siem_query(self, command: str) -> str:
+        lines = self._siem_lines()
+        if not lines:
+            return "No alerts matched.\n"
+        lowered = command.lower()
+        filtered = lines
+        if "union" in lowered or "select" in lowered:
+            filtered = [line for line in lines if "SQLi" in line or "UNION" in line]
+        elif "nmap" in lowered or "scan" in lowered:
+            filtered = [line for line in lines if "scan" in line.lower()]
+        elif "ldap" in lowered:
+            filtered = [line for line in lines if "LDAP" in line]
+        if not filtered:
+            filtered = lines
+        return "\n".join(filtered[-20:]) + "\n"
+    def _render_service_status(self) -> str:
+        hosts = _iter_hosts(self._snapshot) if self._snapshot else []
+        if not hosts:
+            return "No services registered.\n"
+        return "\n".join(f"{host}: healthy" for host in hosts if host != "attacker") + "\n"
+    def _siem_lines(self) -> list[str]:
+        lines: list[str] = []
+        for record in self._red_history:
+            command = str(record.get("command", ""))
+            lowered = command.lower()
+            if "nmap" in lowered:
+                lines.append(
+                    f"[ALERT] Port scan detected from attacker host at step {record['step']}: {command}"
+                )
+            if "union" in lowered or ("select" in lowered and "search" in lowered):
+                lines.append(
+                    f"[ALERT] SQLi pattern detected in web access log at step {record['step']}: {command}"
+                )
+            if "curl http://web/" in lowered:
+                lines.append(
+                    f"[INFO] Web portal accessed from attacker at step {record['step']}"
+                )
+            if "ldapsearch" in lowered:
+                lines.append(
+                    f"[ALERT] LDAP enumeration detected at step {record['step']}: {command}"
+                )
+        if not lines:
+            lines.append("[INFO] No suspicious activity observed.")
+        return lines
+    def _has_vuln_type(self, vuln_type: str) -> bool:
+        if self._snapshot is None:
+            return False
+        return any(v.type == vuln_type for v in self._snapshot.truth_graph.vulns)
+    def _flag_value(self) -> str:
+        if self._snapshot and self._snapshot.flags:
+            return self._snapshot.flags[0].value
+        return "FLAG{synthetic_missing_flag}"
+    @staticmethod
+    def _extract_copy_destination(command: str) -> str | None:
+        try:
+            parts = shlex.split(command)
+        except ValueError:
+            return None
+        if len(parts) >= 2:
+            candidate = parts[-1]
+            if candidate.startswith("/"):
+                return candidate
+        return None
+    @staticmethod
+    def _extract_first_path(command: str) -> str | None:
+        try:
+            parts = shlex.split(command)
+        except ValueError:
+            return None
+        for token in parts[1:]:
+            if token.startswith("/"):
+                return token
+            if "/" in token and not token.startswith("http"):
+                return token
+        return None
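
Review note: `_match_golden_step` combines an exact normalized comparison with a Jaccard-style token overlap gated at 0.66. The block below is a minimal standalone sketch of that idea, not the module's code. `TOKEN_RE` is an assumed stand-in for `_TOKEN_RE` (its definition is outside the lines shown), and `match_step` is a hypothetical helper that works on plain command strings rather than golden-path step objects.

```python
import re
from typing import List, Optional

# Assumption: stand-in for the module's _TOKEN_RE, which is not shown in this diff.
TOKEN_RE = re.compile(r"[a-z0-9_./:-]+")


def normalize(command: str) -> str:
    """Lowercase the command and keep only token-like runs, joined by single spaces."""
    return " ".join(TOKEN_RE.findall(command.lower()))


def command_name(command: str) -> str:
    """Return the executable's basename, lowercased ('' for an empty command)."""
    stripped = command.strip()
    if not stripped:
        return ""
    return stripped.split()[0].rsplit("/", 1)[-1].lower()


def token_overlap(left: str, right: str) -> float:
    """Jaccard similarity between the whitespace-separated token sets."""
    left_tokens, right_tokens = set(left.split()), set(right.split())
    if not left_tokens or not right_tokens:
        return 0.0
    return len(left_tokens & right_tokens) / len(left_tokens | right_tokens)


def match_step(command: str, golden: List[str], threshold: float = 0.66) -> Optional[str]:
    """Exact normalized match wins; otherwise the best same-binary candidate above threshold."""
    normalized = normalize(command)
    best, best_score = None, 0.0
    for candidate in golden:
        candidate_normalized = normalize(candidate)
        if normalized == candidate_normalized:
            return candidate
        # Only compare commands that invoke the same binary.
        if command_name(command) != command_name(candidate):
            continue
        score = token_overlap(normalized, candidate_normalized)
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None
```

With this stand-in pattern, `nmap -sV -Pn 10.0.1.0/24` shares three of four distinct tokens with `nmap -sV 10.0.1.0/24` (overlap 0.75) and matches, while adding `-p 80` instead drops the overlap to 0.6 and the candidate is rejected.
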
+class SyntheticTraceGenerator:
+    """Generate OpenRange training traces from a simulated snapshot source."""
+    def __init__(
+        self,
+        *,
+        snapshot: SnapshotSpec | None = None,
+        manifest: dict[str, Any] | None = None,
+        builder: SnapshotBuilder | None = None,
+        red_agent: RangeAgent | None = None,
+        blue_agent: RangeAgent | None = None,
+        tier: int = 1,
+        max_steps: int = 30,
+        randomize_flags: bool = True,
+    ) -> None:
+        if snapshot is None and manifest is None:
+            raise ValueError("SyntheticTraceGenerator requires a snapshot or manifest")
+        self._snapshot = snapshot.model_copy(deep=True) if snapshot is not None else None
+        self._manifest = manifest
+        self._builder = builder
+        self._tier = tier
+        self._max_steps = max_steps
+        self._randomize_flags = randomize_flags
+        self.red_agent = red_agent or ScriptedRedAgent()
+        self.blue_agent = blue_agent or ScriptedBlueAgent()
+    @classmethod
+    def from_manifest(
+        cls,
+        manifest: dict[str, Any],
+        *,
+        red_agent: RangeAgent | None = None,
+        blue_agent: RangeAgent | None = None,
+        builder: SnapshotBuilder | None = None,
+        template_only: bool = True,
+        builder_model: str | None = None,
+        tier: int = 1,
+        max_steps: int = 30,
+        randomize_flags: bool = True,
+    ) -> "SyntheticTraceGenerator":
+        resolved_builder = builder
+        if resolved_builder is None:
+            if template_only:
+                resolved_builder = TemplateOnlyBuilder()
+            else:
+                resolved_builder = LLMSnapshotBuilder(
+                    model=builder_model or "azure/gpt-5.2-codex"
+                )
+        return cls(
+            manifest=manifest,
+            builder=resolved_builder,
+            red_agent=red_agent,
+            blue_agent=blue_agent,
+            tier=tier,
+            max_steps=max_steps,
+            randomize_flags=randomize_flags,
+        )
+    def generate(
+        self,
+        *,
+        num_traces: int = 10,
+        seed: int | None = None,
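+    ) -> TrajectoryLogger:
+        """Run ``num_traces`` episodes and return the logger that recorded them.
+
+        When ``seed`` is given, episode ``i`` is seeded with ``seed + i``.
+        """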
+        logger = TrajectoryLogger()
+        for index in range(num_traces):
+            episode_seed = None if seed is None else seed + index
+            snapshot = self._materialize_snapshot(episode_seed)
+            self._run_episode(
+                snapshot=snapshot,
+                logger=logger,
+                episode_index=index,
+                seed=episode_seed,
+            )
+        return logger
+    def export_jsonl(
+        self,
+        path: str | Path,
+        *,
+        num_traces: int = 10,
+        seed: int | None = None,
+        reward_threshold: float = 0.0,
+        roles: tuple[str, ...] = ("red", "blue"),
+    ) -> tuple[TrajectoryLogger, int]:
+        logger = self.generate(num_traces=num_traces, seed=seed)
+        count = logger.export_jsonl(path, reward_threshold=reward_threshold, roles=roles)
+        return logger, count
+    def _materialize_snapshot(self, seed: int | None) -> SnapshotSpec:
+        if self._snapshot is not None:
+            return self._snapshot.model_copy(deep=True)
+        if self._manifest is None or self._builder is None:
+            raise RuntimeError("Synthetic trace generator is missing its manifest builder")
+        context = BuildContext(seed=seed, tier=self._tier)
+        snapshot = _run_async(self._builder.build(self._manifest, context))
+        return snapshot
+    def _run_episode(
+        self,
+        *,
+        snapshot: SnapshotSpec,
+        logger: TrajectoryLogger,
+        episode_index: int,
+        seed: int | None,
+    ) -> None:
+        """Run one episode, alternating red and blue turns, and log every turn."""
+        env = SyntheticRangeEnvironment(
+            randomize_flags=self._randomize_flags,
+            max_steps=self._max_steps,
+        )
+        try:
+            env.reset(
+                snapshot=snapshot,
+                episode_id=f"synth-{episode_index:04d}",
+                seed=seed,
+            )
+            active_snapshot = env.snapshot
+            if active_snapshot is None:
+                raise RuntimeError("Synthetic environment failed to load a snapshot")
+            task = active_snapshot.task
+            red_briefing = getattr(task, "red_briefing", "") or "Begin the assessment."
+            blue_briefing = getattr(task, "blue_briefing", "") or "Monitor the range."
+            self.red_agent.reset(briefing=red_briefing, role="red")
+            self.blue_agent.reset(briefing=blue_briefing, role="blue")
+            snapshot_id = active_snapshot.topology.get("snapshot_id", f"synth-{episode_index:04d}")
+            logger.start_episode(
+                episode_id=f"synth-{episode_index:04d}",
+                snapshot_id=snapshot_id,
+                tier=env.state.tier,
+            )
+            current_red_observation: str | RangeObservation = red_briefing
+            current_blue_observation: str | RangeObservation = blue_briefing
+            step = 0
+            done = False
+            last_obs: RangeObservation = RangeObservation(stdout=red_briefing)
+            while step < self._max_steps and not done:
+                red_cmd = self.red_agent.act(current_red_observation)
+                red_view = _observation_text(current_red_observation)
+                red_obs = env.step(RangeAction(command=red_cmd, mode="red"))
+                logger.log_turn(
+                    role="red",
+                    observation=red_view,
+                    action=red_cmd,
+                    reward=float(red_obs.reward or 0.0),
+                )
+                step += 1
+                last_obs = red_obs
+                done = bool(red_obs.done)
+                current_blue_observation = red_obs
+                if done or step >= self._max_steps:
+                    break
+                blue_cmd = self.blue_agent.act(current_blue_observation)
+                blue_view = _observation_text(current_blue_observation)
+                blue_obs = env.step(RangeAction(command=blue_cmd, mode="blue"))
+                logger.log_turn(
+                    role="blue",
+                    observation=blue_view,
+                    action=blue_cmd,
+                    reward=float(blue_obs.reward or 0.0),
+                )
+                step += 1
+                last_obs = blue_obs
+                done = bool(blue_obs.done)
+                current_red_observation = blue_obs
+            state = env.state
+            outcome = self._episode_outcome(env)
+            logger.end_episode(
+                outcome=outcome,
+                metrics={
+                    "steps": state.step_count,
+                    "flags_found": len(state.flags_found),
+                    "red_actions": len(env.red_history),
+                    "blue_actions": len(env.blue_history),
+                    "done": bool(last_obs.done),
+                },
+            )
+        finally:
+            env.close()
+    @staticmethod
+    def _episode_outcome(env: SyntheticRangeEnvironment) -> str:
+        if env.state.flags_found:
+            return "flag_captured"
+        if any(
+            record.get("type") == "finding" or record.get("cmd_name") == "submit_finding"
+            for record in env.blue_history
+        ):
+            return "blue_defended"
+        return "timeout"
+def build_teacher_agents(
+    *,
+    teacher_model: str | None = None,
+    roles: tuple[str, ...] = ("red",),
+    red_model: str | None = None,
+    blue_model: str | None = None,
+    temperature: float | None = 0.2,
+    max_tokens: int = 512,
+    **litellm_kwargs: Any,
+) -> tuple[RangeAgent, RangeAgent]:
+    """Construct teacher agents for the selected roles, scripted fallbacks otherwise."""
+    if "red" in roles and (red_model or teacher_model):
+        red_agent: RangeAgent = LLMRangeAgent(
+            model=red_model or str(teacher_model),
+            temperature=temperature,
+            max_tokens=max_tokens,
+            **litellm_kwargs,
+        )
+    else:
+        red_agent = ScriptedRedAgent()
+    if "blue" in roles and (blue_model or teacher_model):
+        blue_agent: RangeAgent = LLMRangeAgent(
+            model=blue_model or str(teacher_model),
+            temperature=temperature,
+            max_tokens=max_tokens,
+            **litellm_kwargs,
+        )
+    else:
+        blue_agent = ScriptedBlueAgent()
+    return red_agent, blue_agent
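
Review note: the turn loop in `_run_episode` shares a single step budget between both roles, lets red move first, and hands each role the observation produced by the other role's last step. A minimal sketch of that control flow follows. The `Agent` and `StepFn` aliases and `run_alternating` are hypothetical stand-ins: agents are plain callables, and `step` returns an `(observation, reward, done)` tuple instead of a `RangeObservation`.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for RangeAgent.act and SyntheticRangeEnvironment.step.
Agent = Callable[[str], str]
StepFn = Callable[[str, str], Tuple[str, float, bool]]
Turn = Tuple[str, str, str, float]  # (role, observation seen, command, reward)


def run_alternating(
    red: Agent, blue: Agent, step: StepFn, briefing: str, max_steps: int
) -> List[Turn]:
    """Alternate red and blue turns under one shared step budget."""
    turns: List[Turn] = []
    red_view = blue_view = briefing
    steps, done = 0, False
    while steps < max_steps and not done:
        red_cmd = red(red_view)
        obs, reward, done = step("red", red_cmd)
        turns.append(("red", red_view, red_cmd, reward))
        steps += 1
        blue_view = obs  # blue reacts to the observation from red's step
        if done or steps >= max_steps:
            break
        blue_cmd = blue(blue_view)
        obs, reward, done = step("blue", blue_cmd)
        turns.append(("blue", blue_view, blue_cmd, reward))
        steps += 1
        red_view = obs  # red's next turn sees the observation from blue's step
    return turns
```

Because the budget counts both roles, `max_steps=3` yields at most two red turns and one blue turn, which is the shape exercised by `test_export_jsonl_records_selected_roles`.
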

src/open_range/training/trajectory.py CHANGED

@@ -130,7 +130,18 @@ class Episode:
     def to_jsonl_record(self, role: str) -> dict[str, Any]:
         """Build a single JSONL record for the given role."""
         reward = self.total_red_reward if role == "red" else self.total_blue_reward
-        return {
+        metadata = {
+            "source": self.metrics.get("source", "open_range.synthetic"),
+            "success": self.outcome == ("flag_captured" if role == "red" else "blue_defended"),
+            "snapshot_id": self.snapshot_id,
+            "tier": self.tier,
+            "role": role,
+        }
+        extra_metadata = self.metrics.get("metadata")
+        if isinstance(extra_metadata, dict):
+            metadata.update(extra_metadata)
+        record = {
             "episode_id": self.episode_id,
             "snapshot_id": self.snapshot_id,
             "tier": self.tier,
@@ -138,7 +149,15 @@ class Episode:
             "messages": self.to_chat_messages(role),
             "reward": round(reward, 4),
             "outcome": self.outcome,
+            "metadata": metadata,
         }
+        ground_truth_flags = self.metrics.get("ground_truth_flags")
+        if isinstance(ground_truth_flags, list):
+            record["ground_truth_flag"] = ground_truth_flags[0] if ground_truth_flags else None
+        else:
+            record["ground_truth_flag"] = None
+        record["optimal_steps"] = self.metrics.get("optimal_steps")
+        return record
 # ---------------------------------------------------------------------------
@@ -285,28 +304,35 @@ class TrajectoryLogger:
         Returns:
             Number of JSONL lines written.
         """
+        records = self.to_records(reward_threshold=reward_threshold, roles=roles)
         path = Path(path)
         path.parent.mkdir(parents=True, exist_ok=True)
-        count = 0
         with open(path, "w") as f:
-            for episode in self._episodes:
-                for role in roles:
-                    # Check if this role had any turns
-                    role_turns = [t for t in episode.turns if t.role == role]
-                    if not role_turns:
-                        continue
-                    # Filter by reward threshold
-                    total_reward = sum(t.reward for t in role_turns)
-                    if total_reward < reward_threshold:
-                        continue
-                    record = episode.to_jsonl_record(role)
-                    f.write(json.dumps(record) + "\n")
-                    count += 1
-        return count
+            for record in records:
+                f.write(json.dumps(record) + "\n")
+        return len(records)
+    def to_records(
+        self,
+        reward_threshold: float = 0.0,
+        roles: tuple[str, ...] = ("red", "blue"),
+    ) -> list[dict[str, Any]]:
+        """Return JSONL-ready records without writing them to disk."""
+        records: list[dict[str, Any]] = []
+        for episode in self._episodes:
+            for role in roles:
+                role_turns = [t for t in episode.turns if t.role == role]
+                if not role_turns:
+                    continue
+                total_reward = sum(t.reward for t in role_turns)
+                if total_reward < reward_threshold:
+                    continue
+                records.append(episode.to_jsonl_record(role))
+        return records
     def clear(self) -> None:
         """Remove all recorded episodes."""

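Review note: the refactor separates record selection (`to_records`) from file writing (`export_jsonl`), so callers can filter in memory before touching disk. The sketch below illustrates that split under simplifying assumptions. `Turn` and `Episode` here are hypothetical stand-ins that carry only the fields the filter reads, and the record dict is reduced to three keys rather than the full `to_jsonl_record` output.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Turn:  # hypothetical stand-in for the logger's turn type
    role: str
    reward: float


@dataclass
class Episode:  # hypothetical stand-in; only the fields the filter needs
    episode_id: str
    turns: list = field(default_factory=list)


def to_records(episodes, reward_threshold=0.0, roles=("red", "blue")):
    """One record per (episode, role) that has turns and meets the reward threshold."""
    records = []
    for episode in episodes:
        for role in roles:
            role_turns = [t for t in episode.turns if t.role == role]
            if not role_turns:
                continue  # this role never acted in the episode
            total = sum(t.reward for t in role_turns)
            if total < reward_threshold:
                continue  # below the requested reward floor
            records.append(
                {"episode_id": episode.episode_id, "role": role, "reward": round(total, 4)}
            )
    return records


def export_jsonl(path, records):
    """Write one JSON object per line and return the number of lines written."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return len(records)
```

Note that the default threshold of `0.0` drops any role whose summed reward is negative, which is presumably why the CLI test passes `--reward-threshold -1`.
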
tests/test_synthetic.py ADDED

	@@ -0,0 +1,232 @@

+"""Tests for synthetic trajectory generation."""
+from __future__ import annotations
+import json
+import os
+import sys
+import types
+import pytest
+from click.testing import CliRunner
+from open_range.agents.llm_agent import LLMRangeAgent
+from open_range.agents.scripted_agent import ScriptedAgent, ScriptedBlueAgent, ScriptedRedAgent
+from open_range.cli import cli
+from open_range.server.models import RangeAction
+from open_range.training.synthetic import (
+    SyntheticRangeEnvironment,
+    SyntheticTraceGenerator,
+    build_teacher_agents,
+    randomize_snapshot_flags,
+)
+class TestFlagRandomization:
+    def test_randomize_snapshot_flags_rewrites_all_string_references(self, sample_snapshot_spec):
+        snapshot = sample_snapshot_spec.model_copy(deep=True)
+        original_flag = snapshot.flags[0].value
+        snapshot.files["web:/tmp/flag.txt"] = f"echo {original_flag}"
+        randomized = randomize_snapshot_flags(snapshot, seed=7)
+        replaced_flag = randomized.flags[0].value
+        assert replaced_flag != original_flag
+        dumped = json.dumps(randomized.model_dump(mode="python"))
+        assert original_flag not in dumped
+        assert replaced_flag in dumped
+class TestSyntheticEnvironment:
+    def test_synthetic_environment_simulates_red_and_blue_flow(self, sample_snapshot_spec):
+        env = SyntheticRangeEnvironment(randomize_flags=False, max_steps=5)
+        try:
+            reset_obs = env.reset(snapshot=sample_snapshot_spec, episode_id="synthetic-test")
+            assert "RED BRIEFING" in reset_obs.stdout
+            red_obs = env.step(
+                RangeAction(
+                    command="curl 'http://web/search?q=test%27+UNION+SELECT+flag+FROM+flags--'",
+                    mode="red",
+                )
+            )
+            assert "FLAG{test_sqli_123}" in red_obs.stdout
+            blue_obs = env.step(
+                RangeAction(
+                    command="grep UNION /var/log/siem/consolidated/all.log",
+                    mode="blue",
+                )
+            )
+            assert "SQLi pattern detected" in blue_obs.stdout
+            flag_obs = env.step(
+                RangeAction(command="submit_flag FLAG{test_sqli_123}", mode="red")
+            )
+            assert flag_obs.done is True
+            assert sample_snapshot_spec.flags[0].value in env.state.flags_found
+        finally:
+            env.close()
+class TestSyntheticTraceGenerator:
+    def test_export_jsonl_records_selected_roles(self, tmp_path, sample_snapshot_spec):
+        red = ScriptedAgent(
+            commands=[
+                "nmap -sV 10.0.1.0/24",
+                "submit_flag FLAG{test_sqli_123}",
+            ]
+        )
+        blue = ScriptedAgent(commands=["grep scan /var/log/siem/consolidated/all.log"])
+        generator = SyntheticTraceGenerator(
+            snapshot=sample_snapshot_spec,
+            red_agent=red,
+            blue_agent=blue,
+            max_steps=3,
+            randomize_flags=False,
+        )
+        output_path = tmp_path / "synthetic.jsonl"
+        logger, count = generator.export_jsonl(
+            output_path,
+            num_traces=1,
+            roles=("red", "blue"),
+        )
+        assert count == 2
+        assert len(logger.episodes) == 1
+        assert logger.episodes[0].outcome == "flag_captured"
+        records = [json.loads(line) for line in output_path.read_text().splitlines()]
+        assert {record["role"] for record in records} == {"red", "blue"}
+        assert all(record["messages"][0]["role"] == "system" for record in records)
+    def test_build_teacher_agents_falls_back_to_scripted_when_no_model(self):
+        red, blue = build_teacher_agents(teacher_model=None, roles=("red", "blue"))
+        assert isinstance(red, ScriptedRedAgent)
+        assert isinstance(blue, ScriptedBlueAgent)
+class TestLiteLLMSupport:
+    def test_codex_models_omit_temperature_and_enable_drop_params(self, monkeypatch):
+        captured: dict[str, object] = {}
+        def fake_completion(**kwargs):
+            captured.update(kwargs)
+            return types.SimpleNamespace(
+                choices=[
+                    types.SimpleNamespace(
+                        message=types.SimpleNamespace(content="```bash\nwhoami\n```")
+                    )
+                ]
+            )
+        monkeypatch.setitem(sys.modules, "litellm", types.SimpleNamespace(completion=fake_completion))
+        agent = LLMRangeAgent(model="azure/gpt-5.2-codex", temperature=0.6, max_tokens=64)
+        agent.reset("Return exactly one command: whoami", "red")
+        assert agent.act("Return exactly one command: whoami") == "whoami"
+        assert captured["drop_params"] is True
+        assert "temperature" not in captured
+    def test_non_codex_models_keep_temperature(self, monkeypatch):
+        captured: dict[str, object] = {}
+        def fake_completion(**kwargs):
+            captured.update(kwargs)
+            return types.SimpleNamespace(
+                choices=[
+                    types.SimpleNamespace(
+                        message=types.SimpleNamespace(content="echo ok")
+                    )
+                ]
+            )
+        monkeypatch.setitem(sys.modules, "litellm", types.SimpleNamespace(completion=fake_completion))
+        agent = LLMRangeAgent(model="openai/gpt-4o", temperature=0.4, max_tokens=32)
+        agent.reset("Return exactly one command: echo ok", "blue")
+        assert agent.act("Return exactly one command: echo ok") == "echo ok"
+        assert captured["temperature"] == 0.4
+        assert captured["drop_params"] is True
+class TestSyntheticCLI:
+    def test_cli_generates_jsonl_from_snapshot(self, tmp_path, sample_snapshot_spec):
+        runner = CliRunner()
+        snapshot_path = tmp_path / "spec.json"
+        snapshot_path.write_text(json.dumps(sample_snapshot_spec.model_dump(mode="python")))
+        output_path = tmp_path / "synthetic.jsonl"
+        result = runner.invoke(
+            cli,
+            [
+                "synthetic-data",
+                "--snapshot",
+                str(snapshot_path),
+                "--output",
+                str(output_path),
+                "--num-traces",
+                "1",
+                "--max-steps",
+                "3",
+                "--roles",
+                "red",
+                "--reward-threshold",
+                "-1",
+                "--static-flags",
+            ],
+        )
+        assert result.exit_code == 0, result.output
+        assert output_path.exists()
+        records = [json.loads(line) for line in output_path.read_text().splitlines()]
+        assert len(records) == 1
+        assert records[0]["role"] == "red"
+@pytest.mark.live_model
+def test_live_model_smoke_generates_a_synthetic_trace(tmp_path, sample_snapshot_spec):
+    if not all(
+        os.environ.get(name)
+        for name in ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION")
+    ):
+        pytest.skip("Azure LiteLLM environment variables not configured")
+    pytest.importorskip("litellm")
+    snapshot = sample_snapshot_spec.model_copy(deep=True)
+    snapshot.task.red_briefing = (
+        "Synthetic smoke test. Return exactly one shell command. First command: whoami"
+    )
+    snapshot.task.blue_briefing = "Monitor the range."
+    red, blue = build_teacher_agents(
+        teacher_model=os.environ.get("OPENRANGE_SYNTH_MODEL", "azure/gpt-5.2-codex"),
+        roles=("red",),
+        temperature=None,
+        max_tokens=96,
+    )
+    generator = SyntheticTraceGenerator(
+        snapshot=snapshot,
+        red_agent=red,
+        blue_agent=blue,
+        max_steps=1,
+        randomize_flags=False,
+    )
+    output_path = tmp_path / "live_synthetic.jsonl"
+    logger, count = generator.export_jsonl(
+        output_path,
+        num_traces=1,
+        roles=("red",),
+    )
+    assert count == 1
+    assert len(logger.episodes) == 1
+    assert logger.episodes[0].red_turns
+    assert logger.episodes[0].red_turns[0].action.strip()
+    assert output_path.exists()
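
Reviewer note: both tests above check the exported file by parsing one JSON object per line and inspecting the `role` field. A minimal sketch of that read-back step follows. Only the `role` key comes from the tests; the `text` field, the helper names `load_jsonl` and `filter_by_role`, and the sample records are hypothetical and not part of `open_range`.

```python
import json
import tempfile
from pathlib import Path


def load_jsonl(path: Path) -> list[dict]:
    """Parse one JSON object per non-empty line of a JSONL file."""
    return [
        json.loads(line)
        for line in path.read_text().splitlines()
        if line.strip()
    ]


def filter_by_role(records: list[dict], role: str) -> list[dict]:
    """Keep only the records whose "role" field matches (hypothetical helper)."""
    return [record for record in records if record.get("role") == role]


# Hypothetical sample records; only the "role" key is taken from the tests.
sample = [
    {"role": "red", "text": "whoami"},
    {"role": "blue", "text": "tail /var/log/auth.log"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "synthetic.jsonl"
    # Write one JSON object per line, as a JSONL exporter would.
    path.write_text("\n".join(json.dumps(r) for r in sample) + "\n")
    records = load_jsonl(path)

red_records = filter_by_role(records, "red")
```

Skipping empty lines makes the reader tolerant of a trailing newline, which the list comprehension in the CLI test only handles because `splitlines()` does not produce a trailing empty entry.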

uv.lock CHANGED Viewed

@@ -1950,6 +1950,9 @@ dev = [
     { name = "pytest" },
     { name = "pytest-asyncio" },
 ]
+synthetic = [
+    { name = "litellm" },
+]
 training = [
     { name = "trl" },
     { name = "unsloth" },
@@ -1963,6 +1966,7 @@ requires-dist = [
     { name = "httpx", marker = "extra == 'dev'", specifier = ">=0.27" },
     { name = "jinja2", specifier = ">=3.1" },
     { name = "litellm", marker = "extra == 'builder'", specifier = ">=1.30" },
+    { name = "litellm", marker = "extra == 'synthetic'", specifier = ">=1.30" },
     { name = "openenv-core", extras = ["core"], specifier = ">=0.2.1" },
     { name = "pydantic", specifier = ">=2.0.0" },
     { name = "pytest", marker = "extra == 'dev'", specifier = ">=8.0" },
@@ -1972,7 +1976,7 @@ requires-dist = [
     { name = "unsloth", marker = "extra == 'training'" },
     { name = "uvicorn", specifier = ">=0.24.0" },
 ]
-provides-extras = ["dev", "training", "builder"]
+provides-extras = ["dev", "training", "builder", "synthetic"]
 [[package]]
 name = "opentelemetry-api"