File size: 18,008 Bytes
cebc7ff
 
 
 
a9ac5cf
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
cebc7ff
a9ac5cf
 
 
8c486a8
a9ac5cf
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
cebc7ff
 
a9ac5cf
 
 
 
 
 
 
 
 
 
 
 
 
cebc7ff
 
a9ac5cf
 
 
 
 
 
 
 
 
 
 
 
 
 
8c486a8
 
 
 
 
 
 
 
 
 
 
 
a9ac5cf
 
 
 
 
 
8c486a8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a9ac5cf
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
cebc7ff
 
819cfef
cebc7ff
819cfef
 
 
 
 
 
 
cebc7ff
a9ac5cf
cebc7ff
a9ac5cf
cebc7ff
a9ac5cf
 
 
 
 
 
 
 
 
8c10e98
cebc7ff
 
 
8c486a8
cebc7ff
 
8c486a8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
cebc7ff
 
8c486a8
 
cebc7ff
a9ac5cf
819cfef
 
 
 
 
 
 
 
 
 
 
 
 
 
8c10e98
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
8c486a8
 
 
 
 
 
 
 
 
 
fe0a566
a9ac5cf
8c486a8
fe0a566
 
 
8c486a8
a9ac5cf
 
8c486a8
fe0a566
 
 
 
 
 
 
 
 
 
 
8c486a8
 
 
 
 
 
a9ac5cf
 
fe0a566
 
 
 
 
 
8c486a8
fe0a566
 
 
 
8c486a8
fe0a566
 
 
8c486a8
 
 
 
 
 
 
 
 
 
 
fe0a566
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
8c486a8
 
 
fe0a566
 
 
 
 
 
 
 
 
 
a9ac5cf
8c486a8
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
# Architecture

## System Overview

OpenRange uses a **snapshot-based architecture**. A manifest defines a legal family of company worlds. A builder/mutator (LLM-driven via LiteLLM) proposes candidate snapshots inside that family. A validator gate (purely mechanical) admits only snapshots that boot, remain coherent, and are actually solvable. `reset()` selects a frozen validated snapshot for the next episode. Mutation happens asynchronously between episodes.

```mermaid
flowchart LR
    A[Base company family<br/>AcmeCorp repo, infra, docs, tools] --> B
    M[Manifest / mutation policy<br/>allowed services, bug families, task families] --> B
    S[Curriculum / failure stats<br/>what Red or Blue is weak at] --> B

    B[Builder / mutator via LiteLLM<br/>generate next snapshot<br/>graph mutations, bug seeding, task derivation] --> C

    subgraph C [Candidate snapshot artifacts]
        C1[Topology graph<br/>hosts, services, users, trust edges]
        C2[Truth graph<br/>bug, exploit chain, blast radius, remediation]
        C3[Evidence spec<br/>logs, alerts, files, tickets, docs]
        C4[Task set<br/>exploit, investigate, patch, report]
    end

    C --> D
    D{Validator gate<br/>build/run, exploitability,<br/>patchability, evidence, reward} -->|fail| B
    D -->|pass| E[Frozen validated snapshot<br/>Acme v_k]

    subgraph R [OpenEnv runtime]
        F[reset - select frozen snapshot + sample task]
        G[Red / Blue agents]
        H[step - run command on frozen snapshot]
        I[Observation + reward]
        J[Rollout results]
    end

    E --> F
    F --> G
    G --> H
    H --> I
    I --> G
    H --> J

    J -. async evolve next snapshot .-> S

    style A fill:#4a9eff,color:#fff
    style M fill:#4a9eff,color:#fff
    style B fill:#ff6b6b,color:#fff
    style D fill:#ffd93d,color:#333
    style E fill:#6bcb77,color:#fff
    style R fill:#7c73e611,stroke:#7c73e6
```

## Key Principle

**LLM generates, rules validate.** The builder/mutator uses LiteLLM (any model -- Claude, GPT-4o, open models) to generate snapshots creatively. The validator gate runs a 10-check admission pipeline: 8 mechanical checks (deterministic, no LLM) plus 2 LLM advisory checks (configurable, removable). Advisory failures can trigger retry but never override a mechanical pass. Rewards are grounded in container state, never LLM-evaluated.

## Infrastructure

**Everything runs in Docker Compose.** The OpenEnv server is a container in the same compose stack as the range. It communicates with range containers via the Docker SDK (mounted `/var/run/docker.sock`).

### Tier 1 Containers (8 total)

| Container | Zone | Services | Role |
|-----------|------|----------|------|
| `attacker` | external | kali tools, nmap, sqlmap, hydra | Red agent's execution environment |
| `firewall` | perimeter | iptables, NAT, port forwarding, IDS rules | Network segmentation between zones |
| `web` | DMZ | nginx, PHP/Python app, sshd | Public-facing web application |
| `mail` | DMZ | postfix SMTP, dovecot IMAP | Email server with user mailboxes |
| `db` | internal | MySQL/PostgreSQL, app schemas, flag data | Database backend for web + mail |
| `files` | internal | samba, SMB shares, sensitive documents | File server with access controls |
| `ldap` | management | OpenLDAP, Kerberos, user directory | Authentication and authorization for all services |
| `siem` | management | rsyslog, log aggregation, alert rules | Blue agent's entry point, receives all logs |

### Network Zones

```mermaid
flowchart LR
    subgraph ext [External]
        ATK[attacker]
    end
    subgraph perim [Perimeter]
        FW[firewall]
    end
    subgraph dmz [DMZ - 10.0.1.0/24]
        WEB[web]
        MAIL[mail]
    end
    subgraph int [Internal - 10.0.2.0/24]
        DB[db]
        FILES[files]
    end
    subgraph mgmt [Mgmt - 10.0.3.0/24]
        LDAP[ldap]
        SIEM[siem]
    end

    ATK -->|80,443,25 only| FW
    FW --> dmz
    dmz -->|3306,445| int
    dmz -->|389,636| mgmt
    int -->|389| mgmt
    int -.->|blocked| ext

    style ext fill:#ff6b6b22,stroke:#ff6b6b
    style perim fill:#ffd93d22,stroke:#ffd93d
    style dmz fill:#4a9eff22,stroke:#4a9eff
    style int fill:#6bcb7722,stroke:#6bcb77
    style mgmt fill:#7c73e622,stroke:#7c73e6
```

### Service Interconnections

Every service is real and talks to other services:

- **web** authenticates users against **ldap**, queries **db** for app data, logs to **siem**
- **mail** does user lookup against **ldap**, stores mailboxes locally, logs to **siem**
- **files** authorizes SMB access via **ldap**, logs to **siem**
- **db** accepts connections from **web** and **files**, logs queries to **siem**
- **ldap** provides auth for all services, replicates to **siem** for audit
- **siem** aggregates logs from all hosts -- Blue agent reads these
- **firewall** enforces zone boundaries, logs blocked/allowed traffic to **siem**
- **attacker** has no access to anything except through the **firewall**

## Data Flow

### Snapshot Creation (asynchronous, between episodes)

```mermaid
sequenceDiagram
    participant M as Manifest
    participant B as Builder LLM via LiteLLM
    participant V as Validator Gate
    participant SS as Snapshot Store

    M->>B: Company family + mutation policy + curriculum stats
    B->>B: Generate candidate snapshot
    Note over B: Topology graph, truth graph,<br/>evidence spec, task set,<br/>Dockerfiles, configs, app code

    B->>V: Candidate snapshot artifacts
    V->>V: 1. BuildBootCheck: compose up + healthchecks
    V->>V: 2. ExploitabilityCheck: golden path end-to-end
    V->>V: 3. PatchabilityCheck: inverse mutation test
    V->>V: 4. EvidenceSufficiencyCheck: logs + alerts exist
    V->>V: 5. RewardGroundingCheck: rubrics produce valid scores
    V->>V: 6. IsolationLeakageCheck: zones enforced, no flag leaks
    V->>V: 7. TaskFeasibilityCheck: hosts/evidence/vulns reachable
    V->>V: 8. DifficultyCheck: golden path steps Β±20% of tier target
    V->>V: 9. NPCConsistencyCheck: personas pass phish calibration (LLM, advisory)
    V->>V: 10. RealismReviewCheck: scenario plausibility (LLM, advisory)

    alt All mechanical checks pass
        V->>SS: Publish Acme v_k
    else Any check fails
        V-->>B: Reject with failure context, retry
    end
```

### Rendering Pipeline

Between validation and the episode loop, the `SnapshotRenderer` (`builder/renderer.py`) converts a validated `SnapshotSpec` into concrete Docker artifacts via Jinja2 templates:

```
SnapshotSpec (Pydantic model)
    |
    v
SnapshotRenderer.render(spec, output_dir)
    |
    v
Jinja2 templates (builder/templates/*.j2)
    |
    v
Docker artifacts:
  - docker-compose.yml
  - Dockerfile.web, Dockerfile.db
  - nginx.conf
  - init.sql
  - iptables.rules
```

The renderer flattens SnapshotSpec fields (topology, zones, hosts, flags, vuln types) into a template context. Templates use conditional blocks (e.g., `search_endpoint`, `download_endpoint`) driven by the snapshot's vulnerability types and injection points.

### Episode Loop (synchronous, standard OpenEnv)

```mermaid
sequenceDiagram
    participant T as Training Loop
    participant E as OpenEnv Server
    participant SS as Snapshot Store
    participant R as Range Containers

    T->>E: reset()
    E->>SS: Select validated snapshot + sample task
    SS-->>E: Frozen snapshot Acme v_k
    E->>R: Boot or restore snapshot containers
    E-->>T: RangeObservation with task briefing

    loop Red and Blue alternate
        T->>E: step Red action
        E->>R: docker exec attacker ...
        R-->>E: stdout + stderr
        E-->>T: RangeObservation + reward

        T->>E: step Blue action
        E->>R: docker exec siem ...
        R-->>E: log data
        E-->>T: RangeObservation + reward
    end

    Note over T,R: Episode ends: flag captured, max steps, or timeout
```

### Curriculum Feedback

The Builder acts as a **simulated expert curriculum designer**. Episode results feed back to shape future snapshots:

1. Track Red solve rate and Blue detection rate per snapshot (per vuln class, per tier)
2. Feed failure stats to Builder as `runtime_context` on next build
3. Builder LLM adjusts difficulty via `r_inject = 1 - (1+alpha)*s` (frontier calibration from SWE-RL)
4. Target agent weaknesses: if Red masters SQLi, seed SSRF or chained vulns next
5. When agents plateau: horizontal growth (add containers, zones, services)

## Snapshot Artifacts

Each validated snapshot contains:

| Artifact | What it is | Example |
|----------|-----------|---------|
| **Topology graph** | Hosts, services, users, network zones, trust edges | 8 containers, 4 zones, 12 users, firewall rules |
| **Truth graph** | Bug location, exploit chain, blast radius, remediation steps | SQLi in web/search.php, chain to DB creds, fix: parameterize |
| **Evidence spec** | Logs, alerts, files, tickets that Blue can investigate | web access logs show injection patterns, SIEM alert rule fires |
| **Task set** | What Red and Blue are asked to do | Red: find and exploit vulns. Blue: detect, investigate, patch |
| **Docker artifacts** | Compose file, Dockerfiles, configs, app code, flag files | docker-compose.yml, nginx.conf, init.sql, app.php |
| **Golden path** | Step-by-step exploit chain with expected outputs | nmap, curl, sqlmap, pivot, submit_flag |
| **NPC traffic config** | Background traffic rates per service | HTTP: 30/min, SMTP: 5/min, LDAP: 10/min |
| **NPC persona cards** | LLM-driven employee personas with susceptibility profiles | Janet Smith (Marketing, security_awareness=0.3, phishing_susceptibility=0.7) |

## Reward Architecture

All rewards live in `server/rewards.py`. Composite classes use a weighted sum with configurable weights, plus tier-based complexity scaling and additive penalties.

```
CompositeRedReward
  β”œβ”€β”€ FlagReward                binary, docker exec verified (weight: 0.40)
  β”œβ”€β”€ EfficiencyReward          gamma^steps (weight: 0.10)
  β”œβ”€β”€ StealthReward             1 - (detected/total), coupled to Blue (weight: 0.15)
  β”œβ”€β”€ EvidenceReward            structural quality of submit_evidence (weight: 0.10)
  β”œβ”€β”€ SocialEngineeringReward   NPC fell for phish/pretext, Level 1+ (weight: 0.05)
  β”œβ”€β”€ ComplexityBonus           tier_multiplier * base_reward (1.0x-3.0x)
  └── HallucinationPenalty      -0.3 per fake flag (additive, not tier-scaled)

CompositeBlueReward
  β”œβ”€β”€ DetectionReward           TP rate vs Red action log (weight: 0.35)
  β”œβ”€β”€ PatchReward               binary, golden path re-execution (weight: 0.25)
  β”œβ”€β”€ AvailabilityReward        healthcheck fraction (weight: 0.15)
  β”œβ”€β”€ PhishingDetectionReward   correctly identified SE in logs, Level 1+ (weight: 0.05)
  β”œβ”€β”€ ComplexityBonus           tier_multiplier * base_reward (1.0x-3.0x)
  └── FalsePositiveReward       -0.2 per NPC traffic/email flagged (additive, not tier-scaled)
```

Both composite classes expose a `compute(action, observation, state, snapshot, ctx)` method called by `RangeEnvironment.step()`, as well as a lower-level `score()` with explicit arguments for unit testing.

Rewards are computed from **container state and action logs**, never from LLM judgment.

### Tier-Scaled Reward Ceiling

Reward ceilings scale with environment complexity so that harder snapshots produce proportionally larger training signals:

| Tier | Hosts | Multiplier | Max Red Reward | Max Blue Reward |
|------|-------|-----------|----------------|-----------------|
| 1 | 6-8 | 1.0x | 1.0 | 1.0 |
| 2 | 10-12 | 1.5x | 1.5 | 1.5 |
| 3 | 14-18 | 2.0x | 2.0 | 2.0 |
| 4 | 20-25 | 2.5x | 2.5 | 2.5 |
| 5 | 30+ | 3.0x | 3.0 | 3.0 |

This ensures agents are incentivized to attempt harder environments rather than grinding easy Tier 1 snapshots.

## NPC Evolution: Shell Scripts to LLM Agents

NPCs progress from mechanical noise generators to intelligent social engineering targets. Each level adds a modality without removing the previous one.

```mermaid
flowchart TB
    subgraph l0 [Level 0 - Shell Scripts]
        SH[curl/sshpass/mysql loops<br/>Configurable rates per service<br/>Labeled for FP scoring]
    end

    subgraph l1 [Level 1 - LLM Email NPCs]
        PERSONA[Persona cards from Builder<br/>name, role, security_awareness,<br/>susceptibility profile]
        MAIL_NPC[Async LLM agent checks Postfix<br/>Decides: click, reply, ignore, report<br/>Creates realistic log trail]
        PERSONA --> MAIL_NPC
    end

    subgraph l2 [Level 2+ - Multimodal Stretch]
        CHAT[Internal chat/IM]
        VOICE[Voice via TTS/STT]
        DOCS[Document inspection via vision LLM]
    end

    l0 -->|always running| l1
    l1 -->|adds social engineering surface| l2

    style l0 fill:#6bcb7722,stroke:#6bcb77
    style l1 fill:#4a9eff22,stroke:#4a9eff
    style l2 fill:#7c73e622,stroke:#7c73e6
```

**Key design**: NPC LLM calls are **async, not in the step() hot path**. Red sends a phishing email to Postfix in one step. The NPC agent processes it on its own schedule (per `email_check_interval_min`). Red observes the result in later steps via access logs, new sessions, or SIEM alerts. Blue sees the same logs and must distinguish legitimate NPC-to-NPC email from Red's social engineering.

**Implementations** (all in `builder/npc/npc_agent.py`):

| Class | Level | LLM? | Description |
|-------|-------|------|-------------|
| `NullNPCBehavior` | 0 | No | No-op; always returns `ignore`. Shell scripts handle all traffic. |
| `RuleBasedNPCBehavior` | 0-1 | No | Heuristic decisions based on `susceptibility * plausibility` score thresholds. |
| `LLMNPCAgent` | 1+ | Yes | Full LLM-driven persona. Runs an async `run_loop()` polling for stimuli on the persona's schedule. |

The `NPCManager` (`builder/npc/npc_manager.py`) orchestrates both levels: it starts Level 0 shell scripts (`http_traffic.sh`, `db_traffic.sh`, `ssh_traffic.sh`) and, when `npc_traffic.level >= 1`, spawns `LLMNPCAgent.run_loop()` as asyncio tasks for each persona.

## Pluggable Infrastructure Components

Builder, NPC behavior, validator checks, and Red/Blue agents are all **pluggable via Protocol-based structural subtyping**. No base class inheritance required. Any class with a matching method signature satisfies the protocol.

See [`docs/agent-protocols.md`](agent-protocols.md) for the full design.

### Four Protocols

```python
# protocols.py β€” infrastructure components
@runtime_checkable
class SnapshotBuilder(Protocol):
    async def build(self, manifest: dict, context: BuildContext) -> SnapshotSpec: ...

@runtime_checkable
class NPCBehavior(Protocol):
    async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction: ...

@runtime_checkable
class ValidatorCheck(Protocol):
    async def check(self, snapshot: SnapshotSpec, containers: ContainerSet) -> CheckResult: ...

# agents/protocol.py β€” Red/Blue agents
@runtime_checkable
class RangeAgent(Protocol):
    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None: ...
    def act(self, observation: str) -> str: ...
```

### Configuration via YAML

```yaml
# openrange.yaml
agents:
  builder:
    class: open_range.builder.builder.LLMSnapshotBuilder
    kwargs:
      model: "anthropic/claude-sonnet-4-20250514"
      temperature: 0.7
  npc_behavior:
    class: open_range.builder.npc.npc_agent.LLMNPCAgent
    kwargs:
      model: "anthropic/claude-haiku-4-5-20251001"
  validator_checks:
    - class: open_range.validator.build_boot.BuildBootCheck
    - class: open_range.validator.exploitability.ExploitabilityCheck
    - class: open_range.validator.patchability.PatchabilityCheck
    - class: open_range.validator.evidence.EvidenceSufficiencyCheck
    - class: open_range.validator.reward_grounding.RewardGroundingCheck
    - class: open_range.validator.isolation.IsolationLeakageCheck
    - class: open_range.validator.task_feasibility.TaskFeasibilityCheck
    - class: open_range.validator.difficulty.DifficultyCheck
    - class: open_range.validator.npc_consistency.NPCConsistencyCheck
    - class: open_range.validator.realism_review.RealismReviewCheck
    # add, remove, or reorder checks as needed
```

### Resolution

Dynamic import + Protocol check at startup:

```python
def resolve_component(class_path: str, kwargs: dict, protocol: type) -> Any:
    module_name, _, class_name = class_path.rpartition(".")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    instance = cls(**kwargs)
    if not isinstance(instance, protocol):
        raise TypeError(f"{class_path} does not satisfy {protocol.__name__}")
    return instance
```

### Default Implementations

| Protocol | Default | Alternatives |
|----------|---------|-------------|
| `SnapshotBuilder` | `LLMSnapshotBuilder` (LiteLLM) | `TemplateOnlyBuilder` (testing), `FileBuilder` (demo) |
| `NPCBehavior` | `NullNPCBehavior` (Level 0, no-op) | `LLMNPCAgent` (Level 1+, LiteLLM), `RuleBasedNPCBehavior` (heuristic, no LLM) |
| `ValidatorCheck` | 8 mechanical + 2 LLM advisory | Add, remove, or reorder via config |
| `RangeAgent` | `ScriptedAgent` (replay commands) | `LLMRangeAgent` (LiteLLM), `HumanAgent` (interactive stdin), `ScriptedRedAgent`/`ScriptedBlueAgent` (pre-built demo sequences) |

### Environment Variables

Env vars override YAML config at deploy time:

| Env Var | Overrides | Default |
|---------|-----------|---------|
| `OPENRANGE_BUILDER_MODEL` | Builder LLM model | `anthropic/claude-sonnet-4-20250514` |
| `OPENRANGE_NPC_MODEL` | NPC LLM model | `anthropic/claude-haiku-4-5-20251001` |
| `LITELLM_API_KEY` | Global API key | (or model-specific keys) |

Checks 1-8 are **purely mechanical** -- deterministic, no LLM. Check 9 (NPC consistency) uses an LLM for NPC persona testing. Check 10 (realism review) is an LLM advisory check. Both LLM checks are advisory (`advisory=True`): failure triggers retry but never blocks admission. Both are configurable -- remove them from the validator_checks list to run fully mechanical.