Agent Protocols: Pluggable Components
Design Principle
OpenRange has four pluggable Protocol-based components:
| Component | Role | Hot Path? | Default |
|---|---|---|---|
| Builder | Generate snapshot specs from manifests | No (async between episodes) | LLMSnapshotBuilder via LiteLLM |
| NPC Behavior | Decide NPC response to stimuli | No (async on NPC schedule) | LLMNPCAgent via LiteLLM |
| Validator Checks | Admission gate checks | No (async between episodes) | 6 mechanical by default; 2 further mechanical and 2 LLM advisory checks are opt-in |
| RangeAgent | Red/Blue agent playing in episodes | Yes (in episode step loop) | LLMRangeAgent via LiteLLM |
The first three are infrastructure components that happen to use LLMs. RangeAgent is the training/evaluation agent interface (Red or Blue).
All four follow the same pluggability pattern:
- Protocol defines the interface (structural subtyping, no inheritance)
- Default implementation uses LiteLLM for model-agnostic LLM access
- Configuration via YAML manifest (class path + kwargs)
- Resolution via dynamic import + Protocol check at startup
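To make the first and last bullets concrete, here is a minimal sketch of structural subtyping with a runtime-checkable Protocol. AgentLike, EchoAgent, and Incomplete are illustrative names, not OpenRange classes. Keep in mind that isinstance against a runtime_checkable Protocol only verifies that the methods exist; it does not compare signatures or check whether a method is async.

```python
from typing import Literal, Protocol, runtime_checkable


@runtime_checkable
class AgentLike(Protocol):
    """Illustrative stand-in for a two-method agent protocol."""

    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None: ...
    def act(self, observation: str) -> str: ...


class EchoAgent:
    """Satisfies AgentLike without inheriting from it."""

    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None:
        self.role = role

    def act(self, observation: str) -> str:
        return "echo hello"


class Incomplete:
    """Has act() but no reset(), so the structural check fails."""

    def act(self, observation: str) -> str:
        return "ls"


# The check looks only at method presence, not at inheritance.
echo_ok = isinstance(EchoAgent(), AgentLike)
incomplete_ok = isinstance(Incomplete(), AgentLike)
```

Because the check is this shallow, a class with the right method names but wrong parameters would still pass at startup and fail later at call time.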
Protocols
All protocols are defined in src/open_range/protocols.py (infrastructure) and src/open_range/agents/protocol.py (RangeAgent).
from typing import Literal, Protocol, runtime_checkable

# ---------------------------------------------------------------------------
# Builder: generates candidate snapshot specs
# src/open_range/protocols.py
# ---------------------------------------------------------------------------
@runtime_checkable
class SnapshotBuilder(Protocol):
    """Generate a candidate snapshot spec from a manifest."""

    async def build(
        self,
        manifest: dict,
        context: BuildContext,
    ) -> SnapshotSpec: ...

# ---------------------------------------------------------------------------
# NPC Behavior: decides NPC response to stimuli
# src/open_range/protocols.py
# ---------------------------------------------------------------------------
@runtime_checkable
class NPCBehavior(Protocol):
    """Decide how an NPC responds to a stimulus."""

    async def decide(
        self,
        persona: NPCPersona,
        stimulus: Stimulus,
    ) -> NPCAction: ...

# ---------------------------------------------------------------------------
# Validator Check: single admission check in the validation pipeline
# src/open_range/protocols.py
# ---------------------------------------------------------------------------
@runtime_checkable
class ValidatorCheck(Protocol):
    """Single check in the validator admission pipeline."""

    async def check(
        self,
        snapshot: SnapshotSpec,
        containers: ContainerSet,
    ) -> CheckResult: ...

# ---------------------------------------------------------------------------
# RangeAgent: Red or Blue agent playing in episodes
# src/open_range/agents/protocol.py
# ---------------------------------------------------------------------------
@runtime_checkable
class RangeAgent(Protocol):
    """Agent that can play Red or Blue in OpenRange.

    NOTE: Methods are synchronous (not async), unlike the infrastructure
    protocols above. This keeps the agent interface simple for training
    integrations.
    """

    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None:
        """Initialize agent for a new episode.

        Args:
            briefing: Task description from the snapshot.
            role: Which side this agent plays ("red" or "blue").
        """
        ...

    def act(self, observation: str) -> str:
        """Given an observation, return the next command to execute.

        Args:
            observation: stdout from the previous step, or initial briefing.

        Returns:
            Shell command string (e.g. "nmap -sV 10.0.1.0/24").
        """
        ...
Default Implementations
RangeAgent
| Implementation | File | When to use | LLM? |
|---|---|---|---|
| LLMRangeAgent | src/open_range/agents/llm_agent.py | Production: model-agnostic via LiteLLM | Yes (LiteLLM) |
| ScriptedAgent | src/open_range/agents/replay_agent.py | Testing/CI/demos: replays fixed command list | No |
| HumanAgent | src/open_range/agents/human_agent.py | Manual play/debugging: stdin/stdout | No |
class LLMRangeAgent:
    """Generic agent powered by any LiteLLM model."""

    def __init__(
        self,
        model: str = "anthropic/claude-sonnet-4-20250514",
        temperature: float = 0.3,
        max_tokens: int = 512,
        **litellm_kwargs,  # e.g. api_base, api_key
    ) -> None: ...

    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None:
        """Initialize conversation history with role-specific system prompt."""
        ...

    def act(self, observation: str) -> str:
        """Call litellm.completion, extract shell command from response."""
        ...

class ScriptedAgent:
    """Replays a fixed list of commands. After exhaustion, repeats fallback."""

    def __init__(
        self,
        commands: list[str] | None = None,
        fallback: str = "echo done",
    ) -> None: ...

    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None: ...
    def act(self, observation: str) -> str: ...

class HumanAgent:
    """Interactive agent: prints observations, reads commands from stdin."""

    def __init__(self, prompt: str = "Enter command > ") -> None: ...
    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None: ...
    def act(self, observation: str) -> str: ...
Pre-built demo agents are also available as ScriptedRedAgent and ScriptedBlueAgent in src/open_range/agents/replay_agent.py.
Builder
| Implementation | File | When to use | LLM? |
|---|---|---|---|
| LLMSnapshotBuilder | src/open_range/builder/builder.py | Production: creative snapshot generation | Yes (LiteLLM) |
| TemplateOnlyBuilder | src/open_range/builder/builder.py | Testing/CI: deterministic, no API calls | No |
| FileBuilder | src/open_range/builder/builder.py | Demo: load pre-built snapshot from JSON file | No |
import os

class LLMSnapshotBuilder:
    """Default builder: uses LiteLLM to generate snapshot specs."""

    def __init__(
        self,
        model: str | None = None,
        prompt_template: str | None = None,
        temperature: float = 0.7,
        max_retries: int = 3,
    ):
        # An explicit model argument wins; otherwise fall back to the env var,
        # then to the built-in default.
        self.model = model or os.environ.get(
            "OPENRANGE_BUILDER_MODEL", "anthropic/claude-sonnet-4-20250514"
        )
        ...

    async def build(self, manifest: dict, context: BuildContext) -> SnapshotSpec: ...

class TemplateOnlyBuilder:
    """Deterministic builder for testing. No LLM calls."""

    def __init__(self, vuln_pool: list[dict] | None = None): ...
    async def build(self, manifest: dict, context: BuildContext) -> SnapshotSpec: ...

class FileBuilder:
    """Load a pre-built snapshot from disk. For demos and smoke tests."""

    def __init__(self, snapshot_dir: str): ...
    async def build(self, manifest: dict, context: BuildContext) -> SnapshotSpec: ...
NPC Behavior
All NPC implementations live in src/open_range/builder/npc/npc_agent.py.
| Implementation | When to use | LLM? |
|---|---|---|
| LLMNPCAgent | Level 1+: persona-driven decisions | Yes (LiteLLM) |
| RuleBasedNPCBehavior | Mid-ground: heuristic susceptibility checks | No |
| NullNPCBehavior | Level 0: shell scripts handle everything | No |
import os

class LLMNPCAgent:
    """Async LLM NPC agent that responds to stimuli based on persona.

    Also has a run_loop() method for polling a mailbox on a schedule
    (not part of the NPCBehavior protocol, but useful for live episodes).
    """

    def __init__(
        self,
        model: str | None = None,
        temperature: float = 0.3,
    ) -> None:
        self.model = model or os.environ.get(
            "OPENRANGE_NPC_MODEL", "anthropic/claude-haiku-4-5-20251001"
        )
        ...

    async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction: ...
    async def run_loop(self, persona: NPCPersona, containers: ContainerSet) -> None: ...
class RuleBasedNPCBehavior:
    """Heuristic NPC decisions based on susceptibility scores. No LLM calls."""

    async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
        # Fall back to the phishing_email susceptibility, then to 0.5
        susceptibility = persona.susceptibility.get(
            f"{stimulus.type}", persona.susceptibility.get("phishing_email", 0.5)
        )
        score = stimulus.plausibility * susceptibility
        if persona.security_awareness > 0.7 and score < 0.8:
            return NPCAction(action="report_to_IT", ...)
        elif score > 0.6:
            return NPCAction(action="click_link", ...)
        elif score > 0.3:
            return NPCAction(action="ignore")
        else:
            return NPCAction(action="report_to_IT", ...)

class NullNPCBehavior:
    """No-op. Level 0 shell scripts handle all NPC traffic."""

    async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
        return NPCAction(action="ignore")
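To see how the RuleBasedNPCBehavior thresholds play out, the following self-contained sketch reproduces the same branching with plain dataclasses. Persona, Stim, and decide_action are hypothetical stand-ins for NPCPersona, Stimulus, and the decide() method, and the sample values are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical stand-in for NPCPersona."""
    security_awareness: float
    susceptibility: dict[str, float] = field(default_factory=dict)


@dataclass
class Stim:
    """Hypothetical stand-in for Stimulus."""
    type: str
    plausibility: float


def decide_action(persona: Persona, stimulus: Stim) -> str:
    """Same branching as the rule-based behavior, returning only the action name."""
    susceptibility = persona.susceptibility.get(
        stimulus.type, persona.susceptibility.get("phishing_email", 0.5)
    )
    score = stimulus.plausibility * susceptibility
    if persona.security_awareness > 0.7 and score < 0.8:
        return "report_to_IT"
    if score > 0.6:
        return "click_link"
    if score > 0.3:
        return "ignore"
    return "report_to_IT"


careless = Persona(security_awareness=0.2, susceptibility={"phishing_email": 0.9})
aware = Persona(security_awareness=0.9, susceptibility={"phishing_email": 0.9})

results = [
    decide_action(careless, Stim("phishing_email", 0.9)),  # score 0.81
    decide_action(careless, Stim("phishing_email", 0.5)),  # score 0.45
    decide_action(careless, Stim("phishing_email", 0.2)),  # score 0.18
    decide_action(aware, Stim("phishing_email", 0.5)),     # score 0.45
]
```

With these inputs the careless persona clicks the highly plausible email, ignores the middling one, and reports the implausible one, while the aware persona reports the middling one. Note that under these rules a security-aware persona still clicks once the score reaches 0.8.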
Validator Checks
Each check is a separate class in src/open_range/validator/. The validator pipeline is a list of checks -- add, remove, or reorder via config.
import os

# Mechanical checks (no LLM)
class BuildBootCheck:  # validator/build_boot.py: docker compose up + healthchecks
    async def check(self, snapshot, containers) -> CheckResult: ...

class ExploitabilityCheck:  # validator/exploitability.py: golden path end-to-end
    async def check(self, snapshot, containers) -> CheckResult: ...

class PatchabilityCheck:  # validator/patchability.py: inverse mutation test
    async def check(self, snapshot, containers) -> CheckResult: ...

class EvidenceCheck:  # validator/evidence.py: logs + alerts exist
    async def check(self, snapshot, containers) -> CheckResult: ...

class RewardGroundingCheck:  # validator/reward_grounding.py: rubrics produce valid scores
    async def check(self, snapshot, containers) -> CheckResult: ...

class IsolationCheck:  # validator/isolation.py: zones enforced, no leaks
    async def check(self, snapshot, containers) -> CheckResult: ...

class TaskFeasibilityCheck:  # validator/task_feasibility.py: tasks reference real hosts
    async def check(self, snapshot, containers) -> CheckResult: ...

class DifficultyCheck:  # validator/difficulty.py: golden path steps within tier target
    async def check(self, snapshot, containers) -> CheckResult: ...

# LLM checks (advisory: failure triggers retry, never blocks)
class NPCConsistencyCheck:  # validator/npc_consistency.py
    """Tests NPC personas with calibrated phishing stimuli via LLM."""

    def __init__(self, model: str | None = None):
        self.model = model or os.environ.get(
            "OPENRANGE_NPC_MODEL", "anthropic/claude-haiku-4-5-20251001"
        )

    async def check(self, snapshot, containers) -> CheckResult: ...

class RealismReviewCheck:  # validator/realism_review.py
    """LLM-based realism review. Advisory only: can trigger retry,
    never overrides mechanical pass. Remove from check list to skip."""

    def __init__(self, model: str | None = None):
        self.model = model or os.environ.get(
            "OPENRANGE_VALIDATOR_MODEL", "anthropic/claude-haiku-4-5-20251001"
        )

    async def check(self, snapshot, containers) -> CheckResult: ...
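One possible reading of "the pipeline is a list of checks" with advisory LLM checks is sketched below. Everything here is an assumption for illustration: the advisory class attribute, the Result dataclass, and run_checks are hypothetical names, and the actual validator may track advisory status and retries differently.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Result:
    """Hypothetical stand-in for CheckResult."""
    passed: bool
    details: dict = field(default_factory=dict)


class MechanicalPass:
    advisory = False  # hypothetical marker, not part of the ValidatorCheck protocol

    async def check(self, snapshot, containers) -> Result:
        return Result(passed=True)


class MechanicalFail:
    advisory = False

    async def check(self, snapshot, containers) -> Result:
        return Result(passed=False, details={"reason": "healthcheck failed"})


class AdvisoryFail:
    advisory = True

    async def check(self, snapshot, containers) -> Result:
        return Result(passed=False, details={"note": "persona looks unrealistic"})


async def run_checks(checks, snapshot, containers) -> tuple[bool, bool]:
    """Run checks in list order; return (admitted, retry_requested)."""
    retry_requested = False
    for check in checks:
        result = await check.check(snapshot, containers)
        if result.passed:
            continue
        if getattr(check, "advisory", False):
            # Advisory failures ask for another build but never reject.
            retry_requested = True
        else:
            # A mechanical failure rejects the snapshot immediately.
            return False, retry_requested
    return True, retry_requested


advisory_only = asyncio.run(run_checks([MechanicalPass(), AdvisoryFail()], None, None))
mechanical = asyncio.run(run_checks([MechanicalFail(), AdvisoryFail()], None, None))
```

Because the pipeline is just an ordered list, putting cheap mechanical checks first means expensive LLM checks only run on snapshots that already boot and exploit correctly.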
Configuration
All component selection happens in the manifest YAML (or a separate openrange.yaml). This keeps everything in one place and version-controllable. Class paths are fully-qualified dotted Python paths.
# openrange.yaml: component configuration
agents:
  builder:
    class: open_range.builder.builder.LLMSnapshotBuilder
    kwargs:
      model: "anthropic/claude-sonnet-4-20250514"
      prompt_template: "prompts/builder_v2.txt"
      temperature: 0.7
      max_retries: 3
  npc_behavior:
    class: open_range.builder.npc.npc_agent.LLMNPCAgent
    kwargs:
      model: "anthropic/claude-haiku-4-5-20251001"
      temperature: 0.3
  validator_checks:
    - class: open_range.validator.build_boot.BuildBootCheck
    - class: open_range.validator.exploitability.ExploitabilityCheck
    - class: open_range.validator.patchability.PatchabilityCheck
    - class: open_range.validator.evidence.EvidenceCheck
    - class: open_range.validator.reward_grounding.RewardGroundingCheck
    - class: open_range.validator.isolation.IsolationCheck
    - class: open_range.validator.task_feasibility.TaskFeasibilityCheck
    - class: open_range.validator.difficulty.DifficultyCheck
    - class: open_range.validator.npc_consistency.NPCConsistencyCheck  # LLM advisory
      kwargs:
        model: "anthropic/claude-haiku-4-5-20251001"
    - class: open_range.validator.realism_review.RealismReviewCheck  # LLM advisory, remove to skip
      kwargs:
        model: "anthropic/claude-haiku-4-5-20251001"
Override via Environment Variables
LiteLLM model strings can always be overridden by env vars (useful for CI, testing, different providers):
| Env Var | Overrides | Example |
|---|---|---|
| OPENRANGE_BUILDER_MODEL | Builder model | gpt-4o, ollama/llama3, anthropic/claude-sonnet-4-20250514 |
| OPENRANGE_NPC_MODEL | NPC model | anthropic/claude-haiku-4-5-20251001, ollama/phi3 |
| LITELLM_API_KEY | Global API key | (or model-specific: ANTHROPIC_API_KEY, OPENAI_API_KEY) |
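Env vars are meant to take precedence over YAML config, so you can define the architecture in YAML and swap models at deploy time. Note that the default constructors shown above resolve the model as model or os.environ.get(...), which reads the env var only when no model kwarg is passed; leave model out of the YAML kwargs wherever the env var should decide.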
Testing Profile
# openrange-test.yaml: no LLM calls, deterministic
agents:
  builder:
    class: open_range.builder.builder.TemplateOnlyBuilder
    kwargs:
      vuln_pool: "vulns/test_pool.json"
  npc_behavior:
    class: open_range.builder.npc.npc_agent.NullNPCBehavior
  validator_checks:
    - class: open_range.validator.build_boot.BuildBootCheck
    # Skip slow checks in tests
Demo Profile
# openrange-demo.yaml: pre-built snapshots, fast resets
agents:
  builder:
    class: open_range.builder.builder.FileBuilder
    kwargs:
      snapshot_dir: "snapshots/demo/"
  npc_behavior:
    class: open_range.builder.npc.npc_agent.RuleBasedNPCBehavior
  validator_checks: []  # Pre-validated snapshots, skip validation
Resolution
Dynamic import with Protocol check at startup. Defined in src/open_range/resolve.py.
import importlib
from pathlib import Path
from typing import Any, Type

import yaml

from open_range.protocols import NPCBehavior, SnapshotBuilder, ValidatorCheck

# _missing_methods and DEFAULT_CHECKS are defined elsewhere (not shown here).

def resolve_component(class_path: str, kwargs: dict, protocol: Type) -> Any:
    """Import class by dotted path, instantiate, verify protocol compliance.

    Args:
        class_path: e.g. "open_range.builder.builder.LLMSnapshotBuilder"
        kwargs: Constructor keyword arguments
        protocol: Protocol class to check against

    Returns:
        Instantiated component satisfying the protocol.

    Raises:
        TypeError: If the class doesn't satisfy the protocol.
        ImportError: If the module can't be imported.
        AttributeError: If the class doesn't exist in the module.
    """
    # Split "pkg.module.ClassName" into module path and class name
    module_name, _, class_name = class_path.rpartition(".")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    instance = cls(**kwargs)
    if not isinstance(instance, protocol):
        missing = _missing_methods(instance, protocol)
        raise TypeError(
            f"{class_path} does not satisfy {protocol.__name__} protocol. "
            f"Missing methods: {missing}"
        )
    return instance

def load_agent_config(config_path: str) -> dict:
    """Load agent configuration from YAML. Returns the 'agents' block."""
    path = Path(config_path)
    if not path.exists():
        return {}
    with open(path) as f:
        config = yaml.safe_load(f) or {}
    return config.get("agents", {})

def build_components(config: dict) -> tuple[SnapshotBuilder, NPCBehavior, list[ValidatorCheck]]:
    """Resolve all infrastructure components from config dict.

    Defaults when no config provided:
        builder -> open_range.builder.builder.LLMSnapshotBuilder
        npc     -> open_range.builder.npc.npc_agent.NullNPCBehavior
        checks  -> DEFAULT_CHECKS (6 mechanical checks)
    """
    builder_cfg = config.get("builder", {})
    builder = resolve_component(
        builder_cfg.get("class", "open_range.builder.builder.LLMSnapshotBuilder"),
        builder_cfg.get("kwargs", {}),
        SnapshotBuilder,
    )
    npc_cfg = config.get("npc_behavior", {})
    npc = resolve_component(
        npc_cfg.get("class", "open_range.builder.npc.npc_agent.NullNPCBehavior"),
        npc_cfg.get("kwargs", {}),
        NPCBehavior,
    )
    checks: list[ValidatorCheck] = []
    # An explicit empty list disables validation; a missing key uses DEFAULT_CHECKS
    for check_cfg in config.get("validator_checks", DEFAULT_CHECKS):
        checks.append(resolve_component(
            check_cfg["class"],
            check_cfg.get("kwargs", {}),
            ValidatorCheck,
        ))
    return builder, npc, checks
The DEFAULT_CHECKS list (used when no validator_checks key is present in config) includes the 6 mechanical checks: BuildBootCheck, ExploitabilityCheck, PatchabilityCheck, EvidenceCheck, RewardGroundingCheck, IsolationCheck. The LLM advisory checks (NPCConsistencyCheck, RealismReviewCheck) and additional mechanical checks (TaskFeasibilityCheck, DifficultyCheck) must be explicitly added via config.
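The resolver above calls a _missing_methods helper that is not shown. A minimal sketch of what such a helper could look like follows; the name missing_methods and its approach (scanning the public callables declared directly on the Protocol class) are assumptions, not the project's implementation.

```python
from typing import Protocol, runtime_checkable


def missing_methods(instance: object, protocol: type) -> list[str]:
    """List public protocol methods that the instance does not provide.

    Only members declared directly on the protocol class are considered;
    members inherited from a parent protocol are not scanned in this sketch.
    """
    required = [
        name
        for name, value in vars(protocol).items()
        if callable(value) and not name.startswith("_")
    ]
    return [
        name for name in required
        if not callable(getattr(instance, name, None))
    ]


@runtime_checkable
class TwoMethod(Protocol):
    """Illustrative protocol with two required methods."""

    def reset(self, briefing: str, role: str) -> None: ...
    def act(self, observation: str) -> str: ...


class OnlyAct:
    def act(self, observation: str) -> str:
        return "id"


class Both:
    def reset(self, briefing: str, role: str) -> None:
        pass

    def act(self, observation: str) -> str:
        return "id"


gaps = missing_methods(OnlyAct(), TwoMethod)
no_gaps = missing_methods(Both(), TwoMethod)
```

Reporting the missing names in the TypeError turns a bare isinstance failure at startup into an actionable message.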
How Components Wire Together
flowchart TB
CONFIG[openrange.yaml<br/>agent class paths + kwargs] --> RESOLVE[resolve_component<br/>importlib + Protocol check]
RESOLVE --> BLD[SnapshotBuilder]
RESOLVE --> NPC[NPCBehavior]
RESOLVE --> VAL[ValidatorCheck x N]
subgraph snapshot_loop [Async Snapshot Creation]
BLD -->|build| SPEC[SnapshotSpec]
SPEC --> VAL
VAL -->|all pass| STORE[Snapshot Store]
VAL -->|any fail| BLD
end
subgraph episode [Episode Loop]
STORE -->|reset selects frozen snapshot| ENV[RangeEnvironment]
AGENT[RangeAgent<br/>LLM / Scripted / Human] -->|act| ENV
ENV -->|step| DOCKER[Docker containers]
DOCKER --> OBS[Observation + Reward]
OBS -->|observation| AGENT
end
subgraph npc_loop [Async NPC Loop]
STIM[Stimulus from Postfix] --> NPC
NPC -->|action| SIDE[Side effects<br/>click link, reply, report]
SIDE --> LOGS[Container logs<br/>visible to Blue]
end
style CONFIG fill:#4a9eff22,stroke:#4a9eff
style RESOLVE fill:#ffd93d22,stroke:#ffd93d
style snapshot_loop fill:#ff6b6b11,stroke:#ff6b6b
style episode fill:#6bcb7711,stroke:#6bcb77
style npc_loop fill:#7c73e611,stroke:#7c73e6
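The episode subgraph in the diagram boils down to a reset followed by an act/step loop. The sketch below shows that loop with a fake environment; FakeEnv, ListAgent, and run_episode are hypothetical, and the real RangeEnvironment returns richer observations and rewards than the plain strings used here.

```python
from typing import Literal


class FakeEnv:
    """Hypothetical stand-in for the environment: echoes commands back."""

    def __init__(self, briefing: str) -> None:
        self.briefing = briefing
        self.executed: list[str] = []

    def reset(self) -> str:
        self.executed.clear()
        return self.briefing

    def step(self, command: str) -> str:
        self.executed.append(command)
        return f"ran: {command}"


class ListAgent:
    """Tiny agent matching the reset/act shape; replays a fixed list."""

    def __init__(self, commands: list[str]) -> None:
        self._commands = commands
        self._i = 0

    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None:
        self._i = 0

    def act(self, observation: str) -> str:
        command = self._commands[self._i % len(self._commands)]
        self._i += 1
        return command


def run_episode(agent, env, role: Literal["red", "blue"], max_steps: int = 3) -> list[str]:
    """Reset both sides, then alternate act() and step() for max_steps."""
    observation = env.reset()
    agent.reset(observation, role)
    observations = []
    for _ in range(max_steps):
        command = agent.act(observation)   # agent decides on the last output
        observation = env.step(command)    # environment executes the command
        observations.append(observation)
    return observations


env = FakeEnv("Find the flag on the web server.")
trace = run_episode(ListAgent(["whoami", "ls /"]), env, "red")
```

The first observation passed to act() is the briefing itself, which matches the protocol's description of observation as "stdout from the previous step, or initial briefing".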
Extending: Bring Your Own Agent
Write a class with def reset(self, briefing, role) -> None and def act(self, observation) -> str. That's it.
# my_agent.py
from typing import Literal

class MyFineTunedAgent:
    """Uses a locally fine-tuned model as a Red/Blue agent."""

    def __init__(self, model_path: str, device: str = "cuda"):
        # load_model is a placeholder for your own model-loading code
        self.model = load_model(model_path, device)

    def reset(self, briefing: str, role: Literal["red", "blue"]) -> None:
        self.history = [briefing]
        self.role = role

    def act(self, observation: str) -> str:
        self.history.append(observation)
        return self.model.generate("\n".join(self.history))
No registration, no base class, no plugin system. Just match the Protocol signature.
Extending: Bring Your Own Builder
Write a class with async def build(self, manifest, context) -> SnapshotSpec. That's it.
# my_custom_builder.py
class FineTunedBuilder:
    """Uses a fine-tuned local model for snapshot generation."""

    def __init__(self, model_path: str, device: str = "cuda"):
        self.model = load_model(model_path, device)

    async def build(self, manifest: dict, context: BuildContext) -> SnapshotSpec:
        prompt = render_builder_prompt(manifest, context)
        output = self.model.generate(prompt)
        return SnapshotSpec.model_validate_json(output)
# openrange.yaml
agents:
  builder:
    class: my_custom_builder.FineTunedBuilder
    kwargs:
      model_path: "/models/builder-ft-v3"
      device: "cuda:0"
No registration, no base class, no plugin system. Just match the Protocol signature and point the config at it.
Extending: Bring Your Own NPC
# my_npc.py
import os

class VoiceNPC:
    """Level 3 NPC: processes voice stimuli via Whisper + LLM."""

    def __init__(self, whisper_model: str = "base", llm_model: str | None = None):
        self.whisper = load_whisper(whisper_model)
        self.llm_model = llm_model or os.environ.get("OPENRANGE_NPC_MODEL")

    async def decide(self, persona: NPCPersona, stimulus: Stimulus) -> NPCAction:
        if stimulus.type == "voice":
            # Transcribe the audio and treat the result as a text stimulus
            text = self.whisper.transcribe(stimulus.audio_path)
            stimulus = stimulus.model_copy(update={"content": text, "type": "text"})
        # Fall through to LLM decision
        return await llm_decide(self.llm_model, persona, stimulus)
Extending: Bring Your Own Validator Check
# my_checks.py
class CustomSecurityAudit:
    """Run a security scanner against the snapshot."""

    def __init__(self, scanner: str = "trivy"):
        self.scanner = scanner

    async def check(self, snapshot, containers) -> CheckResult:
        result = await containers.exec("attacker", f"{self.scanner} scan --severity HIGH")
        high_vulns = parse_scanner_output(result)
        return CheckResult(
            passed=len(high_vulns) == 0,
            details={"unintended_vulns": high_vulns},
        )
agents:
  validator_checks:
    - class: open_range.validator.build_boot.BuildBootCheck
    - class: open_range.validator.exploitability.ExploitabilityCheck
    - class: my_checks.CustomSecurityAudit
      kwargs:
        scanner: "nuclei"
    # ... rest of pipeline
Key Decisions
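- Protocol over ABC: Structural subtyping means no inheritance or registration coupling. Your implementation never subclasses anything from OpenRange; at most it imports the data types it exchanges (e.g. SnapshotSpec, NPCAction).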
- YAML over code registration: Configuration is data, not code. Version it, diff it, override it per environment.
- Env vars override YAML: Deploy-time model swaps without touching config files.
- LiteLLM is the default, not the requirement: Default implementations use LiteLLM. Custom implementations can use anything: local models, fine-tuned checkpoints, even non-LLM approaches.
- Async for infrastructure, sync for agents: Infrastructure protocols (SnapshotBuilder, NPCBehavior, ValidatorCheck) use async def -- they are never in the step() hot path. RangeAgent uses synchronous def for simpler training integration.
- Validator checks are a list: Add, remove, reorder checks via config. No hardcoded pipeline.