---
title: OpenRange Environment Server
emoji: 🎯
colorFrom: red
colorTo: blue
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
- openenv
- rl-environment
---
# OpenRange
A multi-agent cybersecurity gymnasium on [OpenEnv](https://github.com/meta-pytorch/OpenEnv). Red and Blue agents train on validated enterprise networks that mutate between episodes.
---
## How It Works
A **manifest** declares a family of legal enterprise worlds: topology, services, identities, trust relationships, vulnerability classes, and mutation bounds. A shared **ManagedSnapshotRuntime** inside the shipped OpenEnv server process owns the admitted snapshot population. It compiles a graph-friendly root snapshot from the manifest, normalizing trust-only principals into a canonical principal catalog, then derives child snapshots by applying explicit typed mutations to admitted parents. Parent selection is policy-driven over the admitted population rather than raw latest/random sampling. Each candidate child is validated in layers: manifest compliance, canonical graph checks, structural/task checks, and, in managed-generation mode, booted runtime checks before admission. `reset()` selects one frozen admitted snapshot. `step()` runs commands inside it.
```mermaid
flowchart LR
M[Manifest<br/>legal family +<br/>mutation envelope] --> B[Base snapshot compiler]
B --> P[Admitted root snapshot]
P --> R[ManagedSnapshotRuntime<br/>shared inside server process]
R --> U[Policy-guided parent selector +<br/>typed mutator]
U --> V{Validator<br/>manifest + graph +<br/>runtime checks}
V -->|fail| U
V -->|pass| S[Admitted snapshot population]
S --> E["reset() → step() → obs + reward"]
style V fill:#ffd93d,color:#333
style S fill:#6bcb77,color:#fff
```
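The mutate, validate, admit loop in the diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Snapshot`, `validate`, `grow_population`, and the toy checks are hypothetical names, not the actual `ManagedSnapshotRuntime` API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for an admitted, frozen snapshot."""
    snapshot_id: str
    parent_id: Optional[str]
    hosts: Tuple[str, ...]


# A check returns an error message on failure, or None when the candidate passes.
Check = Callable[[Snapshot], Optional[str]]


def validate(candidate: Snapshot, layers: Sequence[Check]) -> Optional[str]:
    """Run the layers in order and stop at the first failure."""
    for check in layers:
        error = check(candidate)
        if error is not None:
            return error
    return None


def grow_population(root, mutate, layers, select_parent, target_size, max_attempts=100):
    """Derive children from admitted parents; admit only those that pass every layer."""
    population: List[Snapshot] = [root]
    for attempt in range(max_attempts):
        if len(population) >= target_size:
            break
        parent = select_parent(population)   # policy-driven, not latest/random
        candidate = mutate(parent, attempt)  # one explicit, typed mutation
        if validate(candidate, layers) is None:
            population.append(candidate)     # rejected candidates are simply dropped
    return population


# Toy mutation and checks, purely for illustration.
def add_host(parent: Snapshot, attempt: int) -> Snapshot:
    return Snapshot(f"snap-{attempt}", parent.snapshot_id, parent.hosts + (f"host-{attempt}",))


def within_manifest_bounds(s: Snapshot) -> Optional[str]:
    return None if len(s.hosts) <= 3 else "too many hosts for this manifest"


def has_siem(s: Snapshot) -> Optional[str]:
    return None if "siem" in s.hosts else "no SIEM host for Blue"


root = Snapshot("root", None, ("web", "siem"))
population = grow_population(
    root,
    add_host,
    [within_manifest_bounds, has_siem],
    select_parent=lambda pop: min(pop, key=lambda s: len(s.hosts)),
    target_size=3,
)
```

The real validator layers are far richer (graph consistency, path solvability, live container checks), but the control flow is the same: a candidate joins the population only after every layer passes.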
Red and Blue operate on the **same infrastructure simultaneously**. Red's stealth reward depends on whether Blue catches them. Blue's detection reward depends on Red's actual actions in the logs. This coupling drives co-evolution.
## Quick Start
```bash
# Install
git clone https://github.com/open-cybernauts/open-range.git
cd open-range
uv sync
# Optional: enable the LiteLLM-backed builder pipeline
uv sync --extra builder
# Optional: enable LiteLLM-backed synthetic teacher agents
uv sync --extra synthetic
# Optional: enable background refill inside the server
export OPENRANGE_ENABLE_MANAGED_REFILL=1
export OPENRANGE_RUNTIME_BUILDER=llm
# End-to-end demo (no Docker, no LLM)
uv run python examples/demo.py
# Generate synthetic SFT traces from a snapshot or manifest
uv run openrange synthetic-data \
--manifest manifests/tier1_basic.yaml \
--output data/sft_red.jsonl \
--roles red
# Merge local bootstrap traces and tool context into generated output
uv run openrange synthetic-data \
--manifest manifests/tier1_basic.yaml \
--output data/synthetic_sft_5.jsonl \
--num-traces 5 \
--roles red \
--bootstrap-traces data/sft.jsonl \
--tool-info data/tool_info.md
# Run the OpenEnv client against a running server
uv run python examples/remote_client_demo.py --base-url http://localhost:8000
# Run the FastAPI server
uv run server # default: 127.0.0.1:8000
uv run server --port 9000 # custom port
uv run server --host 0.0.0.0 # bind all interfaces
# Or via uvicorn directly
uv run uvicorn server.app:app --host 0.0.0.0 --port 8000 --reload
# Tests
uv run pytest tests/ -v --tb=short
```
## Core Components
**Manifest**: YAML defining the legal world family and mutation envelope: hosts, zones, services, users, NPCs, data assets, credential policies, monitoring coverage, trust relationships, and which vulnerability classes the runtime may plant or extend. Three example manifests ship (healthcare, fintech, SaaS) at tiers 1-3.
**ManagedSnapshotRuntime**: Shared singleton created at server startup. Owns the `SnapshotStore`, base builder, population-aware parent selector, parent-snapshot mutator, validator gate, `SnapshotRenderer`, snapshot preload, optional background refill, and episode-result feedback. This is the hidden orchestrator behind the env; callers still only see `reset()`, `step()`, and `state()`.
**Builder / Mutator**: The base builder compiles an initial `SnapshotSpec` from a manifest. Root hydration then expands that into canonical topology state: host details, dependency edges, trust edges, and a principal catalog that can represent trust-only people without inventing login accounts. The mutator derives child `SnapshotSpec`s from admitted parents using typed mutation plans plus an explicit mutation-policy layer that scores parents and candidate edits with curriculum, replay, novelty, and lineage signals. Each snapshot carries lineage metadata (`snapshot_id`, `parent_snapshot_id`, `root_snapshot_id`, generation depth, mutation summary) and can emit constrained service/app payloads through `SnapshotSpec.files`. Three base builders ship: `LLMSnapshotBuilder` (production, via litellm), `TemplateOnlyBuilder` (deterministic shipped default), `FileBuilder` (load from disk).
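To make the lineage metadata concrete, here is a minimal sketch of how a child record could be derived from its parent. The `Lineage` class and both helper functions are hypothetical. The first three field names come from the description above, while `generation_depth` and `mutation_summary` are assumed spellings of "generation depth" and "mutation summary".

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Lineage:
    """Hypothetical lineage record attached to each snapshot."""
    snapshot_id: str
    parent_snapshot_id: Optional[str]  # None for a root snapshot
    root_snapshot_id: str
    generation_depth: int              # assumed field name
    mutation_summary: Tuple[str, ...]  # assumed field name


def root_lineage(snapshot_id: str) -> Lineage:
    """A root snapshot is its own root and sits at depth 0."""
    return Lineage(snapshot_id, None, snapshot_id, 0, ())


def child_lineage(parent: Lineage, snapshot_id: str, mutations: Sequence[str]) -> Lineage:
    """A child inherits the root id and sits one generation below its parent."""
    return Lineage(
        snapshot_id=snapshot_id,
        parent_snapshot_id=parent.snapshot_id,
        root_snapshot_id=parent.root_snapshot_id,
        generation_depth=parent.generation_depth + 1,
        mutation_summary=tuple(mutations),
    )


root = root_lineage("snap-root")
child = child_lineage(root, "snap-a", ["add_host:vpn"])
grandchild = child_lineage(child, "snap-b", ["plant_vuln:sqli"])
```

The mutation labels (`add_host:vpn`, `plant_vuln:sqli`) are made up for the example. The point is that every snapshot can be traced back to its root in one hop, which is what a lineage-aware parent selector would need.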
The deployed package exposes the standard OpenEnv `reset()`, `step()`, and `state()` contract through `server.app:app`, which is the entrypoint referenced by `openenv.yaml`.
**Validator**: Admission gate for candidate snapshots. The shipped runtime enforces manifest compliance plus graph-native checks such as graph consistency, path solvability, evidence sufficiency, and reward grounding before structural/task checks. With the `training` profile, the runtime boots rendered bundles, applies payload files, constructs a real `ContainerSet`, and runs live build/exploit/patch/evidence/reward/isolation/difficulty/NPC/realism checks before admission.
Validator profile matrix:
| Profile | Checks | Guarantees |
|---------|--------|------------|
| `offline` | Graph + structural/task checks only (no live containers) | Fast static admission only; no live exploitability/patchability guarantee |
| `training` | `offline` checks + live/container-backed checks | Full admission guarantees for managed training/runtime use |
Managed runtime defaults and safety behavior:
- `OPENRANGE_RUNTIME_VALIDATOR_PROFILE` defaults to `training`.
- `OPENRANGE_ENABLE_LIVE_ADMISSION` defaults to `1`.
- If managed runtime is configured non-live (`offline` profile and/or live admission disabled), startup raises an error unless you explicitly opt out with `OPENRANGE_ALLOW_NON_LIVE_ADMISSION=1` (legacy alias: `OPENRANGE_ALLOW_OFFLINE_ADMISSION=1`), in which case a warning is emitted.
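The startup rule in the list above can be read as a small decision function. The sketch below is an interpretation, not the shipped code: `check_admission_mode` is a hypothetical name, and treating the string `"1"` as the only truthy value is an assumption.

```python
import warnings
from typing import Mapping


def check_admission_mode(env: Mapping[str, str]) -> str:
    """Return "live" or "non-live", or raise if non-live mode was not opted into."""
    profile = env.get("OPENRANGE_RUNTIME_VALIDATOR_PROFILE", "training")
    live_enabled = env.get("OPENRANGE_ENABLE_LIVE_ADMISSION", "1") == "1"

    # Either the current flag or its legacy alias counts as an explicit opt-out.
    opted_out = "1" in (
        env.get("OPENRANGE_ALLOW_NON_LIVE_ADMISSION", "0"),
        env.get("OPENRANGE_ALLOW_OFFLINE_ADMISSION", "0"),
    )

    non_live = profile == "offline" or not live_enabled
    if not non_live:
        return "live"
    if not opted_out:
        raise RuntimeError(
            "non-live admission configured; set OPENRANGE_ALLOW_NON_LIVE_ADMISSION=1 to allow it"
        )
    warnings.warn("running with non-live admission: no live exploitability guarantee")
    return "non-live"
```

Passing `os.environ` evaluates the current process configuration. With no variables set, the function returns `"live"`, matching the documented defaults.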
**Environment**: `RangeEnvironment(Environment)` following the OpenEnv contract. `reset()` asks the shared runtime for a frozen admitted snapshot. `step(action)` routes commands to the appropriate container: Red runs on the attacker box, Blue runs on the SIEM. No artificial command allowlists; the container's installed tools are the constraint.
**Rewards** are all grounded in container state, not LLM judgment:
| Red | Blue |
|-----|------|
| Flag capture (binary, `docker exec cat`) | Detection (TP rate vs Red's log) |
| Efficiency (`gamma^steps`) | Patch validity (re-run exploit, must fail) |
| Stealth (inversely coupled to Blue detection) | Availability (healthcheck fraction) |
| Anti-hallucination (-0.3 per fake flag) | False positive penalty (-0.2 per NPC flagged) |
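The reward terms in the table can be sketched as plain functions. Several details here are assumptions: the function names, the default `gamma=0.99`, the form `1 - detection` for the stealth coupling, and returning separate components instead of a weighted sum (the table gives no weights). Only the `gamma^steps` form and the -0.3 and -0.2 penalties come from the table.

```python
def blue_components(true_positives, red_actions, npc_flagged,
                    healthy_services, total_services, patch_blocks_exploit):
    """Blue's reward terms, each computed from observable state."""
    detection = true_positives / red_actions if red_actions else 0.0
    availability = healthy_services / total_services if total_services else 0.0
    return {
        "detection": detection,                        # TP rate vs Red's logged actions
        "patch_validity": 1.0 if patch_blocks_exploit else 0.0,
        "availability": availability,                  # fraction of passing healthchecks
        "false_positive": -0.2 * npc_flagged,          # per NPC wrongly flagged
    }


def red_components(flag_captured, steps, fake_flags, blue_detection_rate, gamma=0.99):
    """Red's reward terms; stealth is coupled to Blue's detection rate."""
    return {
        "flag": 1.0 if flag_captured else 0.0,         # binary capture
        "efficiency": gamma ** steps,                  # decays with episode length
        "stealth": 1.0 - blue_detection_rate,          # assumed form of the coupling
        "hallucination": -0.3 * fake_flags,            # per fabricated flag
    }


# Blue is scored first so Red's stealth term can read Blue's detection rate.
blue = blue_components(true_positives=3, red_actions=4, npc_flagged=1,
                       healthy_services=5, total_services=5,
                       patch_blocks_exploit=True)
red = red_components(flag_captured=True, steps=10, fake_flags=1,
                     blue_detection_rate=blue["detection"])
```

Scoring Blue before Red makes the coupling explicit: any improvement in Blue's detection rate lowers Red's stealth term in the same episode.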
**NPC Traffic**: Background noise and social engineering surface. Two levels:
- **Level 0** (shell scripts): `http_traffic.sh`, `db_traffic.sh`, `ssh_traffic.sh` generate benign traffic that Blue must filter from real attacks. Scripts discover targets dynamically (available pages, databases, tables), with no hardcoded endpoints.
- **Level 1** (LLM agents): Each NPC persona runs an autonomous workday via LiteLLM: browsing pages, sending emails, querying databases, accessing file shares. NPCs also react to incoming stimuli (phishing emails) based on their `security_awareness` profile.
All NPC actions are derived from the `SnapshotSpec` at runtime (pages, shares, tables, credentials, domain), so they generalize to any Builder-generated environment. NPC logs carry structured fields (`type`, `label`, `source`, `result`) that couple directly to Red/Blue reward signals.
Configure the NPC model via environment variable:
```bash
export OPENRANGE_NPC_MODEL="azure/gpt-5.2-codex" # or openai/gpt-4o, anthropic/claude-haiku-4-5-20251001, ollama/llama3
```
**Agents**: Structural protocol: any object with `reset(briefing, role)` and `act(observation) -> command` works. Ships with `LLMRangeAgent` (litellm, any provider), `ScriptedAgent`, and `HumanAgent`.
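The structural protocol maps naturally onto `typing.Protocol`. The sketch below assumes that briefings, observations, and commands are plain strings; the `RangeAgent` and `ReplayAgent` names are hypothetical and may differ from what the package defines.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class RangeAgent(Protocol):
    """Anything with these two methods can act as an agent (structural typing)."""

    def reset(self, briefing: str, role: str) -> None: ...

    def act(self, observation: str) -> str: ...


class ReplayAgent:
    """Hypothetical agent that replays a fixed command list, repeating the last one."""

    def __init__(self, commands: List[str]) -> None:
        if not commands:
            raise ValueError("need at least one command")
        self._commands = list(commands)
        self._cursor = 0

    def reset(self, briefing: str, role: str) -> None:
        self._cursor = 0  # start the script over for each episode

    def act(self, observation: str) -> str:
        command = self._commands[min(self._cursor, len(self._commands) - 1)]
        self._cursor += 1
        return command


agent = ReplayAgent(["nmap -sV web", "curl http://web/"])
agent.reset(briefing="Find the flag.", role="red")
first = agent.act("")
second = agent.act("80/tcp open http")
```

`ReplayAgent` never inherits from `RangeAgent`, yet `isinstance(agent, RangeAgent)` is true because `runtime_checkable` only checks that the methods exist.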
**Synthetic Data**: `open_range.training.synthetic` provides snapshot-grounded trajectory generation for SFT warm-start. It uses a fast simulated `RangeEnvironment`, optional LiteLLM teacher agents, per-episode flag randomization, and exports JSONL through `TrajectoryLogger`.
```python
from open_range.agents.episode import run_episode
from open_range.agents.llm_agent import LLMRangeAgent
from open_range.server.environment import RangeEnvironment

env = RangeEnvironment()
# Red and Blue can use different LiteLLM-backed models and providers.
red = LLMRangeAgent(model="anthropic/claude-sonnet-4-20250514")
blue = LLMRangeAgent(model="openai/gpt-4o")
# Run one Red-vs-Blue episode with max_steps set to 50.
result = run_episode(env, red, blue, max_steps=50)
```
## Tier System
Difficulty grows horizontally: each tier adds hosts, zones, and chained attack surface, not just harder passwords.
| Tier | Scale | Example |
|------|-------|---------|
| 1 | 6-8 hosts, 3-4 zones | Healthcare clinic: web + DB + mail + LDAP + SIEM |
| 2 | 10-12 hosts, 5-6 zones | Financial firm: + VPN, internal APIs, certificate authority |
| 3 | 14-18 hosts, 7-8 zones | SaaS company: + CI/CD, container registry, partner extranet |
## Server Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Liveness check |
| GET | `/metadata` | Environment name, version |
| POST | `/reset` | Start episode, returns initial observation |
| POST | `/step` | Execute action, returns observation + reward |
| GET | `/state` | Current episode state |
| WS | `/ws` | WebSocket session |
Built directly on the OpenEnv HTTP/WebSocket contract.
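For a quick smoke test without the OpenEnv client, the endpoints can be called with the standard library alone. This is a sketch under assumptions: `build_request` and `call` are hypothetical helpers, and the empty JSON body for `/reset` is a guess. The body that `/step` expects depends on the environment's action model, so it is not shown.

```python
import json
import urllib.request
from typing import Any, Optional


def build_request(base_url: str, method: str, path: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a request for one endpoint; JSON-encode the payload if there is one."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


health_request = build_request("http://localhost:8000", "GET", "/health")
reset_request = build_request("http://localhost:8000", "POST", "/reset", {})  # assumed body

# With a server running (`uv run server`), send them like this:
#   print(call(health_request))
#   print(call(reset_request))
```

Building the request separately from sending it keeps the example checkable without a running server.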
## Docs
- [Architecture](docs/architecture.md): full pipeline, network topology, episode lifecycle
- [Builder & Validator](docs/builder-validator.md): snapshot generation and admission
- [Red & Blue Agents](docs/red-blue-agents.md): tandem training, reward coupling, curriculum
- [Synthetic Data](docs/synthetic-data.md): snapshot-backed SFT trace generation with LiteLLM teachers
- [Agent Protocols](docs/agent-protocols.md): agent interface, episode runner, evaluation
- [OpenEnv Compliance](docs/openenv-compliance.md): API contract, models, deployment
## Built On
- [OpenEnv](https://github.com/meta-pytorch/OpenEnv): standardized agentic execution environments
- Ideas from [R2E-Gym](https://arxiv.org/abs/2504.07164) (hybrid verification), [Self-Play SWE-RL](https://arxiv.org/abs/2512.18552) (formal specs, inverse mutation), PAIRED/UED (constrained generation), POET (mutate + admit)
## License
Apache 2.0