# System Architecture
## Overview
WebScraper-OpenEnv is designed as a modular, dashboard-first RL environment with extensible APIs, MCP tools, and multi-model routing.
## High-Level Topology
```text
Frontend Dashboard (React/Vite)
        |
        v
FastAPI Control Plane
  - episode lifecycle
  - action dispatch
  - reward engine
  - tool registry API
  - settings + policy
        |
        +--> Agent Runtime
        |      - planner/navigator/extractor/verifier
        |      - memory manager
        |      - model router
        |
        +--> MCP Gateway
        |      - tool discovery
        |      - lazy install/load
        |      - schema + timeout + retries
        |
        +--> Search Layer
        |      - provider routing
        |      - query optimization
        |      - credibility scoring
        |
        +--> Memory Layer
        |      - short/working/long/shared
        |      - vector index + persistent storage
        |
        +--> Observability
               - traces/logs/metrics/cost dashboard
```
## Core Subsystems
### 1. Control Plane
Responsibilities:
- reset/step/state APIs
- request validation
- action authorization and policy checks
- deterministic episode management
### 2. Agent Runtime
Responsibilities:
- policy inference
- strategy execution
- fallback handling
- action explainability
### 3. Tooling Plane (MCP)
Responsibilities:
- dynamic tool registry
- server health checks
- lazy installation
- composition workflows
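
A dynamic registry with lazy loading and an allowlist might look roughly like the sketch below. It does not use any MCP SDK: `ToolRegistry`, the `"module:attribute"` target format, and the example tool name are all hypothetical, chosen only to show that a tool is imported on first use rather than at registration time.

```python
import importlib
from typing import Callable, Iterable


class ToolRegistry:
    """Hypothetical registry: tools are registered by name, imported on first use."""

    def __init__(self, allowlist: Iterable[str]):
        self._allowlist = set(allowlist)
        self._targets: dict[str, str] = {}      # name -> "module:attribute"
        self._loaded: dict[str, Callable] = {}  # cache of resolved callables

    def register(self, name: str, target: str) -> None:
        # Reject anything that is not explicitly allowed.
        if name not in self._allowlist:
            raise PermissionError(f"tool {name!r} is not on the allowlist")
        self._targets[name] = target

    def is_loaded(self, name: str) -> bool:
        return name in self._loaded

    def get(self, name: str) -> Callable:
        # Lazy load: resolve the import only when the tool is first requested.
        if name not in self._loaded:
            module_name, attribute = self._targets[name].split(":")
            module = importlib.import_module(module_name)
            self._loaded[name] = getattr(module, attribute)
        return self._loaded[name]
```

For example, after `registry.register("to_json", "json:dumps")` nothing is resolved until `registry.get("to_json")` is called.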
### 4. Data Plane
Responsibilities:
- HTML ingestion and chunking
- extraction and normalization
- verification and reconciliation
- output persistence
### 5. Analytics Plane
Responsibilities:
- reward component logging
- model/token/cost accounting
- tool usage telemetry
- memory quality analytics
## Processing Pipeline
1. `reset(task_id, seed)`
2. observation emitted
3. policy selects action
4. action executes (native/MCP/search/memory)
5. reward computed and logged
6. done check
7. repeat until terminal
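
The loop above can be sketched as follows. This is an illustration under assumptions, not the project's actual API: `ToyEnv`, `StepResult`, `simple_policy`, the action names, and the reward values are hypothetical stand-ins. Only the `reset(task_id, seed)` signature comes from the pipeline description.

```python
import random
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: dict
    reward: float
    done: bool


class ToyEnv:
    """Hypothetical stand-in environment, not the real WebScraper-OpenEnv API."""

    def __init__(self, max_steps: int = 5):
        self.max_steps = max_steps
        self.steps = 0
        self.rng = random.Random()

    def reset(self, task_id: str, seed: int) -> dict:
        # Seeding makes the episode reproducible for a given seed.
        self.rng = random.Random(seed)
        self.steps = 0
        return {"task_id": task_id, "step": 0, "noise": self.rng.random()}

    def step(self, action: str) -> StepResult:
        self.steps += 1
        reward = 1.0 if action == "extract" else 0.0
        done = action == "extract" or self.steps >= self.max_steps
        observation = {"step": self.steps, "noise": self.rng.random()}
        return StepResult(observation, reward, done)


def simple_policy(observation: dict) -> str:
    # Placeholder policy: navigate twice, then extract.
    return "extract" if observation["step"] >= 2 else "navigate"


def run_episode(env: ToyEnv, task_id: str, seed: int) -> list[tuple[str, float]]:
    trace = []
    observation = env.reset(task_id, seed)       # steps 1-2: reset, first observation
    done = False
    while not done:                              # step 7: repeat until terminal
        action = simple_policy(observation)      # step 3: policy selects action
        result = env.step(action)                # step 4: action executes
        trace.append((action, result.reward))    # step 5: reward logged
        observation, done = result.observation, result.done  # step 6: done check
    return trace
```

Because all randomness flows through the seeded generator, two runs with the same `seed` produce identical observations, which is one way to get the deterministic episode management listed under the Control Plane.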
## Batch and Parallel Design
### Batch
- large HTML split into semantic chunks
- chunk extraction batched with bounded size
- merge + dedupe + confidence rank
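
A minimal sketch of the batch path, assuming extraction yields `(field, value, confidence)` tuples. The `Record` shape, the function names, and the default batch size are illustrative assumptions, and the `extract` callable stands in for whatever extractor the runtime actually uses.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

Record = tuple[str, str, float]  # (field, value, confidence), an assumed shape


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield lists of at most `size` items, preserving order."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def merge_records(records: Iterable[Record]) -> list[Record]:
    """Dedupe on (field, value), keep the highest confidence, rank descending."""
    best: dict[tuple[str, str], float] = {}
    for field, value, confidence in records:
        key = (field, value)
        best[key] = max(confidence, best.get(key, 0.0))
    merged = [(field, value, conf) for (field, value), conf in best.items()]
    return sorted(merged, key=lambda record: record[2], reverse=True)


def extract_in_batches(
    chunks: Iterable[str],
    extract: Callable[[str], list[Record]],
    batch_size: int = 2,
) -> list[Record]:
    """Run `extract` over bounded batches of chunks, then merge the results."""
    results: list[Record] = []
    for batch in batched(chunks, batch_size):
        for chunk in batch:
            results.extend(extract(chunk))
    return merge_records(results)
```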
### Parallel
- independent chunk tasks run concurrently
- search and verification can run in parallel branches
- configurable worker limits and queue priorities
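
One way to bound concurrency for independent chunk tasks is an `asyncio.Semaphore`, sketched below. `process_chunk` is a hypothetical placeholder for real extraction work, and `max_workers` is an assumed name for the configurable worker limit.

```python
import asyncio


async def process_chunk(chunk: str) -> int:
    # Placeholder for I/O-bound extraction work on one chunk.
    await asyncio.sleep(0.01)
    return len(chunk)


async def run_with_limit(chunks: list[str], max_workers: int = 3) -> list[int]:
    """Process chunks concurrently, with at most `max_workers` in flight."""
    semaphore = asyncio.Semaphore(max_workers)

    async def guarded(chunk: str) -> int:
        async with semaphore:
            return await process_chunk(chunk)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(chunk) for chunk in chunks))
```

Calling `asyncio.run(run_with_limit(chunks, max_workers=2))` runs the whole batch. The same pattern extends to running search and verification as separate branches under one `gather`.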
## Queue and Scheduler
Task queue supports:
- priority classes (`high`, `normal`, `low`)
- cancellation tokens
- retry policy with backoff
- dead-letter queue for repeated failures
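
The four features above can be combined in a small in-process queue. The sketch below is an assumption-laden illustration: `Task`, `TaskQueue`, the exponential backoff formula, and the default limits are hypothetical, and a production system would more likely sit on a real broker.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Callable

PRIORITY = {"high": 0, "normal": 1, "low": 2}


@dataclass
class Task:
    name: str
    run: Callable[[], object]
    priority: str = "normal"
    attempts: int = 0
    cancelled: bool = False  # acts as a simple cancellation token


class TaskQueue:
    def __init__(self, max_attempts: int = 3, base_delay: float = 0.5):
        self._heap: list[tuple[int, int, Task]] = []
        self._counter = itertools.count()  # FIFO tie-breaker within a priority class
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.dead_letter: list[Task] = []

    def submit(self, task: Task) -> None:
        entry = (PRIORITY[task.priority], next(self._counter), task)
        heapq.heappush(self._heap, entry)

    def backoff(self, attempts: int) -> float:
        """Exponential backoff: base_delay * 2 ** (attempts - 1)."""
        return self.base_delay * 2 ** (attempts - 1)

    def run_all(self, sleep: Callable[[float], None] = lambda seconds: None) -> list[str]:
        """Drain the queue and return the names of completed tasks, in order."""
        completed = []
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            if task.cancelled:
                continue  # cancelled tasks are dropped without running
            try:
                task.run()
                completed.append(task.name)
            except Exception:
                task.attempts += 1
                if task.attempts >= self.max_attempts:
                    self.dead_letter.append(task)  # give up after repeated failures
                else:
                    sleep(self.backoff(task.attempts))
                    self.submit(task)
        return completed
```

Passing `time.sleep` as the `sleep` argument makes the backoff real. The default no-op keeps the sketch fast to exercise.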
## Storage Architecture
- Episode state: in-memory + optional persistence
- Long-term memory: vector DB + metadata store
- Logs/metrics: append-only time-series-friendly sink
- Exports: JSON/CSV trace packs
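
As a rough illustration of the export path, the sketch below serializes a list of step records to both JSON and CSV using only the standard library. The record fields (`step`, `action`, `reward`) and the `export_trace` name are assumptions. The document does not specify the trace pack schema.

```python
import csv
import io
import json


def export_trace(steps: list[dict]) -> tuple[str, str]:
    """Return (json_text, csv_text) for a list of per-step records."""
    json_text = json.dumps(steps, indent=2, sort_keys=True)

    # Use the union of keys so records with missing fields still export.
    fieldnames = sorted({key for step in steps for key in step})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(steps)
    return json_text, buffer.getvalue()
```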
## Reliability
- per-tool timeout and retry
- per-step safety budget
- circuit breaker for failing providers
- deterministic fallback chains
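
A circuit breaker combined with a fixed-order fallback chain could be sketched as below. The class name, thresholds, and provider tuple layout are assumptions for illustration. The fallback is deterministic in the sense that providers are always tried in the same declared order.

```python
import time
from typing import Callable, Optional, Sequence


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # closed: calls pass through
        # Open: block until the cool-down has elapsed, then allow trial calls.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # (re)open and restart the cool-down


Provider = tuple[str, Callable[[str], str], CircuitBreaker]


def call_with_fallback(providers: Sequence[Provider], query: str) -> tuple[str, str]:
    """Try providers in declared order, skipping any whose breaker is open."""
    for name, call, breaker in providers:
        if not breaker.allow():
            continue
        try:
            result = call(query)
        except Exception:
            breaker.record_failure()
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError("all providers failed or are unavailable")
```

Injecting `clock` keeps the breaker testable without real waiting. A per-tool timeout would wrap the `call(query)` line.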
## Security
- API key vaulting via env/config secrets
- MCP allowlist
- output sanitization
- redaction of sensitive tokens in logs
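
Log redaction can be approximated with a `logging.Filter` that rewrites messages before they reach any handler. The regular expressions below are illustrative assumptions and will not catch every secret format. A real deployment would tune them to the key formats it actually handles.

```python
import logging
import re

# Assumed patterns: "api_key=...", "token: ...", "secret=...", and "Bearer <token>".
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|token|secret)(\s*[=:]\s*)(\S+)"),
    re.compile(r"(?i)(bearer)(\s+)([A-Za-z0-9._\-]+)"),
]


def redact(text: str) -> str:
    """Replace the secret value in each match, keeping the label and separator."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(r"\1\2[REDACTED]", text)
    return text


class RedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so secrets passed via args are also covered.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted
```

Attaching the filter with `logger.addFilter(RedactingFilter())` applies it to every record emitted through that logger.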
## Deployment
Single-container baseline:
- frontend static build served by API backend
- optional sidecars for DB/vector/MCP infra

Scale-out profile:
- separate API and worker pools
- managed vector DB
- queue-backed distributed execution
- central observability backend
## Compatibility Goals
- local dev mode with minimal dependencies
- cloud mode with managed infra
- optional self-hosted LLM endpoints
## Future Architecture Extensions
- distributed multi-agent graph execution
- adaptive autoscaling by queue pressure
- global memory federation across projects