# System Architecture

## Overview

WebScraper-OpenEnv is a modular, dashboard-first reinforcement learning (RL) environment for web scraping. It exposes extensible APIs, integrates MCP tools, and routes requests across multiple models.

## High-Level Topology

```text
Frontend Dashboard (React/Vite)
        |
        v
FastAPI Control Plane
  - episode lifecycle
  - action dispatch
  - reward engine
  - tool registry API
  - settings + policy
        |
        +--> Agent Runtime
        |      - planner/navigator/extractor/verifier
        |      - memory manager
        |      - model router
        |
        +--> MCP Gateway
        |      - tool discovery
        |      - lazy install/load
        |      - schema + timeout + retries
        |
        +--> Search Layer
        |      - provider routing
        |      - query optimization
        |      - credibility scoring
        |
        +--> Memory Layer
        |      - short/working/long/shared
        |      - vector index + persistent storage
        |
        +--> Observability
               - traces/logs/metrics/cost dashboard
```

## Core Subsystems

### 1. Control Plane

Responsibilities:

- reset/step/state APIs
- request validation
- action authorization and policy checks
- deterministic episode management
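
As a rough illustration of request validation and policy checks, the sketch below rejects unknown actions and blocked hosts before dispatch. The action names, the `ActionRequest` shape, and the `blocked_hosts` parameter are assumptions made for this example, not the project's actual API.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Hypothetical action names; the real action space is not specified here.
ALLOWED_ACTIONS = {"navigate", "extract", "search", "finish"}


class PolicyError(Exception):
    """Raised when a request fails validation or a policy check."""


@dataclass
class ActionRequest:
    episode_id: str
    action: str
    params: dict = field(default_factory=dict)


def authorize(request: ActionRequest, blocked_hosts: set) -> ActionRequest:
    # Request validation: reject unknown actions before dispatch.
    if request.action not in ALLOWED_ACTIONS:
        raise PolicyError(f"unknown action: {request.action}")
    # Policy check: refuse navigation to hosts on the block list.
    if request.action == "navigate":
        host = urlparse(request.params.get("url", "")).hostname or ""
        if host in blocked_hosts:
            raise PolicyError(f"host not permitted: {host}")
    return request
```

In a FastAPI deployment, a check like this would typically run inside the request handler or a dependency, ahead of action dispatch.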

### 2. Agent Runtime

Responsibilities:

- policy inference
- strategy execution
- fallback handling
- action explainability

### 3. Tooling Plane (MCP)

Responsibilities:

- dynamic tool registry
- server health checks
- lazy installation
- composition workflows
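
One way to picture the dynamic registry with lazy loading is shown below. This is a minimal sketch: `ToolRegistry` and its methods are hypothetical names, and a real MCP gateway would also handle installation, schemas, and health checks.

```python
class ToolRegistry:
    """Registers tool loaders up front and loads each tool on first use."""

    def __init__(self):
        self._loaders = {}   # name -> zero-argument callable that builds the tool
        self._loaded = {}    # name -> already-built tool

    def register(self, name, loader):
        self._loaders[name] = loader

    def discover(self):
        # Tool discovery: list known tools without loading any of them.
        return sorted(self._loaders)

    def is_loaded(self, name):
        return name in self._loaded

    def get(self, name):
        # Lazy load: build the tool only when it is first requested.
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]
```

Keeping discovery separate from loading means the dashboard can list available tools cheaply, and the cost of starting a tool is paid only when an agent actually calls it.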

### 4. Data Plane

Responsibilities:

- HTML ingestion and chunking
- extraction and normalization
- verification and reconciliation
- output persistence
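
The ingestion and chunking step could look roughly like the following, which splits HTML into text chunks at block-level tags using only the standard library. The choice of block tags and the whitespace normalization are assumptions for illustration; the project may chunk on different boundaries.

```python
from html.parser import HTMLParser


class BlockChunker(HTMLParser):
    """Collects text and emits one chunk per block-level element."""

    # Assumed set of tags that delimit chunks.
    BLOCK_TAGS = {"p", "div", "section", "article", "li", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._buffer = []

    def flush(self):
        # Normalize whitespace and drop empty chunks.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.chunks.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        self._buffer.append(data)


def chunk_html(html: str) -> list:
    parser = BlockChunker()
    parser.feed(html)
    parser.close()
    parser.flush()  # emit any trailing text outside a block tag
    return parser.chunks
```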

### 5. Analytics Plane

Responsibilities:

- reward component logging
- model/token/cost accounting
- tool usage telemetry
- memory quality analytics
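
Model, token, and cost accounting can be sketched as a small ledger. The class name, the per-1k-token pricing unit, and the model names in the usage note are placeholders; actual rates depend on the provider and are supplied by the caller.

```python
from collections import defaultdict


class UsageLedger:
    """Accumulates token counts per model and derives cost from caller-supplied rates."""

    def __init__(self, price_per_1k_tokens: dict):
        self.prices = dict(price_per_1k_tokens)
        self.tokens = defaultdict(int)

    def record(self, model: str, tokens: int) -> None:
        self.tokens[model] += tokens

    def cost(self, model: str) -> float:
        return self.tokens[model] / 1000 * self.prices[model]

    def total_cost(self) -> float:
        return sum(self.cost(model) for model in list(self.tokens))
```

For example, `UsageLedger({"model-a": 0.5})` followed by `record("model-a", 2000)` yields a cost of `1.0` in whatever currency unit the rates use.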

## Processing Pipeline

1. `reset(task_id, seed)`
2. observation emitted
3. policy selects action
4. action executes (native/MCP/search/memory)
5. reward computed and logged
6. done check
7. repeat until terminal
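
The loop above can be sketched with a toy environment. Only `reset(task_id, seed)` comes from the pipeline description; `ToyEnv`, `StepResult`, the `finish` action, and the random reward are stand-ins used to show how seeding keeps an episode reproducible.

```python
import random
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: dict
    reward: float
    done: bool


class ToyEnv:
    """Stand-in environment: rewards come from a seeded RNG."""

    def __init__(self, max_steps: int = 5):
        self.max_steps = max_steps

    def reset(self, task_id: str, seed: int) -> dict:
        self.task_id = task_id
        self._rng = random.Random(seed)  # per-episode RNG keeps runs reproducible
        self._steps = 0
        return {"task_id": task_id, "step": 0}

    def step(self, action: str) -> StepResult:
        self._steps += 1
        reward = self._rng.random()
        done = action == "finish" or self._steps >= self.max_steps
        return StepResult({"task_id": self.task_id, "step": self._steps}, reward, done)


def run_episode(env, policy, task_id: str, seed: int) -> list:
    observation = env.reset(task_id, seed)        # steps 1-2: reset, first observation
    rewards, done = [], False
    while not done:                               # step 7: repeat until terminal
        action = policy(observation)              # step 3: policy selects action
        result = env.step(action)                 # step 4: action executes
        rewards.append(result.reward)             # step 5: reward computed and logged
        observation, done = result.observation, result.done  # step 6: done check
    return rewards
```

Running the same policy twice with the same `task_id` and `seed` produces the same reward sequence, which is the property that deterministic episode management relies on.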

## Batch and Parallel Design

### Batch

- large HTML split into semantic chunks
- chunk extraction batched with bounded size
- results merged, deduplicated, and ranked by confidence
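
A minimal sketch of the batch path follows, assuming chunks have already been produced. The record shape (`field`, `value`, `confidence`) is an assumption for the example, and duplicates are resolved by keeping the highest-confidence record.

```python
from itertools import islice


def batched(items, size: int):
    """Yield lists of at most `size` items, preserving order."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def merge_results(records: list) -> list:
    """Deduplicate on (field, value), keep the best confidence, rank descending."""
    best = {}
    for record in records:
        key = (record["field"], record["value"])
        if key not in best or record["confidence"] > best[key]["confidence"]:
            best[key] = record
    return sorted(best.values(), key=lambda r: r["confidence"], reverse=True)
```

Bounding the batch size keeps each extraction call within a predictable token and memory budget, at the cost of a merge pass afterwards.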

### Parallel

- independent chunk tasks run concurrently
- search and verification can run in parallel branches
- configurable worker limits and queue priorities
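
Concurrent chunk processing with a worker limit might be sketched with `asyncio` as below. `extract_chunk` is a placeholder for the real extraction call, and `max_workers` stands in for the configurable limit; queue priorities are not shown here.

```python
import asyncio


async def extract_chunk(chunk: str) -> int:
    # Placeholder for real extraction work (for example, a model or tool call).
    await asyncio.sleep(0)
    return len(chunk.split())


async def run_bounded(chunks: list, max_workers: int) -> list:
    semaphore = asyncio.Semaphore(max_workers)

    async def worker(chunk: str) -> int:
        # At most `max_workers` chunks are processed at any one time.
        async with semaphore:
            return await extract_chunk(chunk)

    # gather() returns results in input order, regardless of completion order.
    return await asyncio.gather(*(worker(chunk) for chunk in chunks))
```

Because the chunk tasks are independent, result order can be restored from input order alone, with no coordination between workers.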

## Queue and Scheduler

Task queue supports:

- priority classes (`high`, `normal`, `low`)
- cancellation tokens
- retry policy with backoff
- dead-letter queue for repeated failures
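
These queue features can be illustrated with an in-memory sketch built on `heapq`. The class and method names, the default attempt limit, and the exponential backoff formula are assumptions; a production deployment would more likely use a dedicated queue backend.

```python
import heapq
import itertools

PRIORITY = {"high": 0, "normal": 1, "low": 2}


class TaskQueue:
    def __init__(self, max_attempts: int = 3, base_delay: float = 0.5):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order within a class
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.cancelled = set()
        self.dead_letter = []

    def submit(self, task_id: str, priority: str = "normal", attempts: int = 0) -> None:
        entry = (PRIORITY[priority], next(self._order), task_id, priority, attempts)
        heapq.heappush(self._heap, entry)

    def cancel(self, task_id: str) -> None:
        # Cancellation is lazy: the task is skipped when it reaches the front.
        self.cancelled.add(task_id)

    def pop(self):
        while self._heap:
            _, _, task_id, priority, attempts = heapq.heappop(self._heap)
            if task_id not in self.cancelled:
                return task_id, priority, attempts
        return None

    def fail(self, task_id: str, priority: str, attempts: int):
        """Requeue a failed task and return its backoff delay, or dead-letter it."""
        attempts += 1
        if attempts >= self.max_attempts:
            self.dead_letter.append(task_id)
            return None
        self.submit(task_id, priority, attempts)
        return self.base_delay * 2 ** (attempts - 1)
```

The caller is expected to wait for the returned delay before processing the retried task; this sketch does not sleep or schedule on its own.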

## Storage Architecture

- Episode state: in-memory + optional persistence
- Long-term memory: vector DB + metadata store
- Logs/metrics: append-only time-series-friendly sink
- Exports: JSON/CSV trace packs

## Reliability

- per-tool timeout and retry
- per-step safety budget
- circuit breaker for failing providers
- deterministic fallback chains
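
A circuit breaker combined with an ordered fallback chain could look like this. The thresholds, the half-open behaviour after `reset_after` seconds, and the function names are assumptions chosen for the sketch; per-tool timeouts are left out.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # After the cool-down, let one trial call through (half-open state).
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def call_with_fallback(providers: list, breakers: dict, query: str):
    """Try providers in their fixed list order, skipping any with an open breaker."""
    for name, call in providers:
        breaker = breakers[name]
        if not breaker.allow():
            continue
        try:
            result = call(query)
        except Exception:
            breaker.record_failure()
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError("all providers failed or are unavailable")
```

The chain is deterministic because providers are tried in a fixed order, so a given set of breaker states always selects the same provider.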

## Security

- API key vaulting via env/config secrets
- MCP allowlist
- output sanitization
- redaction of sensitive tokens in logs
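
Log redaction can be approximated with a few regular expressions applied before a message is written. The patterns below are illustrative only and would need tuning for the key formats actually in use.

```python
import re

# Illustrative patterns; not an exhaustive list of secret formats.
_SECRET_PATTERNS = [
    re.compile(r"(?i)(authorization:\s*bearer\s+)\S+"),
    re.compile(r"(?i)((?:api[_-]?key|token|secret)\s*[=:]\s*)\S+"),
]


def redact(message: str) -> str:
    """Replace secret values with a marker while keeping the key name visible."""
    for pattern in _SECRET_PATTERNS:
        message = pattern.sub(r"\1[REDACTED]", message)
    return message
```

A function like this can be wired into Python's `logging` module through a `logging.Filter` that rewrites each record's message before it reaches a handler.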

## Deployment

Single-container baseline:

- frontend static build served by API backend
- optional sidecars for DB/vector/MCP infra

Scale-out profile:

- separate API and worker pools
- managed vector DB
- queue-backed distributed execution
- central observability backend

## Compatibility Goals

- local dev mode with minimal dependencies
- cloud mode with managed infra
- optional self-hosted LLM endpoints

## Future Architecture Extensions

- distributed multi-agent graph execution
- adaptive autoscaling by queue pressure
- global memory federation across projects