# 📄 ARCHITECTURE.md: System Design & Component Architecture
---
## 🧠 Overview
This document defines the **system architecture** for the AI Community Moderation Environment.
The system is designed to:
* comply with the **OpenEnv specification**
* support **multi-step agent interaction**
* provide **deterministic evaluation**
* be easily deployable via **FastAPI + Docker**
---
## 🏗️ High-Level Architecture
### 🔹 Updated: Agent layer added
```text
┌─────────────────────────────────────────────────────────────────┐
│                           Agent Layer                           │
│                                                                 │
│   ┌──────────────────────┐    ┌──────────────────────────────┐  │
│   │  Gemini 2.5 Flash    │    │  Rule-based Baseline Agent   │  │
│   │  (agent/gemini_      │    │  (baseline/agent.py)         │  │
│   │   agent.py)          │    │                              │  │
│   └──────────┬───────────┘    └─────────────┬────────────────┘  │
│              │ /reset /step /agent/run      │ /baseline         │
└──────────────┼──────────────────────────────┼───────────────────┘
               │                              │
               ▼                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    FastAPI API (api/app.py)                     │
│  /reset  /step  /state  /tasks  /grader  /baseline  /agent/run  │
└────────────────────────────┬────────────────────────────────────┘
                             │
        ┌────────────────────▼────────────────────┐
        │         Environment Core (env/)         │
        └───────┬─────────────────┬───────────────┘
                │                 │
      ┌─────────▼──────┐  ┌───────▼───────┐
      │ State Manager  │  │ Reward Engine │
      └─────────┬──────┘  └───────────────┘
                │
      ┌─────────▼──────────┐
      │   Policy Engine    │
      └─────────┬──────────┘
                │
      ┌─────────▼──────────┐
      │   Data Generator   │
      └────────────────────┘

      ┌─────────────────────────────────────┐
      │            Grader Engine            │
      └─────────────────────────────────────┘
```
---
## 📦 Core Modules
---
## 1. 🧠 Environment Core (`env/`)
### Responsibility
Implements the OpenEnv interface:
* `reset()`
* `step(action)`
* `state()`
### Key File
```text
env/moderation_env.py
```
### Responsibilities
* orchestrates the entire flow
* maintains episode lifecycle
* integrates all sub-components
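A minimal sketch of how the three interface methods might fit together. The class name `ModerationEnv`, the `max_steps` limit, the dictionary shapes, and the placeholder reward are all assumptions made for illustration; they are not taken from `env/moderation_env.py`.

```python
from typing import Any


class ModerationEnv:
    """Hypothetical skeleton exposing reset/step/state."""

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self._state: dict[str, Any] = {}
        self._steps = 0

    def reset(self, seed: int = 0) -> dict[str, Any]:
        # Start a new episode; a real environment would call the data generator here.
        self._steps = 0
        self._state = {"seed": seed, "post": "example post", "history": []}
        return self.state()

    def step(self, action: str) -> dict[str, Any]:
        # Record the action, then report the new state, a reward, and a done flag.
        self._steps += 1
        self._state["history"].append(action)
        done = self._steps >= self.max_steps
        return {"state": self.state(), "reward": 0.0, "done": done}

    def state(self) -> dict[str, Any]:
        # Return a copy so callers cannot mutate the episode in place.
        return {**self._state, "history": list(self._state.get("history", []))}
```

Returning a copy from `state()` is one way to keep the episode lifecycle under the environment's control.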
---
## 2. 📊 State Manager (`env/state_manager.py`)
### Responsibility
Handles:
* current state representation
* updates after each action
### State Includes
* post content
* user history
* reports
* geo
* context
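One way to represent these fields is a small dataclass. The field names, their types (for example, `reports` as a count), and the `actions_taken` list are assumptions for this sketch; the actual representation in `env/state_manager.py` may differ.

```python
from dataclasses import dataclass, field, replace


@dataclass
class ModerationState:
    """Hypothetical container for the state fields listed above."""

    post_content: str
    user_history: list[str] = field(default_factory=list)
    reports: int = 0
    geo: str = "unknown"
    context: dict[str, str] = field(default_factory=dict)
    actions_taken: list[str] = field(default_factory=list)

    def apply(self, action: str) -> "ModerationState":
        # Return an updated copy instead of mutating the current state.
        return replace(self, actions_taken=[*self.actions_taken, action])
```

For example, `ModerationState(post_content="hello").apply("escalate")` yields a new state whose `actions_taken` contains the single action, leaving the original untouched.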
---
## 3. 🧾 Policy Engine (`env/policy_engine.py`)
### Responsibility
* implements rules from `POLICY.md`
* computes:
  * violation type
  * severity
  * expected action
### Key Function
```python
def evaluate_policy(state) -> dict:
    # Placeholder values; the actual rules are defined in POLICY.md.
    return {
        "violation_type": ...,
        "severity": ...,
        "expected_action": ...,
    }
```
---
## 4. 🧬 Data Generator (`env/data_generator.py`)
### Responsibility
* generates synthetic moderation scenarios
* ensures deterministic outputs
### Features
* template-based post generation
* context simulation (history, reports, geo)
* seed-controlled reproducibility
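Seed-controlled, template-based generation can be sketched as follows. The templates, fill-in values, and the `generate_scenario` name are invented for illustration and are not the project's actual data.

```python
import random

# Hypothetical templates and fill-in values; the real ones are not shown in this document.
TEMPLATES = ["Buy {item} now at {site}", "I really dislike {item}", "Check out my {item}"]
ITEMS = ["watches", "this game", "new blog"]
SITES = ["example.com", "example.org"]
GEOS = ["US", "IN", "DE"]


def generate_scenario(seed: int) -> dict:
    # A local Random instance keeps generation independent of global RNG state.
    rng = random.Random(seed)
    template = rng.choice(TEMPLATES)
    return {
        # str.format ignores keyword arguments the template does not use.
        "post": template.format(item=rng.choice(ITEMS), site=rng.choice(SITES)),
        "reports": rng.randint(0, 5),
        "geo": rng.choice(GEOS),
    }
```

Because every random draw goes through the seeded `rng`, calling `generate_scenario` twice with the same seed returns the same scenario.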
---
## 5. 🏆 Reward Engine (`env/reward_engine.py`)
### Responsibility
* computes step-wise rewards
* aligns with `REWARD.md`
### Input
* current state
* action
* ground truth
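As an illustration of a step-wise reward, the sketch below scores an action against the ground truth produced by the policy engine. The numeric values and the special handling of `escalate` are assumptions; `REWARD.md` remains the authoritative definition. The state argument is omitted for brevity.

```python
def compute_reward(action: str, ground_truth: dict) -> float:
    """Hypothetical scoring: full credit for a match, severity-scaled penalty otherwise."""
    if action == ground_truth["expected_action"]:
        return 1.0
    if action == "escalate":
        # Escalating is treated here as a safe but suboptimal choice.
        return 0.0
    # A wrong decision on a more severe violation costs more.
    return -0.5 * ground_truth["severity"]
```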
---
## 6. 🧪 Grader Engine (`graders/`)
### Responsibility
* computes final episode score
* aligns with `GRADERS.md`
### Key File
```text
graders/grader.py
```
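A final episode score could be as simple as the fraction of steps where the agent chose the expected action. This is only a sketch: the trajectory shape and the accuracy metric are assumptions, and the real scoring rules live in `GRADERS.md`.

```python
def grade_episode(trajectory: list[dict]) -> float:
    """Return the fraction of steps whose action matched the expected action."""
    if not trajectory:
        # An empty episode earns no credit in this sketch.
        return 0.0
    correct = sum(1 for step in trajectory if step["action"] == step["expected_action"])
    return correct / len(trajectory)
```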
---
## 7. 🌐 API Layer (`api/`)
### Framework
* FastAPI
### Responsibilities
Expose endpoints:
| Endpoint | Function |
| ------------- | -------------------------------- |
| `/reset` | start new episode |
| `/step` | take action |
| `/state` | current state |
| `/tasks` | list tasks |
| `/grader` | compute score |
| `/baseline` | run rule-based baseline agent |
| `/agent/run`  | run Gemini 2.5 Flash agent 🔹    |
| `/health` | liveness check |
---
## 8. 🤖 Baseline Agent (`baseline/`)
### Responsibility
* runs a simple rule-based heuristic agent
* produces a reproducible benchmark without any LLM dependency
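A rule-based heuristic of this kind might look like the sketch below. The blocklist, the report threshold, and the `approve` and `remove` action names are hypothetical; only `escalate` is named elsewhere in this document.

```python
# Hypothetical keyword list and threshold, chosen only for illustration.
BLOCKLIST = ("spam", "scam", "free money")
REPORT_THRESHOLD = 3


def baseline_action(state: dict) -> str:
    """Pick an action from fixed rules, with no model call involved."""
    text = state.get("post", "").lower()
    if any(term in text for term in BLOCKLIST):
        return "remove"
    if state.get("reports", 0) >= REPORT_THRESHOLD:
        # Heavily reported but not obviously violating: hand off to a human.
        return "escalate"
    return "approve"
```

Since the rules are fixed and involve no randomness, the same state always yields the same action, which is what makes the benchmark reproducible.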
---
## 9. 🧠 Gemini Agent (`agent/`) 🔹 Added
### Responsibility
* LLM-driven agent using **Gemini 2.5 Flash** (google-genai SDK)
* interacts with the environment via the same `/reset` and `/step` API
* uses multi-turn chat with a structured system prompt
* parses JSON action responses; falls back to `escalate` on parse failure
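The parse-with-fallback behaviour can be sketched without any SDK calls. The function name `parse_action`, the `"action"` key, and the fence-stripping step are assumptions for this example, not the code in `agent/gemini_agent.py`.

```python
import json

FALLBACK_ACTION = {"action": "escalate"}
FENCE = "`" * 3  # Markdown code-fence marker


def parse_action(raw: str) -> dict:
    """Parse a model reply into an action dict, falling back to escalate."""
    text = raw.strip()
    # Replies are sometimes wrapped in a Markdown code fence; unwrap it if present.
    if text.startswith(FENCE):
        text = text.strip("`").removeprefix("json").strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return dict(FALLBACK_ACTION)
    # Valid JSON that is not an action object is also treated as a failure.
    if not isinstance(parsed, dict) or "action" not in parsed:
        return dict(FALLBACK_ACTION)
    return parsed
```

Falling back to `escalate` means a malformed reply defers the decision rather than guessing an outcome.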
### Key Files
```
agent/gemini_agent.py  # GeminiAgent class
agent/prompts.py       # SYSTEM_PROMPT + build_turn_prompt()
```
### Design Constraint
The LLM is **only the decision-making layer**. Policy evaluation, reward
computation, and grading remain fully deterministic in the environment.
---
## 🔁 Data Flow
### 1. Reset
```text
API → Environment.reset()
    → DataGenerator
    → StateManager
    → Initial State returned
```
---
### 2. Step
```text
Agent Action → API
    → Environment.step()
    → StateManager update
    → PolicyEngine evaluate
    → RewardEngine compute
    → New State returned
```
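The ordering of the step flow can be expressed as a small function that takes the three stages as callables. `run_step` and the callable signatures are hypothetical; the sketch only shows that the state is updated first, the policy is evaluated on the updated state, and the reward is computed last.

```python
from typing import Callable


def run_step(
    state: dict,
    action: str,
    update_state: Callable[[dict, str], dict],
    evaluate_policy: Callable[[dict], dict],
    compute_reward: Callable[[dict, str, dict], float],
) -> tuple[dict, float]:
    # 1. The state manager applies the action.
    new_state = update_state(state, action)
    # 2. The policy engine derives the ground truth for the updated state.
    verdict = evaluate_policy(new_state)
    # 3. The reward engine scores the action against that ground truth.
    reward = compute_reward(new_state, action, verdict)
    return new_state, reward
```

Passing the stages in as arguments keeps each one replaceable in tests.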
---
### 3. Grading
```text
Episode complete → GraderEngine
    → Score computed
    → Returned via API
```
---
## 🧠 Internal Interaction Flow
```text
Action
   ↓
State Update
   ↓
Policy Evaluation
   ↓
Reward Calculation
   ↓
Next State
```
---
## 🧩 Component Dependencies
| Component | Depends On |
| ------------- | --------------------- |
| Environment | State, Policy, Reward |
| State Manager | Data Generator |
| Reward Engine | Policy Engine |
| Grader | Policy + Trajectory |
| API | Environment |
---
## ⚙️ Execution Model
* single-threaded environment (MVP)
* API handlers keep no state of their own; episode sessions are tracked in memory, so they do not survive a server restart
* reproducible seeds for all scenarios
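In-memory session tracking can be sketched as a dictionary keyed by a generated session id. The `SessionStore` class and the fields stored per session are assumptions for illustration.

```python
import uuid


class SessionStore:
    """Hypothetical in-memory registry mapping session ids to episode data."""

    def __init__(self) -> None:
        self._sessions: dict[str, dict] = {}

    def create(self, seed: int) -> str:
        # Each episode gets a random id that the client sends back on later calls.
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {"seed": seed, "steps": 0}
        return session_id

    def get(self, session_id: str) -> dict:
        # Raises KeyError for unknown ids; an API layer could map that to a 404.
        return self._sessions[session_id]
```

Because the store is a plain dictionary inside the process, it suits the single-threaded MVP, and its contents are lost when the process exits.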
---
## 🐳 Deployment Architecture
```text
Docker Container
├── FastAPI Server
├── Environment Core
├── Graders
└── Baseline Agent
```
Runs on:
* Hugging Face Spaces
* local Docker
---
## 🧠 Design Principles
### 1. Modularity
Each component is isolated and testable
---
### 2. Determinism
All environment outputs are reproducible for a given seed.
---
### 3. Simplicity
Avoid unnecessary abstraction
---
### 4. Spec Compliance
Strict adherence to OpenEnv interface
---
## 🧠 One-Line Summary
> A modular, deterministic system where an environment core orchestrates policy evaluation, reward computation, and grading through well-defined components exposed via a FastAPI interface.
---