ThejasRao's picture
Initial OpenENV hackathon submission
c492c3f
# πŸ“„ TECH_STACK.md β€” Technology Choices & Justification
---
## 🧠 Overview
This document defines the **technology stack** used to build the AI Community Moderation Environment.
The stack is chosen to:
* ensure **OpenEnv compliance**
* enable **fast development**
* support **deterministic execution**
* allow seamless **Docker + Hugging Face deployment**
---
## 🧱 Core Language
### 🐍 Python 3.10+
**Why:**
* native ecosystem for OpenEnv
* strong support for RL + APIs
* rapid prototyping and debugging
---
## 🌐 Backend Framework
### ⚑ FastAPI
**Why:**
* lightweight and high-performance
* built-in request validation (via Pydantic)
* async support
* ideal for exposing `/reset`, `/step`, `/grader`, etc.
* easily deployable on Hugging Face Spaces
---
## 🧾 Data Validation & Models
### πŸ“¦ Pydantic
**Why:**
* strict typing for OpenEnv compliance
* automatic validation of API inputs/outputs
* seamless integration with FastAPI
* ensures deterministic schema handling
---
## 🧠 Environment Logic
### 🧩 Custom Python Modules
* `env/` β†’ environment core
* `policy_engine.py` β†’ rule evaluation
* `reward_engine.py` β†’ step rewards
* `data_generator.py` β†’ synthetic data
**Why:**
* full control over logic
* deterministic execution
* no external dependencies
---
## πŸ€– Baseline Agent
### πŸͺΆ Rule-based Agent (`baseline/agent.py`)
Heuristic agent using keyword matching β€” no LLM, fully reproducible.
---
## 🧠 LLM Agent β€” πŸ”Ή Added
### Gemini 2.5 Flash via `google-genai` SDK
**Why:**
* state-of-the-art reasoning on structured tasks
* multi-turn chat with system prompt support
* deterministic temperature=0.0 setting minimises variance
* `google-genai` SDK is the latest official Google AI Python SDK
**Key files:**
```
agent/gemini_agent.py β€” GeminiAgent class, multi-turn loop
agent/prompts.py β€” system prompt + per-turn prompt builder
```
**SDK:**
```python
from google import genai
client = genai.Client(api_key=GOOGLE_API_KEY)
chat = client.chats.create(model="gemini-2.5-flash", config=...)
response = chat.send_message(turn_prompt)
```
**Design constraint:** LLM is ONLY the decision layer. Policy, reward, and grading remain deterministic in `env/`.
---
## πŸ“¦ API Communication
### JSON over HTTP
**Why:**
* simple and standard
* compatible with OpenEnv expectations
* easy debugging and logging
---
## 🐳 Containerization
### 🐳 Docker
**Why:**
* mandatory for evaluation pipeline
* ensures consistent runtime environment
* required for Hugging Face Spaces deployment
---
## ☁️ Deployment Platform
### πŸ€— Hugging Face Spaces (Docker-based)
**Why:**
* required by competition
* easy container deployment
* public endpoint for validation
* integrates well with OpenEnv ecosystem
---
## πŸ§ͺ Testing & Validation
### βœ… Pytest (optional but recommended)
**Why:**
* quick validation of:
* policy engine
* reward logic
* graders
---
### πŸ” OpenEnv Validator
**Why:**
* ensures compliance with:
* API schema
* openenv.yaml
* required endpoints
---
## πŸ“Š Logging (Optional)
### Python Logging
**Why:**
* debug environment flow
* trace agent decisions
* inspect reward calculations
---
## πŸ“¦ Dependency Summary
```txt
fastapi
uvicorn[standard]
pydantic>=2.6
google-genai>=1.0
openai # optional / legacy
pytest
httpx # TestClient
```
---
## βš™οΈ Runtime Setup
| Component | Tool |
| ---------- | --------- |
| API Server | Uvicorn |
| Backend | FastAPI |
| Models | Pydantic |
| Container | Docker |
| Deployment | HF Spaces |
---
## 🧠 Design Principles
### 1. Minimalism
Only essential tools used
---
### 2. Determinism
No external APIs for core logic
---
### 3. Reproducibility
Same input β†’ same output
---
### 4. Fast Iteration
Simple stack = quick debugging
---
## ⚠️ What We Avoid (Important)
* ❌ heavy ML frameworks (TensorFlow, PyTorch)
* ❌ databases (not needed for MVP)
* ❌ microservices complexity
* ❌ external APIs for environment logic
---
## 🧠 One-Line Summary
> A lightweight Python-based stack using FastAPI and Pydantic to build a deterministic, OpenEnv-compliant environment with Docker-based deployment.
---