TECH_STACK.md: Technology Choices & Justification
Overview
This document defines the technology stack used to build the AI Community Moderation Environment.
The stack is chosen to:
- ensure OpenEnv compliance
- enable fast development
- support deterministic execution
- allow seamless Docker + Hugging Face deployment
Core Language
Python 3.10+
Why:
- native ecosystem for OpenEnv
- strong support for RL + APIs
- rapid prototyping and debugging
Backend Framework
FastAPI
Why:
- lightweight and high-performance
- built-in request validation (via Pydantic)
- async support
- ideal for exposing /reset, /step, /grader, etc.
- easily deployable on Hugging Face Spaces
Data Validation & Models
Pydantic
Why:
- strict typing for OpenEnv compliance
- automatic validation of API inputs/outputs
- seamless integration with FastAPI
- ensures deterministic schema handling
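To make the strict-typing point concrete, here is a minimal sketch of a Pydantic v2 model that rejects unknown keys and out-of-range values. The model name, its fields, and the parse_action helper are hypothetical and stand in for whatever schema the project actually defines.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class ModerationAction(BaseModel):
    # Hypothetical schema; field names are illustrative only.
    # extra="forbid" rejects unknown keys; strict=True turns off lax type coercion.
    model_config = ConfigDict(extra="forbid", strict=True)

    decision: Literal["allow", "flag", "remove"]
    confidence: float = Field(ge=0.0, le=1.0)


def parse_action(payload: dict) -> Optional[ModerationAction]:
    """Return a validated action, or None if the payload is malformed."""
    try:
        return ModerationAction.model_validate(payload)
    except ValidationError:
        return None
```

With a model like this, the same payload is always either accepted with identical parsed values or rejected, which is the sense in which schema handling stays deterministic.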
Environment Logic
Custom Python Modules
- env/: environment core
- policy_engine.py: rule evaluation
- reward_engine.py: step rewards
- data_generator.py: synthetic data
Why:
- full control over logic
- deterministic execution
- no external dependencies
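The sketch below shows one way these three responsibilities could be kept deterministic using only the standard library. The banned-term list, the Post type, and the function names are assumptions for illustration; the real modules may be organised differently.

```python
import random
from dataclasses import dataclass

# Illustrative rule table; the real policy rules are not shown in this document.
BANNED_TERMS = ("spam", "scam")


@dataclass(frozen=True)
class Post:
    post_id: int
    text: str


def generate_posts(seed: int, n: int) -> list[Post]:
    """Synthetic data from a local, seeded RNG so runs are repeatable."""
    rng = random.Random(seed)  # local RNG: no shared global state
    vocab = ["hello", "spam", "great", "scam", "thanks"]
    return [Post(i, " ".join(rng.choices(vocab, k=3))) for i in range(n)]


def violates_policy(post: Post) -> bool:
    """Rule evaluation: a pure function of the post text."""
    return any(term in post.text.split() for term in BANNED_TERMS)


def step_reward(post: Post, action: str) -> float:
    """+1 when the action matches the rule outcome, -1 otherwise."""
    expected = "remove" if violates_policy(post) else "allow"
    return 1.0 if action == expected else -1.0
```

Because the generator owns its random.Random instance and the policy and reward functions are pure, calling generate_posts with the same seed reproduces the same episode.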
Baseline Agent
Rule-based Agent (baseline/agent.py)
Heuristic agent using keyword matching: no LLM, fully reproducible.
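A keyword-matching agent of this kind can be very small. The following is a minimal sketch; the keyword sets and the three action names are hypothetical, and the actual heuristics in baseline/agent.py may differ.

```python
# Hypothetical keyword lists chosen only for illustration.
REMOVE_KEYWORDS = {"scam", "hate"}
FLAG_KEYWORDS = {"spam", "promo"}


def choose_action(text: str) -> str:
    """Return "remove", "flag", or "allow" based on keyword hits."""
    # Normalise: split on whitespace, trim common punctuation, lowercase.
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & REMOVE_KEYWORDS:
        return "remove"
    if words & FLAG_KEYWORDS:
        return "flag"
    return "allow"
```

The function has no randomness and no network calls, so a given post always maps to the same action.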
LLM Agent (Added)
Gemini 2.5 Flash via google-genai SDK
Why:
- state-of-the-art reasoning on structured tasks
- multi-turn chat with system prompt support
- temperature=0.0 minimises output variance (near-deterministic, though identical outputs across calls are not guaranteed)
- google-genai is the current official Google Gen AI Python SDK
Key files:
- agent/gemini_agent.py: GeminiAgent class, multi-turn loop
- agent/prompts.py: system prompt + per-turn prompt builder
SDK:
```python
from google import genai

client = genai.Client(api_key=GOOGLE_API_KEY)
chat = client.chats.create(model="gemini-2.5-flash", config=...)
response = chat.send_message(turn_prompt)
```
Design constraint: LLM is ONLY the decision layer. Policy, reward, and grading remain deterministic in env/.
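One way to picture this separation is an episode loop in which the agent is just a callable from observation to action, while scoring is a pure function the agent never touches. The names below (Agent, grade, run_episode, always_allow) and the toy scoring rule are assumptions for illustration, not the project's interfaces.

```python
from typing import Callable

# An agent is anything that maps an observation to an action string.
# An LLM-backed agent would plug in behind the same (assumed) signature.
Agent = Callable[[str], str]


def grade(observation: str, action: str) -> float:
    """Deterministic stand-in for the reward logic kept in env/."""
    expected = "remove" if "scam" in observation else "allow"
    return 1.0 if action == expected else 0.0


def run_episode(agent: Agent, observations: list[str]) -> float:
    """The agent only picks actions; scoring never consults the agent."""
    return sum(grade(obs, agent(obs)) for obs in observations)


def always_allow(observation: str) -> str:
    """Trivial agent used to exercise the loop."""
    return "allow"
```

Under this arrangement, swapping the rule-based agent for an LLM agent changes which actions are chosen but not how any given action is scored.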
API Communication
JSON over HTTP
Why:
- simple and standard
- compatible with OpenEnv expectations
- easy debugging and logging
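For illustration, the helpers below serialise a /step request body and parse a response using only the standard json module. The key names (action, episode_id, observation, reward, done) are assumed for this sketch and are not taken from the OpenEnv schema.

```python
import json


def encode_step_request(action: str, episode_id: str) -> bytes:
    """Serialise a /step request body; sorted keys keep logs easy to diff."""
    body = {"action": action, "episode_id": episode_id}
    return json.dumps(body, sort_keys=True).encode("utf-8")


def decode_step_response(raw: bytes) -> dict:
    """Parse a response body and check the keys this sketch expects."""
    data = json.loads(raw.decode("utf-8"))
    missing = {"observation", "reward", "done"} - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

Plain JSON bodies like these can be inspected with any HTTP client or log viewer, which is much of the debugging benefit mentioned above.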
Containerization
Docker
Why:
- mandatory for evaluation pipeline
- ensures consistent runtime environment
- required for Hugging Face Spaces deployment
Deployment Platform
Hugging Face Spaces (Docker-based)
Why:
- required by competition
- easy container deployment
- public endpoint for validation
- integrates well with OpenEnv ecosystem
Testing & Validation
Pytest (optional but recommended)
Why:
quick validation of:
- policy engine
- reward logic
- graders
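A pytest-style check for reward logic might look like the sketch below. The reward function here is a hypothetical stand-in defined inline so the example is self-contained; in the project, the tests would import the real function instead.

```python
def reward(expected: str, action: str) -> float:
    """Stand-in for the project's reward function (hypothetical signature)."""
    return 1.0 if action == expected else -1.0


def test_reward_matches_expected_action():
    assert reward("remove", "remove") == 1.0
    assert reward("remove", "allow") == -1.0


def test_reward_is_deterministic():
    # Same inputs must give the same output on every call.
    results = {reward("flag", "flag") for _ in range(100)}
    assert results == {1.0}
```

pytest collects functions named test_* from files named test_*.py, so placing these in a file such as tests/test_reward.py (path assumed) and running `pytest` would execute both checks.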
OpenEnv Validator
Why:
ensures compliance with:
- API schema
- openenv.yaml
- required endpoints
Logging (Optional)
Python Logging
Why:
- debug environment flow
- trace agent decisions
- inspect reward calculations
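A minimal sketch of step-level tracing with the standard logging module follows. The logger name, the message format, and the helper functions are assumptions for illustration.

```python
import logging

logger = logging.getLogger("moderation_env")  # logger name is an assumption


def configure_logging(level: int = logging.INFO) -> None:
    """Attach a stream handler once; safe to call repeatedly."""
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)


def log_step(step: int, action: str, reward: float) -> None:
    """Emit one line per step so decisions and rewards can be traced."""
    logger.info("step=%d action=%s reward=%.2f", step, action, reward)
```

Calling configure_logging() once at startup and log_step() inside the step handler gives one line per decision, which is usually enough to reconstruct an episode after the fact.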
Dependency Summary
```
fastapi
uvicorn[standard]
pydantic>=2.6
google-genai>=1.0
openai  # optional / legacy
pytest
httpx  # required by FastAPI's TestClient
```
Runtime Setup
| Component | Tool |
|---|---|
| API Server | Uvicorn |
| Backend | FastAPI |
| Models | Pydantic |
| Container | Docker |
| Deployment | HF Spaces |
Design Principles
1. Minimalism
Only essential tools used
2. Determinism
No external APIs for core logic
3. Reproducibility
Same input → same output
4. Fast Iteration
Simple stack = quick debugging
What We Avoid (Important)
- heavy ML frameworks (TensorFlow, PyTorch)
- databases (not needed for MVP)
- microservices complexity
- external APIs for environment logic
One-Line Summary
A lightweight Python-based stack using FastAPI and Pydantic to build a deterministic, OpenEnv-compliant environment with Docker-based deployment.