| # TECH_STACK.md: Technology Choices & Justification |
| |
| --- |
| |
| ## Overview |
| |
| This document defines the **technology stack** used to build the AI Community Moderation Environment. |
| |
| The stack is chosen to: |
| |
| * ensure **OpenEnv compliance** |
| * enable **fast development** |
| * support **deterministic execution** |
| * allow seamless **Docker + Hugging Face deployment** |
| |
| --- |
| |
| ## Core Language |
| |
| ### Python 3.10+ |
| |
| **Why:** |
| |
| * native ecosystem for OpenEnv |
| * strong support for RL + APIs |
| * rapid prototyping and debugging |
| |
| --- |
| |
| ## Backend Framework |
| |
| ### FastAPI |
| |
| **Why:** |
| |
| * lightweight and high-performance |
| * built-in request validation (via Pydantic) |
| * async support |
| * ideal for exposing `/reset`, `/step`, `/grader`, etc. |
| * easily deployable on Hugging Face Spaces |
| |
| --- |
| |
| ## Data Validation & Models |
| |
| ### Pydantic |
| |
| **Why:** |
| |
| * strict typing for OpenEnv compliance |
| * automatic validation of API inputs/outputs |
| * seamless integration with FastAPI |
| * ensures deterministic schema handling |
| |
| --- |
| |
| ## Environment Logic |
| |
| ### Custom Python Modules |
| |
| * `env/` → environment core |
| * `policy_engine.py` → rule evaluation |
| * `reward_engine.py` → step rewards |
| * `data_generator.py` → synthetic data |
|
|
| **Why:** |
|
|
| * full control over logic |
| * deterministic execution |
| * no external dependencies |
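
To illustrate what deterministic, dependency-free environment logic can look like, here is a small sketch of a rule evaluator and a step reward. The `Rule` structure, the example terms, and the +1/-1 reward values are hypothetical and are not taken from `policy_engine.py` or `reward_engine.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if any term appears in the text, apply `action`."""
    name: str
    banned_terms: tuple[str, ...]
    action: str


# Illustrative rules only; rules are checked in order, first match wins.
RULES = (
    Rule("spam", ("buy now", "free money"), "remove"),
    Rule("harassment", ("idiot",), "escalate"),
)


def evaluate_policy(text: str) -> str:
    """Return the action required by the first matching rule, else 'approve'."""
    lowered = text.lower()
    for rule in RULES:
        if any(term in lowered for term in rule.banned_terms):
            return rule.action
    return "approve"


def step_reward(agent_action: str, text: str) -> float:
    """Reward +1.0 when the agent matches the policy decision, otherwise -1.0."""
    return 1.0 if agent_action == evaluate_policy(text) else -1.0
```

Because both functions depend only on their arguments and a fixed rule table, repeated calls with the same input always return the same result.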
|
|
| --- |
|
|
| ## Baseline Agent |
|
|
| ### Rule-based Agent (`baseline/agent.py`) |
|
|
| A heuristic agent that uses keyword matching. It makes no LLM calls and is fully reproducible. |
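
A keyword-matching agent of this kind might look like the sketch below. The class name, the keyword lists, and the observation shape (`{"text": ...}`) are assumptions for illustration and may differ from what `baseline/agent.py` actually contains.

```python
import re

# Hypothetical keyword table; escalation is checked before removal.
KEYWORDS = {
    "escalate": {"threat", "kill"},
    "remove": {"spam", "scam"},
}


class RuleBasedAgent:
    """Chooses an action from whole-word keyword matches only."""

    def act(self, observation: dict) -> str:
        # Tokenise into lowercase words so "skills" does not match "kill".
        words = set(re.findall(r"[a-z]+", observation.get("text", "").lower()))
        for action in ("escalate", "remove"):
            if words & KEYWORDS[action]:
                return action
        return "approve"
```

Matching on whole words rather than substrings avoids some false positives, at the cost of missing inflected forms such as "scams".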
|
|
| --- |
|
|
| ## LLM Agent (Added) |
|
|
| ### Gemini 2.5 Flash via `google-genai` SDK |
|
|
| **Why:** |
|
|
| * state-of-the-art reasoning on structured tasks |
| * multi-turn chat with system prompt support |
| * setting `temperature=0.0` minimises sampling variance, although identical outputs across calls are not guaranteed |
| * `google-genai` SDK is the latest official Google AI Python SDK |
|
|
| **Key files:** |
|
|
| ``` |
| agent/gemini_agent.py → GeminiAgent class, multi-turn loop |
| agent/prompts.py → system prompt + per-turn prompt builder |
| ``` |
|
|
| **SDK:** |
|
|
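| ```python |
| import os |
| from google import genai |
| from google.genai import types |
| GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]  # assumed to be set in the environment |
| client = genai.Client(api_key=GOOGLE_API_KEY) |
| config = types.GenerateContentConfig(temperature=0.0)  # low-variance decoding |
| chat = client.chats.create(model="gemini-2.5-flash", config=config) |
| response = chat.send_message(turn_prompt)  # turn_prompt is built by agent/prompts.py |
| ``` |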
|
|
| **Design constraint:** LLM is ONLY the decision layer. Policy, reward, and grading remain deterministic in `env/`. |
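
One way to picture this constraint is an episode loop that accepts any decision function and keeps grading in plain code. This is a sketch only: `run_episode`, the `(text, expected_action)` pairs, and the stub decision function are hypothetical and stand in for the real environment and agent.

```python
from typing import Callable, Iterable, Tuple


def run_episode(
    posts: Iterable[Tuple[str, str]],
    decide: Callable[[str], str],
) -> float:
    """Score an episode; `decide` is the only pluggable (possibly LLM) part."""
    total = 0.0
    for text, expected_action in posts:
        action = decide(text)  # decision layer: rule-based or LLM-backed
        total += 1.0 if action == expected_action else 0.0  # deterministic grading
    return total


def stub_decide(text: str) -> str:
    """Stand-in for an LLM call, used so the sketch runs without network access."""
    return "remove" if "spam" in text.lower() else "approve"


score = run_episode(
    [("Cheap SPAM offer", "remove"), ("Nice photo", "approve"), ("Go away", "escalate")],
    stub_decide,
)
```

Swapping `stub_decide` for a function that calls the LLM changes only the decisions, never how they are scored.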
|
|
| --- |
|
|
| ## API Communication |
|
|
| ### JSON over HTTP |
|
|
| **Why:** |
|
|
| * simple and standard |
| * compatible with OpenEnv expectations |
| * easy debugging and logging |
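
As an illustration of the JSON-over-HTTP exchange, the sketch below builds (but does not send) a `POST /step` request using only the standard library. The base URL, port, and payload fields are assumptions, not the environment's documented schema.

```python
import json
from urllib import request


def build_step_request(base_url: str, action: dict) -> request.Request:
    """Serialise an action to JSON and wrap it in a POST request object."""
    body = json.dumps(action, sort_keys=True).encode("utf-8")
    return request.Request(
        url=f"{base_url}/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local address and payload.
req = build_step_request("http://localhost:7860", {"post_id": 1, "decision": "remove"})
```

Passing `req` to `urllib.request.urlopen` would send it to a running server. Using `sort_keys=True` keeps the serialised body stable, which makes logged requests easier to compare.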
|
|
| --- |
|
|
| ## Containerization |
|
|
| ### Docker |
|
|
| **Why:** |
|
|
| * mandatory for evaluation pipeline |
| * ensures consistent runtime environment |
| * required for Docker-based Hugging Face Spaces deployment |
|
|
| --- |
|
|
| ## Deployment Platform |
|
|
| ### Hugging Face Spaces (Docker-based) |
|
|
| **Why:** |
|
|
| * required by competition |
| * easy container deployment |
| * public endpoint for validation |
| * integrates well with OpenEnv ecosystem |
|
|
| --- |
|
|
| ## Testing & Validation |
|
|
| ### Pytest (optional but recommended) |
|
|
| **Why:** |
|
|
| * quick validation of: |
|
|
| * policy engine |
| * reward logic |
| * graders |
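
Tests for these components can stay very small. The sketch below uses pytest's plain-function style; `score_action` is a hypothetical stand-in for the project's reward logic, defined inline so the example is self-contained.

```python
def score_action(agent_action: str, expected_action: str) -> float:
    """Hypothetical reward function: +1.0 on a match, -1.0 otherwise."""
    return 1.0 if agent_action == expected_action else -1.0


def test_reward_for_correct_action():
    assert score_action("remove", "remove") == 1.0


def test_reward_for_wrong_action():
    assert score_action("approve", "remove") == -1.0


def test_reward_is_deterministic():
    # Repeated calls with the same input must always give the same value.
    results = {score_action("remove", "remove") for _ in range(100)}
    assert results == {1.0}
```

Saved in a file whose name starts with `test_`, these functions are collected automatically when `pytest` is run from the project root.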
|
|
| --- |
|
|
| ### OpenEnv Validator |
|
|
| **Why:** |
|
|
| * ensures compliance with: |
|
|
| * API schema |
| * openenv.yaml |
| * required endpoints |
|
|
| --- |
|
|
| ## Logging (Optional) |
|
|
| ### Python Logging |
|
|
| **Why:** |
|
|
| * debug environment flow |
| * trace agent decisions |
| * inspect reward calculations |
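
A minimal logging setup covering these three uses might look like this. The logger name, the message format, and the `log_step` helper are illustrative assumptions rather than the project's actual configuration.

```python
import logging

# Hypothetical logger name for the environment package.
logger = logging.getLogger("moderation_env")


def configure_logging(level: int = logging.INFO) -> None:
    """Send log records to stderr with a timestamped format."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )


def log_step(step: int, action: str, reward: float) -> None:
    """Record one environment step: the agent's action and the reward given."""
    logger.info("step=%d action=%s reward=%.2f", step, action, reward)
```

Call `configure_logging()` once at startup, then `log_step(...)` inside the step handler. Passing arguments to `logger.info` instead of pre-formatting the string defers formatting until the record is actually emitted.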
|
|
| --- |
|
|
| ## Dependency Summary |
|
|
| ```txt |
| fastapi |
| uvicorn[standard] |
| pydantic>=2.6 |
| google-genai>=1.0 |
| openai # optional / legacy |
| pytest |
| httpx # required by FastAPI's TestClient |
| ``` |
|
|
| --- |
|
|
| ## Runtime Setup |
|
|
| | Component | Tool | |
| | ---------- | --------- | |
| | API Server | Uvicorn | |
| | Backend | FastAPI | |
| | Models | Pydantic | |
| | Container | Docker | |
| | Deployment | HF Spaces | |
|
|
| --- |
|
|
| ## Design Principles |
|
|
| ### 1. Minimalism |
|
|
| Only essential tools used |
|
|
| --- |
|
|
| ### 2. Determinism |
|
|
| No external APIs for core logic |
|
|
| --- |
|
|
| ### 3. Reproducibility |
|
|
| Same input → same output |
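
For synthetic data, one common way to get this property is to drive all randomness from an explicit seed. The sketch below assumes a hypothetical `generate_posts` helper and template list; it is not taken from `data_generator.py`.

```python
import random

# Illustrative templates only.
TEMPLATES = ("Great post!", "Buy now, free money", "You are an idiot")


def generate_posts(seed: int, n: int) -> list[str]:
    """Return n synthetic posts; the same seed always yields the same list."""
    rng = random.Random(seed)  # local generator, no shared global state
    return [rng.choice(TEMPLATES) for _ in range(n)]


first_run = generate_posts(seed=42, n=5)
second_run = generate_posts(seed=42, n=5)
```

Using a local `random.Random` instance instead of the module-level functions means other code that touches the global random state cannot change the generated data.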
|
|
| --- |
|
|
| ### 4. Fast Iteration |
|
|
| Simple stack = quick debugging |
|
|
| --- |
|
|
| ## What We Avoid (Important) |
|
|
| * heavy ML frameworks (TensorFlow, PyTorch) |
| * databases (not needed for MVP) |
| * microservices complexity |
| * external APIs for environment logic |
|
|
| --- |
|
|
| ## One-Line Summary |
|
|
| > A lightweight Python-based stack using FastAPI and Pydantic to build a deterministic, OpenEnv-compliant environment with Docker-based deployment. |
|
|
| --- |
|
|