# 📄 TECH_STACK.md: Technology Choices & Justification

---

## 🧠 Overview

This document defines the **technology stack** used to build the AI Community Moderation Environment.

The stack is chosen to:

* ensure **OpenEnv compliance**
* enable **fast development**
* support **deterministic execution**
* allow seamless **Docker + Hugging Face deployment**

---

## 🧱 Core Language

### 🐍 Python 3.10+

**Why:**

* native ecosystem for OpenEnv
* strong support for RL + APIs
* rapid prototyping and debugging

---

## 🌐 Backend Framework

### ⚡ FastAPI

**Why:**

* lightweight and high-performance
* built-in request validation (via Pydantic)
* async support
* ideal for exposing `/reset`, `/step`, `/grader`, etc.
* easily deployable on Hugging Face Spaces

---

## 🧾 Data Validation & Models

### 📦 Pydantic

**Why:**

* strict typing for OpenEnv compliance
* automatic validation of API inputs/outputs
* seamless integration with FastAPI
* ensures deterministic schema handling

---

## 🧠 Environment Logic

### 🧩 Custom Python Modules

* `env/` → environment core
* `policy_engine.py` → rule evaluation
* `reward_engine.py` → step rewards
* `data_generator.py` → synthetic data

**Why:**

* full control over logic
* deterministic execution
* no external dependencies

---

## 🤖 Baseline Agent

### 🪶 Rule-based Agent (`baseline/agent.py`)

Heuristic agent using keyword matching: no LLM, fully reproducible.

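As a rough illustration of what a keyword-matching baseline could look like, the sketch below maps a post to an action using ordered rules. The keyword lists, the action names (`remove`, `flag`, `approve`), and the `decide` function are hypothetical and are not taken from `baseline/agent.py`; only the standard library is used, so the result depends on the input text alone.

```python
import re
from dataclasses import dataclass

# Hypothetical rules, checked in order; the real baseline/agent.py may differ.
RULES = [
    ("remove", ("scam", "free money", "click here")),
    ("flag", ("idiot", "stupid", "hate")),
]
DEFAULT_ACTION = "approve"


@dataclass(frozen=True)
class Decision:
    action: str
    matched: tuple


def decide(text: str) -> Decision:
    """Return the action of the first rule with a keyword present in the text."""
    # Lowercase and collapse whitespace so "Click   HERE" matches "click here".
    normalized = re.sub(r"\s+", " ", text.lower())
    for action, keywords in RULES:
        # Word boundaries stop "hate" from matching inside "whatever".
        hits = tuple(
            k for k in keywords
            if re.search(rf"\b{re.escape(k)}\b", normalized)
        )
        if hits:
            return Decision(action, hits)
    return Decision(DEFAULT_ACTION, ())
```

Because the function has no randomness and no external calls, repeated calls with the same text return equal `Decision` values, which is the property the baseline relies on.
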

---

## 🧠 LLM Agent (🔹 Added)

### Gemini 2.5 Flash via `google-genai` SDK

**Why:**

* strong reasoning on structured tasks
* multi-turn chat with system prompt support
* `temperature=0.0` setting minimises output variance
* `google-genai` SDK is the latest official Google AI Python SDK

**Key files:**

```
agent/gemini_agent.py  - GeminiAgent class, multi-turn loop
agent/prompts.py       - system prompt + per-turn prompt builder
```

**SDK:**

```python
from google import genai
client = genai.Client(api_key=GOOGLE_API_KEY)
chat = client.chats.create(model="gemini-2.5-flash", config=...)
response = chat.send_message(turn_prompt)
```

**Design constraint:** LLM is ONLY the decision layer. Policy, reward, and grading remain deterministic in `env/`.

---

## 📦 API Communication

### JSON over HTTP

**Why:**

* simple and standard
* compatible with OpenEnv expectations
* easy debugging and logging

---

## 🐳 Containerization

### 🐳 Docker

**Why:**

* mandatory for evaluation pipeline
* ensures consistent runtime environment
* required for Hugging Face Spaces deployment

---

## ☁️ Deployment Platform

### 🤗 Hugging Face Spaces (Docker-based)

**Why:**

* required by competition
* easy container deployment
* public endpoint for validation
* integrates well with OpenEnv ecosystem

---

## 🧪 Testing & Validation

### ✅ Pytest (optional but recommended)

**Why:**

* quick validation of:
  * policy engine
  * reward logic
  * graders

---

### 🔍 OpenEnv Validator

**Why:**

* ensures compliance with:
  * API schema
  * openenv.yaml
  * required endpoints

---

## 📊 Logging (Optional)

### Python Logging

**Why:**

* debug environment flow
* trace agent decisions
* inspect reward calculations

---

## 📦 Dependency Summary

```txt
fastapi
uvicorn[standard]
pydantic>=2.6
google-genai>=1.0
openai  # optional / legacy
pytest
httpx  # TestClient
```

---

## ⚙️ Runtime Setup

| Component  | Tool      |
| ---------- | --------- |
| API Server | Uvicorn   |
| Backend    | FastAPI   |
| Models     | Pydantic  |
| Container  | Docker    |
| Deployment | HF Spaces |

---

## 🧠 Design Principles

### 1. Minimalism

Only essential tools used

---

### 2. Determinism

No external APIs for core logic

---

### 3. Reproducibility

Same input → same output

---

### 4. Fast Iteration

Simple stack = quick debugging

---

## ⚠️ What We Avoid (Important)

* ❌ heavy ML frameworks (TensorFlow, PyTorch)
* ❌ databases (not needed for MVP)
* ❌ microservices complexity
* ❌ external APIs for environment logic

---

## 🧠 One-Line Summary

> A lightweight Python-based stack using FastAPI and Pydantic to build a deterministic, OpenEnv-compliant environment with Docker-based deployment.

---
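
The reproducibility principle (same input, same output) also has to hold for synthetic data. One way to get there, sketched below under assumptions, is to give the generator its own seeded `random.Random` instance instead of the module-level RNG. The `generate_posts` function, the `TEMPLATES` tuple, and the post fields are hypothetical and do not reflect the actual contents of `data_generator.py`.

```python
import random

# Hypothetical templates; the real data_generator.py content is not shown here.
TEMPLATES = (
    "Great write-up, thanks for sharing!",
    "Click here for free money",
    "You are an idiot",
    "Does anyone have a link to the docs?",
)


def generate_posts(seed: int, count: int) -> list[dict]:
    """Generate `count` synthetic posts from a private, seeded RNG."""
    # A local Random instance keeps results independent of global random state.
    rng = random.Random(seed)
    return [
        {"id": i, "text": rng.choice(TEMPLATES), "reports": rng.randint(0, 5)}
        for i in range(count)
    ]
```

With this shape, an episode can be replayed by passing the same seed again, and code elsewhere that calls `random.seed()` does not change the generated posts.
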