# TASKS.md: Environment Task Definition

> Senior OpenEnv Engineer Rule: **Real-world task. No toys. No games.**

---

## 🎯 Chosen Environment: Code Review Environment

**Name:** `code_review_env`
**Domain:** Software Engineering / Developer Tooling
**Task:** The agent acts as a code reviewer. Given a code diff or snippet, the agent must produce a structured, high-quality review identifying bugs, style issues, and improvement suggestions.

---

## Why This Is Real-World ✅

- Code review is a high-value, daily engineering task
- Clear, measurable correctness signals (bug found / not found, severity match)
- Rich feedback loop: the agent learns what good reviews look like
- Direct production utility: can be deployed in CI/CD pipelines

---

## Episode Structure

```
reset()
  │
  └── Agent receives: code snippet + task context (e.g., language, PR description, critical path flag)

step(action)
  │
  ├── Agent sends: structured review (issues: List[str], summary: str, severity: str)
  └── Environment returns: reward (float), feedback (str), done (bool)
```

---

## Action Space

```python
# Imports shared by the dataclasses in this file.
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# `Action`, `Observation`, and `State` are assumed to be the OpenEnv base types;
# their import path depends on the installed OpenEnv version.

@dataclass
class CodeReviewAction(Action):
    issues: List[str]          # List of identified issues
    summary: str               # Overall review summary
    severity: str              # "low" | "medium" | "high" | "critical"
    metadata: Dict[str, Any]   # Optional extra context
```

---

## Observation Space

```python
@dataclass
class CodeReviewObservation(Observation):
    done: bool
    reward: float
    code_snippet: str                # Code to review (current step)
    language: str                    # e.g., "python", "javascript"
    context: str                     # PR description or task context
    ground_truth_issues: List[str]   # Hidden during training rollout
    feedback: str                    # Human-readable feedback on last action
    step_number: int
```

---

## State

```python
@dataclass
class CodeReviewState(State):
    episode_id: Optional[str]
    step_count: int
    total_snippets: int        # How many snippets in this episode
    cumulative_reward: float
    language: str
```

---

## Episode Flow

| Step | Agent Receives | Agent Sends | Env Returns |
|------|----------------|-------------|-------------|
| 1 | Code snippet #1 + context | Structured review | Reward + feedback |
| 2 | Code snippet #2 (harder) | Structured review | Reward + feedback |
| … | … | … | … |
| N | Final snippet | Final review | Terminal reward, done=True |

---

## Data Sources

- [CodeSearchNet](https://github.com/github/CodeSearchNet): multi-language code samples
- Synthetic bug injection (off-by-one, null dereference, SQL injection, etc.)
- Human-curated review gold standards (severity labels)

---

## Difficulty Levels

| Level | Description |
|-------|-------------|
| Easy | Obvious syntax error or unused variable |
| Medium | Logic bug, missing edge case handling |
| Hard | Security vulnerability, concurrency issue |
| Critical | Data corruption / memory leak pattern |
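
---

## Reward Sketch

The correctness signals named under "Why This Is Real-World" (bug found / not found, severity match) could be combined into a single step reward roughly as follows. This is a minimal sketch, not the environment's actual scoring. The function `score_review`, the `severity_weight` parameter, and the assumption that issues are compared as normalized short labels (for example `"sql injection"`) are all hypothetical and chosen only for illustration.

```python
from typing import List

# Severity labels in increasing order, taken from the Action Space comment.
SEVERITIES = ["low", "medium", "high", "critical"]


def _normalize(issue: str) -> str:
    """Lowercase and collapse whitespace so label comparison is lenient."""
    return " ".join(issue.lower().split())


def score_review(
    predicted_issues: List[str],
    ground_truth_issues: List[str],
    predicted_severity: str,
    true_severity: str,
    severity_weight: float = 0.3,  # hypothetical weighting, not from the spec
) -> float:
    """Return a reward in [0, 1] from issue overlap (F1) and severity distance."""
    predicted = {_normalize(i) for i in predicted_issues}
    truth = {_normalize(i) for i in ground_truth_issues}

    # "Bug found / not found": F1 over exact matches of normalized labels.
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0

    # "Severity match": full credit for an exact match, decreasing linearly
    # with the distance between the two labels; unknown labels get no credit.
    if predicted_severity in SEVERITIES and true_severity in SEVERITIES:
        distance = abs(
            SEVERITIES.index(predicted_severity) - SEVERITIES.index(true_severity)
        )
        severity_score = 1.0 - distance / (len(SEVERITIES) - 1)
    else:
        severity_score = 0.0

    return (1.0 - severity_weight) * f1 + severity_weight * severity_score


# One of two issues found, severity off by one level:
# F1 is 0.5 and severity credit is 2/3, so the reward is about 0.55.
reward = score_review(
    ["SQL injection", "unused variable"],
    ["sql injection", "off-by-one"],
    predicted_severity="high",
    true_severity="critical",
)
```

Exact label matching keeps the sketch deterministic. Free-text issue descriptions would need a fuzzier comparison, which is outside the scope of this example.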