| name: code-review-agent-env |
| version: 1.0.0 |
| description: | |
| A realistic code review environment where AI agents review pull requests, |
| identify issues, suggest improvements, and make approval decisions. |
| Models real-world software development workflows. |
| |
| authors: |
| - Ashish <ashishkbaberwal@gmail.com> |
| - Shardul <shardulmd@gmail.com> |
| - Harshit <shakyanitin807@gmail.com> |
| tags: |
| - code-review |
| - software-engineering |
| - agent-evaluation |
| - real-world-task |
|
|
| license: MIT |
|
|
| environment: |
| class: environment.env.CodeReviewEnv |
| entry_point: environment.env:CodeReviewEnv |
|
|
| tasks: |
| - id: bug_detection_easy_1 |
| name: "Easy: Detect Division by Zero" |
| difficulty: easy |
|
|
| - id: bug_detection_easy_2 |
| name: "Easy: Off-by-One Error" |
| difficulty: easy |
|
|
| - id: memory_leak_medium_1 |
| name: "Medium: Memory Leak Detection" |
| difficulty: medium |
|
|
| - id: performance_medium_2 |
| name: "Medium: String Concatenation Performance" |
| difficulty: medium |
|
|
| - id: security_hard_1 |
| name: "Hard: SQL Injection Vulnerability" |
| difficulty: hard |
|
|
| - id: race_condition_hard_2 |
| name: "Hard: Race Condition" |
| difficulty: hard |
|
|
| observation_space: |
| type: dict |
| description: | |
| Contains the pull request's code diff, the surrounding file context, and the current review status |
| |
| action_space: |
| type: dict |
| description: | |
| Actions include adding comments, suggesting fixes, approving, or requesting changes |
| |
| reward_range: |
| min: -0.5 |
| max: 1.0 |
|
|
| max_episode_steps: 50 |
|
|
| requires_api_keys: |
| - API_KEY |
| - HF_TOKEN |
| - OPENAI_API_KEY |
| - API_BASE_URL |
| - MODEL_NAME |
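
The `requires_api_keys` list above names five environment variables. A minimal sketch of a pre-flight check, assuming the values are read from the process environment (the manifest does not say how they are consumed); `REQUIRED_VARS` and `missing_env_vars` are hypothetical names used only for illustration:

```python
import os

# Names copied from the requires_api_keys list in the manifest.
REQUIRED_VARS = ["API_KEY", "HF_TOKEN", "OPENAI_API_KEY", "API_BASE_URL", "MODEL_NAME"]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to os.environ; pass a plain dict to test the check
    without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        # Fail early with a clear message instead of partway through an episode.
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```

Running the check before creating the environment surfaces a missing credential at startup rather than during a review episode.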