File size: 3,855 Bytes
f44f429 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 | # OpenEnv Environment Specification
# This file describes the Code Security Review environment for the Meta PyTorch OpenEnv Hackathon.
# Metadata section details the environment's identity.
name: code-security-review
version: "1.0.0"
description: >
An RL environment for training AI agents to perform code security review.
Agents analyze code snippets from production pull requests and identify bugs,
vulnerabilities, and security issues.
author: Inmodel Labs
# Tasks section defines the core challenges in the environment.
# Each task has a unique ID, name, description, and difficulty level.
tasks:
- id: python-off-by-one
name: "Python Off-by-One Error"
description: "Identify an off-by-one index error in a Python finance batch processor"
difficulty: easy
max_steps: 2
reward_range: [0.0, 1.0]
- id: js-idor-auth
name: "JavaScript IDOR Authorization Bypass"
description: "Identify a horizontal privilege escalation (IDOR) in a Node.js REST profile endpoint"
difficulty: medium
max_steps: 2
reward_range: [0.0, 1.0]
- id: python-pickle-deserialization
name: "Python Pickle Deserialization"
description: "Identify an insecure deserialization vulnerability using pickle in a background worker"
difficulty: hard
max_steps: 2
reward_range: [0.0, 1.0]
# The Action space defines the format of the agent's response.
# Each field is scored by the grader to provide partial progress signals.
action_space:
type: object
description: >
Two-phase action space. Phase 1: submit {"request_file": true} to unlock
the code snippet (+0.20 reward). Phase 2: submit a full review JSON.
properties:
request_file: { type: boolean, description: "Phase 1: Request the hidden file contents" }
bug_identified: { type: boolean, description: "Boolean: true if a bug exists" }
bug_location: { type: string, description: "String: Pinpoint the bug's location in code" }
bug_type: { type: string, description: "String: off-by-one | logic-error | insecure-deserialization | none" }
bug_description: { type: string, description: "String: Detailed analysis of the vulnerability" }
severity: { type: string, enum: [none, low, medium, high, critical], description: "String: none | low | medium | high | critical" }
suggested_fix: { type: string, description: "String: How to fix the identified bug" }
# The Observation space defines what the agent sees at each step.
# It uses a structured context to help the agent understand the code's purpose.
observation_space:
type: object
properties:
task_id: { type: string, description: "Unique task identifier" }
language: { type: string, description: "Source code language" }
difficulty: { type: string, enum: [easy, medium, hard], description: "Task complexity (easy/medium/hard)" }
code_snippet: { type: string, description: "The source code to be reviewed" }
context: { type: string, description: "Real-world context (e.g., API description)" }
pr_title: { type: string, description: "Pull Request title for additional intent context" }
file_path: { type: string, description: "Relative path to the file in the repository" }
# Reward structure for evaluating agent performance.
reward:
min: 0.0
max: 1.0
description: >
Step 1 — File request: +0.20 (flat, always granted).
Step 2 — Bug review: partial rewards for bug identification (0.20),
correct bug type (0.20), precise location (0.10), description quality (0.25,
keyword density), fix quality (0.15), correct severity (0.10).
Episode total is clamped to [0.0, 1.0]. Grader penalizes keyword stuffing.
endpoints:
health: GET /
reset: POST /reset
step: POST /step
state: GET /state
tasks: GET /tasks
|