Nitish committed on
Commit 4a2e8a2 · 1 Parent(s): 42c4990

docs: final README and openenv.yaml sync with new tasks and baselines

Files changed (2)
  1. README.md +8 -8
  2. openenv.yaml +1 -1
README.md CHANGED
@@ -36,8 +36,8 @@ Built by **Inmodel Labs** for the Meta PyTorch OpenEnv Hackathon.
  | ID | Language | Bug Class | Difficulty |
  |---|---|---|---|
  | `python-off-by-one` | Python | Off-by-one index error | Easy |
- | `js-auth-privilege` | JavaScript | Logic flaw privilege escalation | Medium |
- | `python-pickle-deserialization` | Python | Insecure deserialization (RCE) | Hard |
+ | `js-idor-auth` | JavaScript | Insecure Direct Object Reference (IDOR) | Medium |
+ | `python-pickle-deserialization` | Python | Insecure Deserialization (RCE) | Hard |
 
  ---
 
@@ -79,7 +79,7 @@ The agent analyses the code and submits a structured JSON finding:
  |---|---|---|
  | `bug_identified` | bool | `true` / `false` |
  | `bug_location` | string | location description |
- | `bug_type` | string | `off-by-one` \| `logic-error` \| `security-vulnerability` \| `none` |
+ | `bug_type` | string | `off-by-one` \| `logic-error` \| `insecure-deserialization` \| `none` |
  | `bug_description` | string | detailed vulnerability explanation |
  | `severity` | string | `none` \| `low` \| `medium` \| `high` \| `critical` |
  | `suggested_fix` | string | how to fix the bug |
@@ -92,9 +92,9 @@ The agent analyses the code and submits a structured JSON finding:
  "language": "Python",
  "difficulty": "hard",
  "code_snippet": "<FILE CONTENTS HIDDEN - Submit {\"request_file\": true} to view>",
- "context": "Background worker loading serialized state via network payload",
- "pr_title": "Add state persistence layer for distributed workers",
- "file_path": "worker/state.py"
+ "context": "Redis-backed caching decorator for worker tasks that serializes results...",
+ "pr_title": "Add distributed task caching layer for worker pool",
+ "file_path": "worker/cache.py"
  }
  ```
  After `request_file`, `code_snippet` contains the actual source code.
@@ -138,8 +138,8 @@ suggests partial fix (+0.08), correct severity (+0.10) = total `0.20+0.20+0.20+0
  | Task | Difficulty | Model | Score | Steps | Notes |
  |------|-----------|-------|-------|-------|-------|
  | python-off-by-one | easy | Llama-3.3-70B-Instruct | 0.883 | 2 | File request + review |
- | js-auth-privilege | medium | Llama-3.3-70B-Instruct | 0.900 | 2 | File request + review |
- | python-pickle-deserialization | hard | Llama-3.3-70B-Instruct | TBD | 2 | Requires RCE/deserialization knowledge |
+ | js-idor-auth | medium | Llama-3.3-70B-Instruct | 0.500 | 2 | File request + review |
+ | python-pickle-deserialization | hard | Llama-3.3-70B-Instruct | 0.512 | 2 | File request + review |
 
  ---
 
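Note: the README hunks above describe a two-step episode: the observation first hides the file, the agent submits `{"request_file": true}`, and `code_snippet` then carries the source so the agent can submit a finding. A minimal sketch of that flow follows. `apply_action` is a hypothetical stand-in for the environment step (it is not part of this commit), the revealed source is a placeholder string, and the values in `finding` are illustrative rather than the graded answer.

```python
import json

# Placeholder text shown in the README's example observation.
HIDDEN = '<FILE CONTENTS HIDDEN - Submit {"request_file": true} to view>'


def apply_action(observation: dict, action: dict, source: str) -> dict:
    """Hypothetical stand-in for the environment: reveal the file on request."""
    if action.get("request_file") is True:
        # Return a copy so the original observation stays unchanged.
        return {**observation, "code_snippet": source}
    return observation


# Observation fields as they read after this commit.
observation = {
    "language": "Python",
    "difficulty": "hard",
    "code_snippet": HIDDEN,
    "context": "Redis-backed caching decorator for worker tasks that serializes results...",
    "pr_title": "Add distributed task caching layer for worker pool",
    "file_path": "worker/cache.py",
}

# Step 1: request the hidden file (placeholder source, not the real file).
placeholder_source = "# placeholder: real contents are supplied by the environment"
revealed = apply_action(observation, {"request_file": True}, placeholder_source)

# Step 2: submit a structured finding (illustrative values only).
finding = {
    "bug_identified": True,
    "bug_location": "illustrative location",
    "bug_type": "insecure-deserialization",
    "bug_description": "illustrative description",
    "severity": "critical",
    "suggested_fix": "illustrative fix",
}
payload = json.dumps(finding)
```

This matches the `Steps` value of 2 in the baseline table (file request, then review).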
openenv.yaml CHANGED
@@ -45,7 +45,7 @@ action_space:
  request_file: { type: boolean, description: "Phase 1: Request the hidden file contents" }
  bug_identified: { type: boolean, description: "Boolean: true if a bug exists" }
  bug_location: { type: string, description: "String: Pinpoint the bug's location in code" }
- bug_type: { type: string, description: "String: off-by-one | logic-error | security-vulnerability | none" }
+ bug_type: { type: string, description: "String: off-by-one | logic-error | insecure-deserialization | none" }
  bug_description: { type: string, description: "String: Detailed analysis of the vulnerability" }
  severity: { type: string, enum: [none, low, medium, high, critical], description: "String: none | low | medium | high | critical" }
  suggested_fix: { type: string, description: "String: How to fix the identified bug" }
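
Note: in `openenv.yaml`, `severity` carries an `enum`, while `bug_type` lists its values only in the description string, so a schema validator would accept the retired `security-vulnerability` value. The sketch below shows one way a client could check an action before submitting it. `validate_action` is a hypothetical helper, and treating the `bug_type` description as a closed set is an assumption, not something the file enforces.

```python
# Field names and types copied from the action_space hunk above.
FIELD_TYPES = {
    "request_file": bool,
    "bug_identified": bool,
    "bug_location": str,
    "bug_type": str,
    "bug_description": str,
    "severity": str,
    "suggested_fix": str,
}

# Assumption: the values listed in the bug_type description form a closed set.
BUG_TYPES = {"off-by-one", "logic-error", "insecure-deserialization", "none"}
# severity is declared with an explicit enum in openenv.yaml.
SEVERITIES = {"none", "low", "medium", "high", "critical"}


def validate_action(action: dict) -> list:
    """Return a list of problems found in an action; an empty list means it passed."""
    errors = []
    for name, value in action.items():
        expected = FIELD_TYPES.get(name)
        if expected is None:
            errors.append(f"unknown field: {name}")
        elif not isinstance(value, expected):
            errors.append(f"{name}: expected {expected.__name__}")
    for name, allowed in (("bug_type", BUG_TYPES), ("severity", SEVERITIES)):
        value = action.get(name)
        if isinstance(value, str) and value not in allowed:
            errors.append(f"{name}: {value!r} is not a documented value")
    return errors


# The value removed by this commit is now reported as a problem.
stale = validate_action({"bug_type": "security-vulnerability"})
```

Adding `enum: [off-by-one, logic-error, insecure-deserialization, none]` to `bug_type`, mirroring `severity`, would let the schema catch this without client-side checks.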