Nitish committed
Commit 4a2e8a2 · Parent(s): 42c4990
docs: final README and openenv.yaml sync with new tasks and baselines
Files changed:
- README.md +8 -8
- openenv.yaml +1 -1
README.md
CHANGED
@@ -36,8 +36,8 @@ Built by **Inmodel Labs** for the Meta PyTorch OpenEnv Hackathon.
 | ID | Language | Bug Class | Difficulty |
 |---|---|---|---|
 | `python-off-by-one` | Python | Off-by-one index error | Easy |
-| `js-
-| `python-pickle-deserialization` | Python | Insecure
+| `js-idor-auth` | JavaScript | Insecure Direct Object Reference (IDOR) | Medium |
+| `python-pickle-deserialization` | Python | Insecure Deserialization (RCE) | Hard |
 
 ---
 
@@ -79,7 +79,7 @@ The agent analyses the code and submits a structured JSON finding:
 |---|---|---|
 | `bug_identified` | bool | `true` / `false` |
 | `bug_location` | string | location description |
-| `bug_type` | string | `off-by-one` \| `logic-error` \| `
+| `bug_type` | string | `off-by-one` \| `logic-error` \| `insecure-deserialization` \| `none` |
 | `bug_description` | string | detailed vulnerability explanation |
 | `severity` | string | `none` \| `low` \| `medium` \| `high` \| `critical` |
 | `suggested_fix` | string | how to fix the bug |
@@ -92,9 +92,9 @@ The agent analyses the code and submits a structured JSON finding:
 "language": "Python",
 "difficulty": "hard",
 "code_snippet": "<FILE CONTENTS HIDDEN - Submit {\"request_file\": true} to view>",
-"context": "
-"pr_title": "Add
-"file_path": "worker/
+"context": "Redis-backed caching decorator for worker tasks that serializes results...",
+"pr_title": "Add distributed task caching layer for worker pool",
+"file_path": "worker/cache.py"
 }
 ```
 After `request_file`, `code_snippet` contains the actual source code.
@@ -138,8 +138,8 @@ suggests partial fix (+0.08), correct severity (+0.10) = total `0.20+0.20+0.20+0
 | Task | Difficulty | Model | Score | Steps | Notes |
 |------|-----------|-------|-------|-------|-------|
 | python-off-by-one | easy | Llama-3.3-70B-Instruct | 0.883 | 2 | File request + review |
-| js-
-| python-pickle-deserialization | hard | Llama-3.3-70B-Instruct |
+| js-idor-auth | medium | Llama-3.3-70B-Instruct | 0.500 | 2 | File request + review |
+| python-pickle-deserialization | hard | Llama-3.3-70B-Instruct | 0.512 | 2 | File request + review |
 
 ---
 
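The README hunks above describe a two-step interaction: the observation first arrives with `code_snippet` hidden, the agent submits `{"request_file": true}`, and only then submits a structured finding. A minimal Python sketch of that decision could look like the following. The function name `next_action`, the `HIDDEN_PREFIX` constant, and the placeholder finding values are hypothetical illustrations, not part of the repository; only the field names and the hidden-file marker come from the diff.

```python
# Marker copied from the example observation in the README diff.
HIDDEN_PREFIX = "<FILE CONTENTS HIDDEN"


def next_action(observation: dict) -> dict:
    """Pick the next action for an observation (hypothetical helper).

    Phase 1: while the file contents are hidden, request the file.
    Phase 2: once code_snippet holds source code, return a finding.
    """
    snippet = observation.get("code_snippet", "")
    if snippet.startswith(HIDDEN_PREFIX):
        return {"request_file": True}

    # Placeholder finding: a real agent would analyse `snippet` here.
    # The keys match the finding fields listed in the README table.
    return {
        "bug_identified": True,
        "bug_location": "location description",
        "bug_type": "insecure-deserialization",
        "bug_description": "detailed vulnerability explanation",
        "severity": "critical",
        "suggested_fix": "how to fix the bug",
    }
```

This matches the "File request + review" pattern noted in the baseline table, where each task takes 2 steps.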
openenv.yaml
CHANGED
@@ -45,7 +45,7 @@ action_space:
 request_file: { type: boolean, description: "Phase 1: Request the hidden file contents" }
 bug_identified: { type: boolean, description: "Boolean: true if a bug exists" }
 bug_location: { type: string, description: "String: Pinpoint the bug's location in code" }
-bug_type: { type: string, description: "String: off-by-one | logic-error |
+bug_type: { type: string, description: "String: off-by-one | logic-error | insecure-deserialization | none" }
 bug_description: { type: string, description: "String: Detailed analysis of the vulnerability" }
 severity: { type: string, enum: [none, low, medium, high, critical], description: "String: none | low | medium | high | critical" }
 suggested_fix: { type: string, description: "String: How to fix the identified bug" }
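As a rough illustration of how the `action_space` fields above might be checked before an action is submitted, here is a small Python sketch. It assumes the field types shown in the hunk; `validate_action` and the constant names are hypothetical. Note that `openenv.yaml` declares an `enum` only for `severity`, so treating the `bug_type` values from its description as a closed set is an assumption of this sketch.

```python
# Field names and types as listed in the openenv.yaml hunk.
FIELD_TYPES = {
    "request_file": bool,
    "bug_identified": bool,
    "bug_location": str,
    "bug_type": str,
    "bug_description": str,
    "severity": str,
    "suggested_fix": str,
}

# severity has an explicit enum in openenv.yaml.
ALLOWED_SEVERITIES = {"none", "low", "medium", "high", "critical"}
# Assumption: bug_type values are taken from the description string only.
ALLOWED_BUG_TYPES = {"off-by-one", "logic-error", "insecure-deserialization", "none"}


def validate_action(action: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name, value in action.items():
        expected = FIELD_TYPES.get(name)
        if expected is None:
            errors.append(f"unknown field: {name}")
        elif not isinstance(value, expected):
            errors.append(f"{name}: expected {expected.__name__}")

    severity = action.get("severity")
    if isinstance(severity, str) and severity not in ALLOWED_SEVERITIES:
        errors.append(f"severity: {severity!r} is not an allowed value")

    bug_type = action.get("bug_type")
    if isinstance(bug_type, str) and bug_type not in ALLOWED_BUG_TYPES:
        errors.append(f"bug_type: {bug_type!r} is not an allowed value")

    return errors
```

For example, `validate_action({"request_file": True})` returns an empty list, while an action with `"severity": "urgent"` yields one error.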