havinashpatil committed on
Commit ·
a448db8
1
Parent(s): 03defc2
Complete all tasks: Adaptive curriculum, GRPO, React frontend, LLM-as-a-judge
- README.md +56 -62
- create_tasks.py +92 -0
- frontend/package-lock.json +413 -28
- frontend/package.json +2 -1
- frontend/src/CodeArenaRL.jsx +77 -33
- inference.py +48 -17
- openenv.yaml +20 -1
- plot_rewards.py +53 -0
- server/app.py +30 -3
- server/env.py +0 -116
- server/grader.py +65 -8
- tasks/__init__.py +7 -1
- tasks/security_bugs/security_bug_1.json +8 -0
- tasks/security_bugs/security_bug_1.py +21 -0
- tasks/security_bugs/security_bug_2.json +8 -0
- tasks/security_bugs/security_bug_2.py +24 -0
- tasks/security_bugs/security_bug_3.json +8 -0
- tasks/security_bugs/security_bug_3.py +20 -0
- tasks/type_errors/type_error_1.json +8 -0
- tasks/type_errors/type_error_1.py +23 -0
- tasks/type_errors/type_error_2.json +8 -0
- tasks/type_errors/type_error_2.py +24 -0
- tasks/type_errors/type_error_3.json +8 -0
- tasks/type_errors/type_error_3.py +19 -0
- train_grpo.ipynb +138 -0
README.md
CHANGED
|
@@ -1,86 +1,80 @@
|
|
| 1 |
-
|
| 2 |
-
title: CodeArena RL Agent
|
| 3 |
-
emoji: 🤖
|
| 4 |
-
colorFrom: blue
|
| 5 |
-
colorTo: purple
|
| 6 |
-
sdk: docker
|
| 7 |
-
pinned: false
|
| 8 |
-
---
|
| 9 |
|
| 10 |
-
|
| 11 |
|
| 12 |
-
|
| 13 |
|
| 14 |
-
|
| 15 |
|
| 16 |
-
|
| 17 |
-
1. **Easy**: Correcting syntax errors.
|
| 18 |
-
2. **Medium**: Fixing logical bugs.
|
| 19 |
-
3. **Hard**: Algorithm and efficiency optimization.
|
| 20 |
|
| 21 |
-
|
| 22 |
|
| 23 |
-
##
|
| 24 |
-
`buggy_code` (string): The current state of the source code.
|
| 25 |
-
`error_log` (string): Standard error output or runtime exceptions from previous attempts.
|
| 26 |
-
`test_results` (string): Count of passed vs total unit tests.
|
| 27 |
-
`previous_attempts` (list of strings): Complete history of fixes proposed during the episode.
|
| 28 |
|
| 29 |
-
|
| 30 |
-
`
|
| 31 |
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
|
| 38 |
-
##
|
| 39 |
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
| GET | `/` | Health check |
|
| 46 |
-
|
| 47 |
-
## Setup Instructions
|
| 48 |
|
| 49 |
-
###
|
| 50 |
```bash
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
pip install -r requirements.txt
|
| 54 |
-
uvicorn server.app:app --reload --port 7860
|
| 55 |
```
|
| 56 |
|
| 57 |
-
|
| 58 |
```bash
|
| 59 |
-
|
| 60 |
-
|
|
|
|
| 61 |
```
|
| 62 |
|
| 63 |
-
|
| 64 |
```bash
|
| 65 |
-
|
| 66 |
-
|
| 67 |
-
-d '{"task_id": "easy"}'
|
| 68 |
```
|
| 69 |
|
| 70 |
-
##
|
| 71 |
|
| 72 |
-
|
|
|
|
| 73 |
```bash
|
| 74 |
-
|
| 75 |
-
python inference.py
|
| 76 |
```
|
|
|
|
| 77 |
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
[STEP] Beginning Step 1
|
| 82 |
-
[STEP] Action taken. Reward received: 0.700. Task ID: easy-1
|
| 83 |
-
[STEP] Beginning Step 2
|
| 84 |
-
[STEP] Action taken. Reward received: 1.000. Task ID: easy-1
|
| 85 |
-
[END] Inference Complete. Executed 2 step(s).
|
| 86 |
-
```
|
|
|
|
| 1 |
+
# CodeArena RL Benchmark
|
| 2 |
|
| 3 |
+
CodeArena is an OpenEnv-compatible reinforcement learning benchmark for autonomous code repair. In this environment, an agent receives buggy Python code, proposes fixes, and is iteratively evaluated based on test execution feedback and LLM-based quality metrics.
|
| 4 |
|
| 5 |
+
## Features
|
| 6 |
|
| 7 |
+
- **Adaptive Curriculum**: The environment supports an `auto` difficulty mode that dynamically scales task complexity (`easy`, `medium`, `hard`) based on the agent's recent rolling average rewards.
|
| 8 |
+
- **Complex Shaped Rewards**: Rewards are a weighted composite of:
|
| 9 |
+
- `compile_score` (0.2)
|
| 10 |
+
- `test_pass_ratio` (0.4)
|
| 11 |
+
- `efficiency_score` (0.1)
|
| 12 |
+
- `llm_judge_score` (0.3): Correctness, Security, and Code Quality evaluated via LLM-as-a-judge.
|
| 13 |
+
- **Novelty & Step Penalties**: The agent receives penalties for repeating identical failed fixes or taking too many steps.
|
| 14 |
+
- **Extensive Task Categories**: Includes standard algorithmic tasks, `type_errors`, and `security_bugs`.
|
| 15 |
+
- **Live React Frontend**: Connect a local LLM (like Ollama) or HuggingFace models to interactively visualize step-by-step progress, execution outputs, and live reward components.
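The reward weights and the `auto` difficulty mode listed above can be sketched as follows. This is a minimal illustration, not the code in `server/grader.py`: the function names `composite_reward` and `next_difficulty` and the promotion and demotion thresholds are assumptions; only the four weights come from the README.

```python
# Weights as listed in the README feature list.
WEIGHTS = {
    "compile_score": 0.2,
    "test_pass_ratio": 0.4,
    "efficiency_score": 0.1,
    "llm_judge_score": 0.3,
}

LEVELS = ["easy", "medium", "hard"]


def composite_reward(components):
    """Weighted sum of component scores, each expected to lie in [0, 1]."""
    return sum(WEIGHTS[name] * components.get(name, 0.0) for name in WEIGHTS)


def next_difficulty(current, recent_rewards, promote_at=0.8, demote_at=0.3):
    """Step one level up or down from the rolling mean reward.

    The thresholds are hypothetical; the README only says that difficulty
    scales with the agent's recent rolling average reward.
    """
    if not recent_rewards:
        return current
    mean = sum(recent_rewards) / len(recent_rewards)
    index = LEVELS.index(current)
    if mean >= promote_at:
        index = min(index + 1, len(LEVELS) - 1)
    elif mean <= demote_at:
        index = max(index - 1, 0)
    return LEVELS[index]
```

Because the weights sum to 1.0, a fix that scores 1.0 on every component yields a reward of 1.0 before any novelty or step penalty is applied.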
|
| 16 |
|
| 17 |
+
## Architecture
|
| 18 |
|
| 19 |
+
- `server/`: FastAPI backend acting as the OpenEnv entrypoint. Handles state, execution sandbox (`executor.py`), and reward grading (`grader.py`).
|
| 20 |
+
- `frontend/`: React + Vite frontend for live monitoring and manual intervention.
|
| 21 |
+
- `tasks/`: Task definitions stored in OpenEnv-compatible JSON schema.
|
| 22 |
+
- `inference.py`: CLI runner for evaluating RL agents, supporting both OpenAI-compatible APIs and native HuggingFace `transformers` pipelines.
|
| 23 |
|
| 24 |
+
## Setup
|
| 25 |
|
| 26 |
+
1. **Install Dependencies:**
|
| 27 |
+
```bash
|
| 28 |
+
pip install -r requirements.txt
|
| 29 |
+
cd frontend && npm install
|
| 30 |
+
```
|
| 31 |
|
| 32 |
+
2. **Generate New Tasks:**
|
| 33 |
+
To populate the extended task categories (`type_errors` and `security_bugs`), run:
|
| 34 |
+
```bash
|
| 35 |
+
python create_tasks.py
|
| 36 |
+
```
|
| 37 |
|
| 38 |
+
## Usage
|
| 39 |
|
| 40 |
+
### 1. Run the Backend Server
|
| 41 |
+
The server is required for both the frontend dashboard and RL training.
|
| 42 |
+
```bash
|
| 43 |
+
uvicorn server.app:app --port 7860
|
| 44 |
+
```
|
| 45 |
|
| 46 |
+
### 2. Run the Frontend Dashboard
|
| 47 |
```bash
|
| 48 |
+
cd frontend
|
| 49 |
+
npm run dev
|
|
|
|
|
|
|
| 50 |
```
|
| 51 |
+
Navigate to `http://localhost:3000` to access the live RL monitoring dashboard.
|
| 52 |
+
|
| 53 |
+
### 3. Run Inference Evaluation
|
| 54 |
+
You can evaluate a local agent or pipeline from the command line with `inference.py`.
|
| 55 |
|
| 56 |
+
**Using OpenAI-Compatible Endpoints (e.g., Ollama or vLLM):**
|
| 57 |
```bash
|
| 58 |
+
export API_BASE_URL="http://localhost:11434/v1"
|
| 59 |
+
export MODEL_NAME="codellama"
|
| 60 |
+
python inference.py --backend openai
|
| 61 |
```
|
| 62 |
|
| 63 |
+
**Using HuggingFace Transformers (Local pipeline):**
|
| 64 |
```bash
|
| 65 |
+
export MODEL_NAME="Qwen/Qwen2.5-Coder-1.5B"
|
| 66 |
+
python inference.py --backend hf
|
|
|
|
| 67 |
```
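The two invocations above differ only in the `--backend` flag and the environment variables they set. A minimal sketch of how a runner might read that configuration is shown here; `load_settings`, its return shape, and the `openai` default are assumptions for illustration and are not taken from `inference.py`.

```python
import argparse
import os


def load_settings(argv=None, env=None):
    """Collect the backend choice and model settings for one evaluation run."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Evaluate an agent (sketch).")
    # The two backend names come from the README; the default is assumed.
    parser.add_argument("--backend", choices=["openai", "hf"], default="openai")
    args = parser.parse_args(argv)
    return {
        "backend": args.backend,
        "model_name": env.get("MODEL_NAME"),
        # Only meaningful for the OpenAI-compatible backend.
        "api_base_url": env.get("API_BASE_URL"),
    }
```

Passing `argv` and `env` explicitly keeps the function easy to test without touching the real process environment.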
|
| 68 |
|
| 69 |
+
## Reward Analysis
|
| 70 |
|
| 71 |
+
As your agent interacts with the environment, inference logs are automatically written to `rewards_log.csv`.
|
| 72 |
+
To visualize the reward curves over training steps and average rewards by task category, run:
|
| 73 |
```bash
|
| 74 |
+
python plot_rewards.py
|
|
|
|
| 75 |
```
|
| 76 |
+
This generates `reward_curve.png` and `reward_by_task.png` in the `results/` directory.
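The per-task averages behind `reward_by_task.png` can also be computed without plotting. The sketch below assumes the log has `task_id` and `reward` columns; the actual header of `rewards_log.csv` is not shown in this commit, so adjust the column names to match.

```python
import csv
import io
from collections import defaultdict


def mean_reward_by_task(csv_file):
    """Return {task_id: mean reward} from an open CSV file object."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Column names are assumptions about the log layout.
        totals[row["task_id"]] += float(row["reward"])
        counts[row["task_id"]] += 1
    return {task: totals[task] / counts[task] for task in totals}


# Example with an in-memory log instead of a file on disk.
sample = io.StringIO("step,task_id,reward\n1,easy-1,0.7\n2,easy-1,1.0\n3,hard-1,0.5\n")
averages = mean_reward_by_task(sample)
```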
|
| 77 |
|
| 78 |
+
## OpenEnv Compatibility
|
| 79 |
+
|
| 80 |
+
This benchmark strictly adheres to the OpenEnv specification. See `openenv.yaml` for full configuration details.
|
create_tasks.py
ADDED
|
@@ -0,0 +1,92 @@
|
| 1 |
+
import os
|
| 2 |
+
import json
|
| 3 |
+
|
| 4 |
+
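base_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "tasks")  # resolve next to this script instead of a machine-specific drive path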
|
| 5 |
+
os.makedirs(os.path.join(base_dir, "type_errors"), exist_ok=True)
|
| 6 |
+
os.makedirs(os.path.join(base_dir, "security_bugs"), exist_ok=True)
|
| 7 |
+
|
| 8 |
+
def write_task(folder, name, task_id, difficulty, desc, buggy, test):
|
| 9 |
+
py_path = os.path.join(base_dir, folder, f"{name}.py")
|
| 10 |
+
json_path = os.path.join(base_dir, folder, f"{name}.json")
|
| 11 |
+
|
| 12 |
+
py_content = f'''from server.models import TaskInfo
|
| 13 |
+
|
| 14 |
+
TASK = TaskInfo(
|
| 15 |
+
task_id="{task_id}",
|
| 16 |
+
difficulty="{difficulty}",
|
| 17 |
+
description="{desc}",
|
| 18 |
+
buggy_code="""{buggy}""",
|
| 19 |
+
test_code="""{test}""",
|
| 20 |
+
optimal_time_seconds=0.05
|
| 21 |
+
)
|
| 22 |
+
'''
|
| 23 |
+
with open(py_path, "w", encoding="utf-8") as f:
|
| 24 |
+
f.write(py_content)
|
| 25 |
+
|
| 26 |
+
json_content = {
|
| 27 |
+
"task_id": task_id,
|
| 28 |
+
"difficulty": difficulty,
|
| 29 |
+
"description": desc,
|
| 30 |
+
"buggy_code": buggy,
|
| 31 |
+
"test_code": test,
|
| 32 |
+
"optimal_time_seconds": 0.05
|
| 33 |
+
}
|
| 34 |
+
with open(json_path, "w", encoding="utf-8") as f:
|
| 35 |
+
json.dump(json_content, f, indent=2)
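One caveat with the template in `write_task`: it interpolates each field straight into a quoted literal, so a task whose code contains `"""`, or a description containing a double quote, would produce an invalid module. A minimal sketch of an alternative is to quote every field with `repr()`. The helper name `render_task_source` is hypothetical and not part of this commit.

```python
def render_task_source(task_id, difficulty, desc, buggy, test, optimal_time=0.05):
    """Return Python source for a task module, quoting every field with repr()."""
    fields = {
        "task_id": task_id,
        "difficulty": difficulty,
        "description": desc,
        "buggy_code": buggy,
        "test_code": test,
        "optimal_time_seconds": optimal_time,
    }
    # repr() yields a valid Python literal for str and float values,
    # so embedded quotes and backslashes are escaped automatically.
    args = "".join(f"    {name}={value!r},\n" for name, value in fields.items())
    return f"from server.models import TaskInfo\n\nTASK = TaskInfo(\n{args})\n"
```

The generated module still imports `TaskInfo` from `server.models`, as the original template does; only the quoting strategy changes.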
|
| 36 |
+
|
| 37 |
+
|
| 38 |
+
# Type Error 1
|
| 39 |
+
write_task("type_errors", "type_error_1", "type_errors-1", "type_errors",
|
| 40 |
+
"Fix the function to sum a list of numbers that might be passed as strings. It currently tries to add int and str.",
|
| 41 |
+
"def sum_all(items):\n total = 0\n for item in items:\n total = total + item\n return total",
|
| 42 |
+
"\nimport unittest\nclass TestTypeError1(unittest.TestCase):\n def test_normal(self):\n self.assertEqual(sum_all([1, 2, 3]), 6)\n def test_strings(self):\n self.assertEqual(sum_all(['1', '2', '3']), 6)\n def test_mixed(self):\n self.assertEqual(sum_all([1, '2', 3]), 6)\n")
|
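For reference, one fix that would satisfy the three `sum_all` tests above is to coerce each item before adding. This is a sketch of a passing answer, not a canonical solution shipped with the task; it assumes every item is an integer or an integer-valued string, as in the tests.

```python
def sum_all(items):
    """Sum numbers that may arrive as ints or as numeric strings."""
    total = 0
    for item in items:
        # int() accepts both 2 and "2"; a non-numeric string raises ValueError.
        total += int(item)
    return total
```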
| 43 |
+
|
| 44 |
+
# Type Error 2
|
| 45 |
+
write_task("type_errors", "type_error_2", "type_errors-2", "type_errors",
|
| 46 |
+
"Fix the function to count frequencies. It incorrectly calls .append() on a dict.",
|
| 47 |
+
"def count_frequencies(words):\n counts = {}\n for word in words:\n if word not in counts:\n counts.append({word: 1})\n else:\n counts[word] += 1\n return counts",
|
| 48 |
+
"\nimport unittest\nclass TestTypeError2(unittest.TestCase):\n def test_normal(self):\n self.assertEqual(count_frequencies(['apple', 'banana', 'apple']), {'apple': 2, 'banana': 1})\n def test_empty(self):\n self.assertEqual(count_frequencies([]), {})\n")
|
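A passing fix for this task replaces the invalid `.append()` call with a dictionary assignment. The version below is one possible answer, shown only to illustrate what the tests accept.

```python
def count_frequencies(words):
    """Count how often each word occurs."""
    counts = {}
    for word in words:
        # dict.get supplies 0 for unseen words, so no membership check is needed.
        counts[word] = counts.get(word, 0) + 1
    return counts
```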
| 49 |
+
|
| 50 |
+
# Type Error 3
|
| 51 |
+
write_task("type_errors", "type_error_3", "type_errors-3", "type_errors",
|
| 52 |
+
"Fix the function to format names. It incorrectly calls .upper() on an int ID.",
|
| 53 |
+
"def format_records(records):\n formatted = []\n for user_id, name in records:\n formatted.append(f\"{user_id.upper()} - {name.upper()}\")\n return formatted",
|
| 54 |
+
"\nimport unittest\nclass TestTypeError3(unittest.TestCase):\n def test_normal(self):\n self.assertEqual(format_records([(1, 'alice'), (2, 'bob')]), ['1 - ALICE', '2 - BOB'])\n")
|
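Here the bug is that `user_id` is an `int`, which has no `.upper()` method. One way to pass the test, sketched below, is to convert the ID to a string first, which also keeps the upper-casing behaviour if string IDs are ever passed.

```python
def format_records(records):
    """Format (user_id, name) pairs as 'ID - NAME'."""
    return [f"{str(user_id).upper()} - {name.upper()}" for user_id, name in records]
```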
| 55 |
+
|
| 56 |
+
|
| 57 |
+
# Security Bug 1
|
| 58 |
+
write_task("security_bugs", "security_bug_1", "security_bugs-1", "security_bugs",
|
| 59 |
+
"Fix the function to parse JSON safely without using eval().",
|
| 60 |
+
"import json\ndef parse_user_data(data_string):\n return eval(data_string)",
|
| 61 |
+
"\nimport unittest\nimport inspect\nclass TestSecurity1(unittest.TestCase):\n def test_normal(self):\n self.assertEqual(parse_user_data('{\"name\": \"alice\"}'), {\"name\": \"alice\"})\n def test_security(self):\n source = inspect.getsource(parse_user_data)\n self.assertNotIn(\"eval(\", source)\n")
|
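The intended repair for this task is to parse the input with the standard `json` module, which never executes the text it reads. A minimal passing answer looks like this:

```python
import json


def parse_user_data(data_string):
    """Parse a JSON document; malformed input raises json.JSONDecodeError."""
    return json.loads(data_string)
```

Note that the security test relies on `inspect.getsource`, which raises `OSError` when the function's source cannot be retrieved, for example when the candidate code is executed from a string rather than a file. Whether that applies depends on how the executor loads submissions, which this diff does not show.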
| 62 |
+
|
| 63 |
+
# Security Bug 2
|
| 64 |
+
write_task("security_bugs", "security_bug_2", "security_bugs-2", "security_bugs",
|
| 65 |
+
"Remove the hardcoded secret token and load it from the os.environ dictionary as 'API_TOKEN'.",
|
| 66 |
+
"import os\ndef get_api_token():\n token = \"secret_12345\"\n return token",
|
| 67 |
+
"\nimport unittest\nimport inspect\nimport os\nclass TestSecurity2(unittest.TestCase):\n def test_normal(self):\n os.environ['API_TOKEN'] = 'my_secure_token'\n self.assertEqual(get_api_token(), 'my_secure_token')\n def test_security(self):\n source = inspect.getsource(get_api_token)\n self.assertNotIn(\"secret_12345\", source)\n")
|
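A sketch of a fix that satisfies both tests reads the token from the environment at call time. Raising `RuntimeError` when the variable is missing is a choice made for this example; the task description does not say how a missing token should be handled.

```python
import os


def get_api_token():
    """Return the API token from the API_TOKEN environment variable."""
    try:
        return os.environ["API_TOKEN"]
    except KeyError:
        # Failing loudly avoids silently falling back to a hardcoded secret.
        raise RuntimeError("API_TOKEN is not set") from None
```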
| 68 |
+
|
| 69 |
+
# Security Bug 3
|
| 70 |
+
write_task("security_bugs", "security_bug_3", "security_bugs-3", "security_bugs",
|
| 71 |
+
"Fix the ping command to avoid shell injection. Use a list of arguments and shell=False.",
|
| 72 |
+
"import subprocess\ndef ping_host(host):\n return subprocess.check_output(f\"ping -c 1 {host}\", shell=True)",
|
| 73 |
+
"\nimport unittest\nimport inspect\nclass TestSecurity3(unittest.TestCase):\n def test_security(self):\n source = inspect.getsource(ping_host)\n self.assertNotIn(\"shell=True\", source.replace(\" \", \"\"))\n self.assertIn(\"[\", source)\n")
|
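A possible fix passes the command as an argument list, so no shell ever interprets `host`. The `runner` parameter and the leading-dash check are additions made for this sketch (they make the function testable without network access and block option injection); neither is required by the task.

```python
import subprocess


def ping_host(host, runner=subprocess.check_output):
    """Ping a host once without invoking a shell."""
    if not host or host.startswith("-"):
        # A leading dash would be parsed by ping as an option, not a hostname.
        raise ValueError(f"invalid host: {host!r}")
    # With a list argv and the default shell=False, metacharacters in
    # `host` are passed to ping verbatim instead of being executed.
    return runner(["ping", "-c", "1", host])
```

The `-c 1` flag matches the original command and is specific to Unix-like `ping`; Windows uses `-n` instead.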
| 74 |
+
|
| 75 |
+
# Rewrite __init__.py
|
| 76 |
+
init_content = """from .easy import EASY_TASK
|
| 77 |
+
from .medium import MEDIUM_TASK
|
| 78 |
+
from .hard import HARD_TASK
|
| 79 |
+
from .type_errors.type_error_1 import TASK as TE1
|
| 80 |
+
from .type_errors.type_error_2 import TASK as TE2
|
| 81 |
+
from .type_errors.type_error_3 import TASK as TE3
|
| 82 |
+
from .security_bugs.security_bug_1 import TASK as SB1
|
| 83 |
+
from .security_bugs.security_bug_2 import TASK as SB2
|
| 84 |
+
from .security_bugs.security_bug_3 import TASK as SB3
|
| 85 |
+
|
| 86 |
+
ALL_TASKS = [EASY_TASK, MEDIUM_TASK, HARD_TASK, TE1, TE2, TE3, SB1, SB2, SB3]
|
| 87 |
+
"""
|
| 88 |
+
|
| 89 |
+
with open(os.path.join(base_dir, "__init__.py"), "w", encoding="utf-8") as f:
|
| 90 |
+
f.write(init_content)
|
| 91 |
+
|
| 92 |
+
print("Tasks generated successfully!")
|
frontend/package-lock.json
CHANGED
|
@@ -9,7 +9,8 @@
|
|
| 9 |
"version": "0.0.0",
|
| 10 |
"dependencies": {
|
| 11 |
"react": "^19.2.5",
|
| 12 |
-
"react-dom": "^19.2.5"
|
|
|
|
| 13 |
},
|
| 14 |
"devDependencies": {
|
| 15 |
"@eslint/js": "^9.39.4",
|
|
@@ -264,31 +265,6 @@
|
|
| 264 |
"node": ">=6.9.0"
|
| 265 |
}
|
| 266 |
},
|
| 267 |
-
"node_modules/@emnapi/core": {
|
| 268 |
-
"version": "1.9.2",
|
| 269 |
-
"resolved": "https://registry.npmjs.org/@emnapi/core/-/core-1.9.2.tgz",
|
| 270 |
-
"integrity": "sha512-UC+ZhH3XtczQYfOlu3lNEkdW/p4dsJ1r/bP7H8+rhao3TTTMO1ATq/4DdIi23XuGoFY+Cz0JmCbdVl0hz9jZcA==",
|
| 271 |
-
"dev": true,
|
| 272 |
-
"license": "MIT",
|
| 273 |
-
"optional": true,
|
| 274 |
-
"peer": true,
|
| 275 |
-
"dependencies": {
|
| 276 |
-
"@emnapi/wasi-threads": "1.2.1",
|
| 277 |
-
"tslib": "^2.4.0"
|
| 278 |
-
}
|
| 279 |
-
},
|
| 280 |
-
"node_modules/@emnapi/runtime": {
|
| 281 |
-
"version": "1.9.2",
|
| 282 |
-
"resolved": "https://registry.npmjs.org/@emnapi/runtime/-/runtime-1.9.2.tgz",
|
| 283 |
-
"integrity": "sha512-3U4+MIWHImeyu1wnmVygh5WlgfYDtyf0k8AbLhMFxOipihf6nrWC4syIm/SwEeec0mNSafiiNnMJwbza/Is6Lw==",
|
| 284 |
-
"dev": true,
|
| 285 |
-
"license": "MIT",
|
| 286 |
-
"optional": true,
|
| 287 |
-
"peer": true,
|
| 288 |
-
"dependencies": {
|
| 289 |
-
"tslib": "^2.4.0"
|
| 290 |
-
}
|
| 291 |
-
},
|
| 292 |
"node_modules/@emnapi/wasi-threads": {
|
| 293 |
"version": "1.2.1",
|
| 294 |
"resolved": "https://registry.npmjs.org/@emnapi/wasi-threads/-/wasi-threads-1.2.1.tgz",
|
|
@@ -602,6 +578,42 @@
|
|
| 602 |
"url": "https://github.com/sponsors/Boshen"
|
| 603 |
}
|
| 604 |
},
|
| 605 |
"node_modules/@rolldown/binding-android-arm64": {
|
| 606 |
"version": "1.0.0-rc.16",
|
| 607 |
"resolved": "https://registry.npmjs.org/@rolldown/binding-android-arm64/-/binding-android-arm64-1.0.0-rc.16.tgz",
|
|
@@ -866,6 +878,18 @@
|
|
| 866 |
"dev": true,
|
| 867 |
"license": "MIT"
|
| 868 |
},
|
| 869 |
"node_modules/@tybys/wasm-util": {
|
| 870 |
"version": "0.10.1",
|
| 871 |
"resolved": "https://registry.npmjs.org/@tybys/wasm-util/-/wasm-util-0.10.1.tgz",
|
|
@@ -877,6 +901,69 @@
|
|
| 877 |
"tslib": "^2.4.0"
|
| 878 |
}
|
| 879 |
},
|
| 880 |
"node_modules/@types/estree": {
|
| 881 |
"version": "1.0.8",
|
| 882 |
"resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.8.tgz",
|
|
@@ -895,7 +982,7 @@
|
|
| 895 |
"version": "19.2.14",
|
| 896 |
"resolved": "https://registry.npmjs.org/@types/react/-/react-19.2.14.tgz",
|
| 897 |
"integrity": "sha512-ilcTH/UniCkMdtexkoCN0bI7pMcJDvmQFPvuPvmEaYA/NSfFTAgdUSLAoVjaRJm7+6PvcM+q1zYOwS4wTYMF9w==",
|
| 898 |
-
"
|
| 899 |
"license": "MIT",
|
| 900 |
"peer": true,
|
| 901 |
"dependencies": {
|
|
@@ -912,6 +999,12 @@
|
|
| 912 |
"@types/react": "^19.2.0"
|
| 913 |
}
|
| 914 |
},
|
| 915 |
"node_modules/@vitejs/plugin-react": {
|
| 916 |
"version": "6.0.1",
|
| 917 |
"resolved": "https://registry.npmjs.org/@vitejs/plugin-react/-/plugin-react-6.0.1.tgz",
|
|
@@ -1116,6 +1209,15 @@
|
|
| 1116 |
"url": "https://github.com/chalk/chalk?sponsor=1"
|
| 1117 |
}
|
| 1118 |
},
|
| 1119 |
"node_modules/color-convert": {
|
| 1120 |
"version": "2.0.1",
|
| 1121 |
"resolved": "https://registry.npmjs.org/color-convert/-/color-convert-2.0.1.tgz",
|
|
@@ -1169,9 +1271,130 @@
|
|
| 1169 |
"version": "3.2.3",
|
| 1170 |
"resolved": "https://registry.npmjs.org/csstype/-/csstype-3.2.3.tgz",
|
| 1171 |
"integrity": "sha512-z1HGKcYy2xA8AGQfwrn0PAy+PB7X/GSj3UVJW9qKyn43xWa+gl5nXmU4qqLMRzWVLFC8KusUX8T/0kCiOYpAIQ==",
|
| 1172 |
-
"
|
| 1173 |
"license": "MIT"
|
| 1174 |
},
|
| 1175 |
"node_modules/debug": {
|
| 1176 |
"version": "4.4.3",
|
| 1177 |
"resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz",
|
|
@@ -1190,6 +1413,12 @@
|
|
| 1190 |
}
|
| 1191 |
}
|
| 1192 |
},
|
| 1193 |
"node_modules/deep-is": {
|
| 1194 |
"version": "0.1.4",
|
| 1195 |
"resolved": "https://registry.npmjs.org/deep-is/-/deep-is-0.1.4.tgz",
|
|
@@ -1214,6 +1443,16 @@
|
|
| 1214 |
"dev": true,
|
| 1215 |
"license": "ISC"
|
| 1216 |
},
|
| 1217 |
"node_modules/escalade": {
|
| 1218 |
"version": "3.2.0",
|
| 1219 |
"resolved": "https://registry.npmjs.org/escalade/-/escalade-3.2.0.tgz",
|
|
@@ -1422,6 +1661,12 @@
|
|
| 1422 |
"node": ">=0.10.0"
|
| 1423 |
}
|
| 1424 |
},
|
| 1425 |
"node_modules/fast-deep-equal": {
|
| 1426 |
"version": "3.1.3",
|
| 1427 |
"resolved": "https://registry.npmjs.org/fast-deep-equal/-/fast-deep-equal-3.1.3.tgz",
|
|
@@ -1600,6 +1845,16 @@
|
|
| 1600 |
"node": ">= 4"
|
| 1601 |
}
|
| 1602 |
},
|
| 1603 |
"node_modules/import-fresh": {
|
| 1604 |
"version": "3.3.1",
|
| 1605 |
"resolved": "https://registry.npmjs.org/import-fresh/-/import-fresh-3.3.1.tgz",
|
|
@@ -1627,6 +1882,15 @@
|
|
| 1627 |
"node": ">=0.8.19"
|
| 1628 |
}
|
| 1629 |
},
|
| 1630 |
"node_modules/is-extglob": {
|
| 1631 |
"version": "2.1.1",
|
| 1632 |
"resolved": "https://registry.npmjs.org/is-extglob/-/is-extglob-2.1.1.tgz",
|
|
@@ -2263,6 +2527,7 @@
|
|
| 2263 |
"resolved": "https://registry.npmjs.org/react-dom/-/react-dom-19.2.5.tgz",
|
| 2264 |
"integrity": "sha512-J5bAZz+DXMMwW/wV3xzKke59Af6CHY7G4uYLN1OvBcKEsWOs4pQExj86BBKamxl/Ik5bx9whOrvBlSDfWzgSag==",
|
| 2265 |
"license": "MIT",
|
|
|
|
| 2266 |
"dependencies": {
|
| 2267 |
"scheduler": "^0.27.0"
|
| 2268 |
},
|
|
@@ -2270,6 +2535,89 @@
|
|
| 2270 |
"react": "^19.2.5"
|
| 2271 |
}
|
| 2272 |
},
|
| 2273 |
"node_modules/resolve-from": {
|
| 2274 |
"version": "4.0.0",
|
| 2275 |
"resolved": "https://registry.npmjs.org/resolve-from/-/resolve-from-4.0.0.tgz",
|
|
@@ -2396,6 +2744,12 @@
|
|
| 2396 |
"node": ">=8"
|
| 2397 |
}
|
| 2398 |
},
|
| 2399 |
"node_modules/tinyglobby": {
|
| 2400 |
"version": "0.2.16",
|
| 2401 |
"resolved": "https://registry.npmjs.org/tinyglobby/-/tinyglobby-0.2.16.tgz",
|
|
@@ -2475,6 +2829,37 @@
|
|
| 2475 |
"punycode": "^2.1.0"
|
| 2476 |
}
|
| 2477 |
},
|
| 2478 |
"node_modules/vite": {
|
| 2479 |
"version": "8.0.9",
|
| 2480 |
"resolved": "https://registry.npmjs.org/vite/-/vite-8.0.9.tgz",
|
|
|
|
| 9 |
"version": "0.0.0",
|
| 10 |
"dependencies": {
|
| 11 |
"react": "^19.2.5",
|
| 12 |
+
"react-dom": "^19.2.5",
|
| 13 |
+
"recharts": "^3.8.1"
|
| 14 |
},
|
| 15 |
"devDependencies": {
|
| 16 |
"@eslint/js": "^9.39.4",
|
|
|
|
| 265 |
"node": ">=6.9.0"
|
| 266 |
}
|
| 267 |
},
|
| 268 |
"node_modules/@emnapi/wasi-threads": {
|
| 269 |
"version": "1.2.1",
|
| 270 |
"resolved": "https://registry.npmjs.org/@emnapi/wasi-threads/-/wasi-threads-1.2.1.tgz",
|
|
|
|
| 578 |
"url": "https://github.com/sponsors/Boshen"
|
| 579 |
}
|
| 580 |
},
|
| 581 |
+
"node_modules/@reduxjs/toolkit": {
|
| 582 |
+
"version": "2.11.2",
|
| 583 |
+
"resolved": "https://registry.npmjs.org/@reduxjs/toolkit/-/toolkit-2.11.2.tgz",
|
| 584 |
+
"integrity": "sha512-Kd6kAHTA6/nUpp8mySPqj3en3dm0tdMIgbttnQ1xFMVpufoj+ADi8pXLBsd4xzTRHQa7t/Jv8W5UnCuW4kuWMQ==",
|
| 585 |
+
"license": "MIT",
|
| 586 |
+
"dependencies": {
|
| 587 |
+
"@standard-schema/spec": "^1.0.0",
|
| 588 |
+
"@standard-schema/utils": "^0.3.0",
|
| 589 |
+
"immer": "^11.0.0",
|
| 590 |
+
"redux": "^5.0.1",
|
| 591 |
+
"redux-thunk": "^3.1.0",
|
| 592 |
+
"reselect": "^5.1.0"
|
| 593 |
+
},
|
| 594 |
+
"peerDependencies": {
|
| 595 |
+
"react": "^16.9.0 || ^17.0.0 || ^18 || ^19",
|
| 596 |
+
"react-redux": "^7.2.1 || ^8.1.3 || ^9.0.0"
|
| 597 |
+
},
|
| 598 |
+
"peerDependenciesMeta": {
|
| 599 |
+
"react": {
|
| 600 |
+
"optional": true
|
| 601 |
+
},
|
| 602 |
+
"react-redux": {
|
| 603 |
+
"optional": true
|
| 604 |
+
}
|
| 605 |
+
}
|
| 606 |
+
},
|
| 607 |
+
"node_modules/@reduxjs/toolkit/node_modules/immer": {
|
| 608 |
+
"version": "11.1.4",
|
| 609 |
+
"resolved": "https://registry.npmjs.org/immer/-/immer-11.1.4.tgz",
|
| 610 |
+
"integrity": "sha512-XREFCPo6ksxVzP4E0ekD5aMdf8WMwmdNaz6vuvxgI40UaEiu6q3p8X52aU6GdyvLY3XXX/8R7JOTXStz/nBbRw==",
|
| 611 |
+
"license": "MIT",
|
| 612 |
+
"funding": {
|
| 613 |
+
"type": "opencollective",
|
| 614 |
+
"url": "https://opencollective.com/immer"
|
| 615 |
+
}
|
| 616 |
+
},
|
| 617 |
"node_modules/@rolldown/binding-android-arm64": {
|
| 618 |
"version": "1.0.0-rc.16",
|
| 619 |
"resolved": "https://registry.npmjs.org/@rolldown/binding-android-arm64/-/binding-android-arm64-1.0.0-rc.16.tgz",
|
|
|
|
| 878 |
"dev": true,
|
| 879 |
"license": "MIT"
|
| 880 |
},
|
| 881 |
+
"node_modules/@standard-schema/spec": {
|
| 882 |
+
"version": "1.1.0",
|
| 883 |
+
"resolved": "https://registry.npmjs.org/@standard-schema/spec/-/spec-1.1.0.tgz",
|
| 884 |
+
"integrity": "sha512-l2aFy5jALhniG5HgqrD6jXLi/rUWrKvqN/qJx6yoJsgKhblVd+iqqU4RCXavm/jPityDo5TCvKMnpjKnOriy0w==",
|
| 885 |
+
"license": "MIT"
|
| 886 |
+
},
|
| 887 |
+
"node_modules/@standard-schema/utils": {
|
| 888 |
+
"version": "0.3.0",
|
| 889 |
+
"resolved": "https://registry.npmjs.org/@standard-schema/utils/-/utils-0.3.0.tgz",
|
| 890 |
+
"integrity": "sha512-e7Mew686owMaPJVNNLs55PUvgz371nKgwsc4vxE49zsODpJEnxgxRo2y/OKrqueavXgZNMDVj3DdHFlaSAeU8g==",
|
| 891 |
+
"license": "MIT"
|
| 892 |
+
},
|
| 893 |
"node_modules/@tybys/wasm-util": {
|
| 894 |
"version": "0.10.1",
|
| 895 |
"resolved": "https://registry.npmjs.org/@tybys/wasm-util/-/wasm-util-0.10.1.tgz",
|
|
|
|
| 901 |
"tslib": "^2.4.0"
|
| 902 |
}
|
| 903 |
},
|
| 904 |
+
"node_modules/@types/d3-array": {
|
| 905 |
+
"version": "3.2.2",
|
| 906 |
+
"resolved": "https://registry.npmjs.org/@types/d3-array/-/d3-array-3.2.2.tgz",
|
| 907 |
+
"integrity": "sha512-hOLWVbm7uRza0BYXpIIW5pxfrKe0W+D5lrFiAEYR+pb6w3N2SwSMaJbXdUfSEv+dT4MfHBLtn5js0LAWaO6otw==",
|
| 908 |
+
"license": "MIT"
|
| 909 |
+
},
|
| 910 |
+
"node_modules/@types/d3-color": {
|
| 911 |
+
"version": "3.1.3",
|
| 912 |
+
"resolved": "https://registry.npmjs.org/@types/d3-color/-/d3-color-3.1.3.tgz",
|
| 913 |
+
"integrity": "sha512-iO90scth9WAbmgv7ogoq57O9YpKmFBbmoEoCHDB2xMBY0+/KVrqAaCDyCE16dUspeOvIxFFRI+0sEtqDqy2b4A==",
|
| 914 |
+
"license": "MIT"
|
| 915 |
+
},
|
| 916 |
+
"node_modules/@types/d3-ease": {
|
| 917 |
+
"version": "3.0.2",
|
| 918 |
+
"resolved": "https://registry.npmjs.org/@types/d3-ease/-/d3-ease-3.0.2.tgz",
|
| 919 |
+
"integrity": "sha512-NcV1JjO5oDzoK26oMzbILE6HW7uVXOHLQvHshBUW4UMdZGfiY6v5BeQwh9a9tCzv+CeefZQHJt5SRgK154RtiA==",
|
| 920 |
+
"license": "MIT"
|
| 921 |
+
},
|
| 922 |
+
"node_modules/@types/d3-interpolate": {
|
| 923 |
+
"version": "3.0.4",
|
| 924 |
+
"resolved": "https://registry.npmjs.org/@types/d3-interpolate/-/d3-interpolate-3.0.4.tgz",
|
| 925 |
+
"integrity": "sha512-mgLPETlrpVV1YRJIglr4Ez47g7Yxjl1lj7YKsiMCb27VJH9W8NVM6Bb9d8kkpG/uAQS5AmbA48q2IAolKKo1MA==",
|
| 926 |
+
"license": "MIT",
|
| 927 |
+
"dependencies": {
|
| 928 |
+
"@types/d3-color": "*"
|
| 929 |
+
}
|
| 930 |
+
},
|
| 931 |
+
"node_modules/@types/d3-path": {
|
| 932 |
+
"version": "3.1.1",
|
| 933 |
+
"resolved": "https://registry.npmjs.org/@types/d3-path/-/d3-path-3.1.1.tgz",
|
| 934 |
+
"integrity": "sha512-VMZBYyQvbGmWyWVea0EHs/BwLgxc+MKi1zLDCONksozI4YJMcTt8ZEuIR4Sb1MMTE8MMW49v0IwI5+b7RmfWlg==",
|
| 935 |
+
"license": "MIT"
|
| 936 |
+
},
|
| 937 |
+
"node_modules/@types/d3-scale": {
|
| 938 |
+
"version": "4.0.9",
|
| 939 |
+
"resolved": "https://registry.npmjs.org/@types/d3-scale/-/d3-scale-4.0.9.tgz",
|
| 940 |
+
"integrity": "sha512-dLmtwB8zkAeO/juAMfnV+sItKjlsw2lKdZVVy6LRr0cBmegxSABiLEpGVmSJJ8O08i4+sGR6qQtb6WtuwJdvVw==",
|
| 941 |
+
"license": "MIT",
|
| 942 |
+
"dependencies": {
|
| 943 |
+
"@types/d3-time": "*"
|
| 944 |
+
}
|
| 945 |
+
},
|
| 946 |
+
"node_modules/@types/d3-shape": {
|
| 947 |
+
"version": "3.1.8",
|
| 948 |
+
"resolved": "https://registry.npmjs.org/@types/d3-shape/-/d3-shape-3.1.8.tgz",
|
| 949 |
+
"integrity": "sha512-lae0iWfcDeR7qt7rA88BNiqdvPS5pFVPpo5OfjElwNaT2yyekbM0C9vK+yqBqEmHr6lDkRnYNoTBYlAgJa7a4w==",
|
| 950 |
+
"license": "MIT",
|
| 951 |
+
"dependencies": {
|
| 952 |
+
"@types/d3-path": "*"
|
| 953 |
+
}
|
| 954 |
+
},
|
| 955 |
+
"node_modules/@types/d3-time": {
|
| 956 |
+
"version": "3.0.4",
|
| 957 |
+
"resolved": "https://registry.npmjs.org/@types/d3-time/-/d3-time-3.0.4.tgz",
|
| 958 |
+
"integrity": "sha512-yuzZug1nkAAaBlBBikKZTgzCeA+k1uy4ZFwWANOfKw5z5LRhV0gNA7gNkKm7HoK+HRN0wX3EkxGk0fpbWhmB7g==",
|
| 959 |
+
"license": "MIT"
|
| 960 |
+
},
|
| 961 |
+
"node_modules/@types/d3-timer": {
|
| 962 |
+
"version": "3.0.2",
|
| 963 |
+
"resolved": "https://registry.npmjs.org/@types/d3-timer/-/d3-timer-3.0.2.tgz",
|
| 964 |
+
"integrity": "sha512-Ps3T8E8dZDam6fUyNiMkekK3XUsaUEik+idO9/YjPtfj2qruF8tFBXS7XhtE4iIXBLxhmLjP3SXpLhVf21I9Lw==",
|
| 965 |
+
"license": "MIT"
|
| 966 |
+
},
|
| 967 |
"node_modules/@types/estree": {
|
| 968 |
"version": "1.0.8",
|
| 969 |
"resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.8.tgz",
|
|
|
|
| 982 |
"version": "19.2.14",
|
| 983 |
"resolved": "https://registry.npmjs.org/@types/react/-/react-19.2.14.tgz",
|
| 984 |
"integrity": "sha512-ilcTH/UniCkMdtexkoCN0bI7pMcJDvmQFPvuPvmEaYA/NSfFTAgdUSLAoVjaRJm7+6PvcM+q1zYOwS4wTYMF9w==",
|
| 985 |
+
"devOptional": true,
|
| 986 |
"license": "MIT",
|
| 987 |
"peer": true,
|
| 988 |
"dependencies": {
|
|
|
|
| 999 |
"@types/react": "^19.2.0"
|
| 1000 |
}
|
| 1001 |
},
|
| 1002 |
+
"node_modules/@types/use-sync-external-store": {
|
| 1003 |
+
"version": "0.0.6",
|
| 1004 |
+
"resolved": "https://registry.npmjs.org/@types/use-sync-external-store/-/use-sync-external-store-0.0.6.tgz",
|
| 1005 |
+
"integrity": "sha512-zFDAD+tlpf2r4asuHEj0XH6pY6i0g5NeAHPn+15wk3BV6JA69eERFXC1gyGThDkVa1zCyKr5jox1+2LbV/AMLg==",
|
| 1006 |
+
"license": "MIT"
|
| 1007 |
+
},
|
| 1008 |
"node_modules/@vitejs/plugin-react": {
|
| 1009 |
"version": "6.0.1",
|
| 1010 |
"resolved": "https://registry.npmjs.org/@vitejs/plugin-react/-/plugin-react-6.0.1.tgz",
|
|
|
|
| 1209 |
"url": "https://github.com/chalk/chalk?sponsor=1"
|
| 1210 |
}
|
| 1211 |
},
|
| 1212 |
+
"node_modules/clsx": {
|
| 1213 |
+
"version": "2.1.1",
|
| 1214 |
+
"resolved": "https://registry.npmjs.org/clsx/-/clsx-2.1.1.tgz",
|
| 1215 |
+
"integrity": "sha512-eYm0QWBtUrBWZWG0d386OGAw16Z995PiOVo2B7bjWSbHedGl5e0ZWaq65kOGgUSNesEIDkB9ISbTg/JK9dhCZA==",
|
| 1216 |
+
"license": "MIT",
|
| 1217 |
+
"engines": {
|
| 1218 |
+
"node": ">=6"
|
| 1219 |
+
}
|
| 1220 |
+
},
|
| 1221 |
"node_modules/color-convert": {
|
| 1222 |
"version": "2.0.1",
|
| 1223 |
"resolved": "https://registry.npmjs.org/color-convert/-/color-convert-2.0.1.tgz",
|
|
|
|
| 1271 |
"version": "3.2.3",
|
| 1272 |
"resolved": "https://registry.npmjs.org/csstype/-/csstype-3.2.3.tgz",
|
| 1273 |
"integrity": "sha512-z1HGKcYy2xA8AGQfwrn0PAy+PB7X/GSj3UVJW9qKyn43xWa+gl5nXmU4qqLMRzWVLFC8KusUX8T/0kCiOYpAIQ==",
|
| 1274 |
+
"devOptional": true,
|
| 1275 |
"license": "MIT"
|
| 1276 |
},
|
| 1277 |
+
"node_modules/d3-array": {
|
| 1278 |
+
"version": "3.2.4",
|
| 1279 |
+
"resolved": "https://registry.npmjs.org/d3-array/-/d3-array-3.2.4.tgz",
|
| 1280 |
+
"integrity": "sha512-tdQAmyA18i4J7wprpYq8ClcxZy3SC31QMeByyCFyRt7BVHdREQZ5lpzoe5mFEYZUWe+oq8HBvk9JjpibyEV4Jg==",
|
| 1281 |
+
"license": "ISC",
|
| 1282 |
+
"dependencies": {
|
| 1283 |
+
"internmap": "1 - 2"
|
| 1284 |
+
},
|
| 1285 |
+
"engines": {
|
| 1286 |
+
"node": ">=12"
|
| 1287 |
+
}
|
| 1288 |
+
},
|
| 1289 |
+
"node_modules/d3-color": {
|
| 1290 |
+
"version": "3.1.0",
|
| 1291 |
+
"resolved": "https://registry.npmjs.org/d3-color/-/d3-color-3.1.0.tgz",
|
| 1292 |
+
"integrity": "sha512-zg/chbXyeBtMQ1LbD/WSoW2DpC3I0mpmPdW+ynRTj/x2DAWYrIY7qeZIHidozwV24m4iavr15lNwIwLxRmOxhA==",
|
| 1293 |
+
"license": "ISC",
|
| 1294 |
+
"engines": {
|
| 1295 |
+
"node": ">=12"
|
| 1296 |
+
}
|
| 1297 |
+
},
|
| 1298 |
+
"node_modules/d3-ease": {
|
| 1299 |
+
"version": "3.0.1",
|
| 1300 |
+
"resolved": "https://registry.npmjs.org/d3-ease/-/d3-ease-3.0.1.tgz",
|
| 1301 |
+
"integrity": "sha512-wR/XK3D3XcLIZwpbvQwQ5fK+8Ykds1ip7A2Txe0yxncXSdq1L9skcG7blcedkOX+ZcgxGAmLX1FrRGbADwzi0w==",
|
| 1302 |
+
"license": "BSD-3-Clause",
|
| 1303 |
+
"engines": {
|
| 1304 |
+
"node": ">=12"
|
| 1305 |
+
}
|
| 1306 |
+
},
|
| 1307 |
+
"node_modules/d3-format": {
|
| 1308 |
+
"version": "3.1.2",
|
| 1309 |
+
"resolved": "https://registry.npmjs.org/d3-format/-/d3-format-3.1.2.tgz",
|
| 1310 |
+
"integrity": "sha512-AJDdYOdnyRDV5b6ArilzCPPwc1ejkHcoyFarqlPqT7zRYjhavcT3uSrqcMvsgh2CgoPbK3RCwyHaVyxYcP2Arg==",
|
| 1311 |
+
"license": "ISC",
|
| 1312 |
+
"engines": {
|
| 1313 |
+
"node": ">=12"
|
| 1314 |
+
}
|
| 1315 |
+
},
|
| 1316 |
+
"node_modules/d3-interpolate": {
|
| 1317 |
+
"version": "3.0.1",
|
| 1318 |
+
"resolved": "https://registry.npmjs.org/d3-interpolate/-/d3-interpolate-3.0.1.tgz",
|
| 1319 |
+
"integrity": "sha512-3bYs1rOD33uo8aqJfKP3JWPAibgw8Zm2+L9vBKEHJ2Rg+viTR7o5Mmv5mZcieN+FRYaAOWX5SJATX6k1PWz72g==",
|
| 1320 |
+
"license": "ISC",
|
| 1321 |
+
"dependencies": {
|
| 1322 |
+
"d3-color": "1 - 3"
|
| 1323 |
+
},
|
| 1324 |
+
"engines": {
|
| 1325 |
+
"node": ">=12"
|
| 1326 |
+
}
|
| 1327 |
+
},
|
| 1328 |
+
"node_modules/d3-path": {
|
| 1329 |
+
"version": "3.1.0",
|
| 1330 |
+
"resolved": "https://registry.npmjs.org/d3-path/-/d3-path-3.1.0.tgz",
|
| 1331 |
+
"integrity": "sha512-p3KP5HCf/bvjBSSKuXid6Zqijx7wIfNW+J/maPs+iwR35at5JCbLUT0LzF1cnjbCHWhqzQTIN2Jpe8pRebIEFQ==",
|
| 1332 |
+
"license": "ISC",
|
| 1333 |
+
"engines": {
|
| 1334 |
+
"node": ">=12"
|
| 1335 |
+
}
|
| 1336 |
+
},
|
| 1337 |
+
"node_modules/d3-scale": {
|
| 1338 |
+
"version": "4.0.2",
|
| 1339 |
+
"resolved": "https://registry.npmjs.org/d3-scale/-/d3-scale-4.0.2.tgz",
|
| 1340 |
+
"integrity": "sha512-GZW464g1SH7ag3Y7hXjf8RoUuAFIqklOAq3MRl4OaWabTFJY9PN/E1YklhXLh+OQ3fM9yS2nOkCoS+WLZ6kvxQ==",
|
| 1341 |
+
"license": "ISC",
|
| 1342 |
+
"dependencies": {
|
| 1343 |
+
"d3-array": "2.10.0 - 3",
|
| 1344 |
+
"d3-format": "1 - 3",
|
| 1345 |
+
"d3-interpolate": "1.2.0 - 3",
|
| 1346 |
+
"d3-time": "2.1.1 - 3",
|
| 1347 |
+
"d3-time-format": "2 - 4"
|
| 1348 |
+
},
|
| 1349 |
+
"engines": {
|
| 1350 |
+
"node": ">=12"
|
| 1351 |
+
}
|
| 1352 |
+
},
|
| 1353 |
+
"node_modules/d3-shape": {
|
| 1354 |
+
"version": "3.2.0",
|
| 1355 |
+
"resolved": "https://registry.npmjs.org/d3-shape/-/d3-shape-3.2.0.tgz",
|
| 1356 |
+
"integrity": "sha512-SaLBuwGm3MOViRq2ABk3eLoxwZELpH6zhl3FbAoJ7Vm1gofKx6El1Ib5z23NUEhF9AsGl7y+dzLe5Cw2AArGTA==",
|
| 1357 |
+
"license": "ISC",
|
| 1358 |
+
"dependencies": {
|
| 1359 |
+
"d3-path": "^3.1.0"
|
| 1360 |
+
},
|
| 1361 |
+
"engines": {
|
| 1362 |
+
"node": ">=12"
|
| 1363 |
+
}
|
| 1364 |
+
},
|
| 1365 |
+
"node_modules/d3-time": {
|
| 1366 |
+
"version": "3.1.0",
|
| 1367 |
+
"resolved": "https://registry.npmjs.org/d3-time/-/d3-time-3.1.0.tgz",
|
| 1368 |
+
"integrity": "sha512-VqKjzBLejbSMT4IgbmVgDjpkYrNWUYJnbCGo874u7MMKIWsILRX+OpX/gTk8MqjpT1A/c6HY2dCA77ZN0lkQ2Q==",
|
| 1369 |
+
"license": "ISC",
|
| 1370 |
+
"dependencies": {
|
| 1371 |
+
"d3-array": "2 - 3"
|
| 1372 |
+
},
|
| 1373 |
+
"engines": {
|
| 1374 |
+
"node": ">=12"
|
| 1375 |
+
}
|
| 1376 |
+
},
|
| 1377 |
+
"node_modules/d3-time-format": {
|
| 1378 |
+
"version": "4.1.0",
|
| 1379 |
+
"resolved": "https://registry.npmjs.org/d3-time-format/-/d3-time-format-4.1.0.tgz",
|
| 1380 |
+
"integrity": "sha512-dJxPBlzC7NugB2PDLwo9Q8JiTR3M3e4/XANkreKSUxF8vvXKqm1Yfq4Q5dl8budlunRVlUUaDUgFt7eA8D6NLg==",
|
| 1381 |
+
"license": "ISC",
|
| 1382 |
+
"dependencies": {
|
| 1383 |
+
"d3-time": "1 - 3"
|
| 1384 |
+
},
|
| 1385 |
+
"engines": {
|
| 1386 |
+
"node": ">=12"
|
| 1387 |
+
}
|
| 1388 |
+
},
|
| 1389 |
+
"node_modules/d3-timer": {
|
| 1390 |
+
"version": "3.0.1",
|
| 1391 |
+
"resolved": "https://registry.npmjs.org/d3-timer/-/d3-timer-3.0.1.tgz",
|
| 1392 |
+
"integrity": "sha512-ndfJ/JxxMd3nw31uyKoY2naivF+r29V+Lc0svZxe1JvvIRmi8hUsrMvdOwgS1o6uBHmiz91geQ0ylPP0aj1VUA==",
|
| 1393 |
+
"license": "ISC",
|
| 1394 |
+
"engines": {
|
| 1395 |
+
"node": ">=12"
|
| 1396 |
+
}
|
| 1397 |
+
},
|
| 1398 |
"node_modules/debug": {
|
| 1399 |
"version": "4.4.3",
|
| 1400 |
"resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz",
|
|
|
|
| 1413 |
}
|
| 1414 |
}
|
| 1415 |
},
|
| 1416 |
+
"node_modules/decimal.js-light": {
|
| 1417 |
+
"version": "2.5.1",
|
| 1418 |
+
"resolved": "https://registry.npmjs.org/decimal.js-light/-/decimal.js-light-2.5.1.tgz",
|
| 1419 |
+
"integrity": "sha512-qIMFpTMZmny+MMIitAB6D7iVPEorVw6YQRWkvarTkT4tBeSLLiHzcwj6q0MmYSFCiVpiqPJTJEYIrpcPzVEIvg==",
|
| 1420 |
+
"license": "MIT"
|
| 1421 |
+
},
|
| 1422 |
"node_modules/deep-is": {
|
| 1423 |
"version": "0.1.4",
|
| 1424 |
"resolved": "https://registry.npmjs.org/deep-is/-/deep-is-0.1.4.tgz",
|
|
|
|
| 1443 |
"dev": true,
|
| 1444 |
"license": "ISC"
|
| 1445 |
},
|
| 1446 |
+
"node_modules/es-toolkit": {
|
| 1447 |
+
"version": "1.46.0",
|
| 1448 |
+
"resolved": "https://registry.npmjs.org/es-toolkit/-/es-toolkit-1.46.0.tgz",
|
| 1449 |
+
"integrity": "sha512-IToJ6ct9OLl5zz6WsC/1vZEwfSZ7Myil+ygl5Tf30Xjn9AEkzNB4kqp2G7VUJKF1DtTx/ra5M5KLlXvzOg51BA==",
|
| 1450 |
+
"license": "MIT",
|
| 1451 |
+
"workspaces": [
|
| 1452 |
+
"docs",
|
| 1453 |
+
"benchmarks"
|
| 1454 |
+
]
|
| 1455 |
+
},
|
| 1456 |
"node_modules/escalade": {
|
| 1457 |
"version": "3.2.0",
|
| 1458 |
"resolved": "https://registry.npmjs.org/escalade/-/escalade-3.2.0.tgz",
|
|
|
|
| 1661 |
"node": ">=0.10.0"
|
| 1662 |
}
|
| 1663 |
},
|
| 1664 |
+
"node_modules/eventemitter3": {
|
| 1665 |
+
"version": "5.0.4",
|
| 1666 |
+
"resolved": "https://registry.npmjs.org/eventemitter3/-/eventemitter3-5.0.4.tgz",
|
| 1667 |
+
"integrity": "sha512-mlsTRyGaPBjPedk6Bvw+aqbsXDtoAyAzm5MO7JgU+yVRyMQ5O8bD4Kcci7BS85f93veegeCPkL8R4GLClnjLFw==",
|
| 1668 |
+
"license": "MIT"
|
| 1669 |
+
},
|
| 1670 |
"node_modules/fast-deep-equal": {
|
| 1671 |
"version": "3.1.3",
|
| 1672 |
"resolved": "https://registry.npmjs.org/fast-deep-equal/-/fast-deep-equal-3.1.3.tgz",
|
|
|
|
| 1845 |
"node": ">= 4"
|
| 1846 |
}
|
| 1847 |
},
|
| 1848 |
+
"node_modules/immer": {
|
| 1849 |
+
"version": "10.2.0",
|
| 1850 |
+
"resolved": "https://registry.npmjs.org/immer/-/immer-10.2.0.tgz",
|
| 1851 |
+
"integrity": "sha512-d/+XTN3zfODyjr89gM3mPq1WNX2B8pYsu7eORitdwyA2sBubnTl3laYlBk4sXY5FUa5qTZGBDPJICVbvqzjlbw==",
|
| 1852 |
+
"license": "MIT",
|
| 1853 |
+
"funding": {
|
| 1854 |
+
"type": "opencollective",
|
| 1855 |
+
"url": "https://opencollective.com/immer"
|
| 1856 |
+
}
|
| 1857 |
+
},
|
| 1858 |
"node_modules/import-fresh": {
|
| 1859 |
"version": "3.3.1",
|
| 1860 |
"resolved": "https://registry.npmjs.org/import-fresh/-/import-fresh-3.3.1.tgz",
|
|
|
|
| 1882 |
"node": ">=0.8.19"
|
| 1883 |
}
|
| 1884 |
},
|
| 1885 |
+
"node_modules/internmap": {
|
| 1886 |
+
"version": "2.0.3",
|
| 1887 |
+
"resolved": "https://registry.npmjs.org/internmap/-/internmap-2.0.3.tgz",
|
| 1888 |
+
"integrity": "sha512-5Hh7Y1wQbvY5ooGgPbDaL5iYLAPzMTUrjMulskHLH6wnv/A+1q5rgEaiuqEjB+oxGXIVZs1FF+R/KPN3ZSQYYg==",
|
| 1889 |
+
"license": "ISC",
|
| 1890 |
+
"engines": {
|
| 1891 |
+
"node": ">=12"
|
| 1892 |
+
}
|
| 1893 |
+
},
|
| 1894 |
"node_modules/is-extglob": {
|
| 1895 |
"version": "2.1.1",
|
| 1896 |
"resolved": "https://registry.npmjs.org/is-extglob/-/is-extglob-2.1.1.tgz",
|
|
|
|
| 2527 |
"resolved": "https://registry.npmjs.org/react-dom/-/react-dom-19.2.5.tgz",
|
| 2528 |
"integrity": "sha512-J5bAZz+DXMMwW/wV3xzKke59Af6CHY7G4uYLN1OvBcKEsWOs4pQExj86BBKamxl/Ik5bx9whOrvBlSDfWzgSag==",
|
| 2529 |
"license": "MIT",
|
| 2530 |
+
"peer": true,
|
| 2531 |
"dependencies": {
|
| 2532 |
"scheduler": "^0.27.0"
|
| 2533 |
},
|
|
|
|
| 2535 |
"react": "^19.2.5"
|
| 2536 |
}
|
| 2537 |
},
|
| 2538 |
+
"node_modules/react-is": {
|
| 2539 |
+
"version": "19.2.5",
|
| 2540 |
+
"resolved": "https://registry.npmjs.org/react-is/-/react-is-19.2.5.tgz",
|
| 2541 |
+
"integrity": "sha512-Dn0t8IQhCmeIT3wu+Apm1/YVsJXsGWi6k4sPdnBIdqMVtHtv0IGi6dcpNpNkNac0zB2uUAqNX3MHzN8c+z2rwQ==",
|
| 2542 |
+
"license": "MIT",
|
| 2543 |
+
"peer": true
|
| 2544 |
+
},
|
| 2545 |
+
"node_modules/react-redux": {
|
| 2546 |
+
"version": "9.2.0",
|
| 2547 |
+
"resolved": "https://registry.npmjs.org/react-redux/-/react-redux-9.2.0.tgz",
|
| 2548 |
+
"integrity": "sha512-ROY9fvHhwOD9ySfrF0wmvu//bKCQ6AeZZq1nJNtbDC+kk5DuSuNX/n6YWYF/SYy7bSba4D4FSz8DJeKY/S/r+g==",
|
| 2549 |
+
"license": "MIT",
|
| 2550 |
+
"peer": true,
|
| 2551 |
+
"dependencies": {
|
| 2552 |
+
"@types/use-sync-external-store": "^0.0.6",
|
| 2553 |
+
"use-sync-external-store": "^1.4.0"
|
| 2554 |
+
},
|
| 2555 |
+
"peerDependencies": {
|
| 2556 |
+
"@types/react": "^18.2.25 || ^19",
|
| 2557 |
+
"react": "^18.0 || ^19",
|
| 2558 |
+
"redux": "^5.0.0"
|
| 2559 |
+
},
|
| 2560 |
+
"peerDependenciesMeta": {
|
| 2561 |
+
"@types/react": {
|
| 2562 |
+
"optional": true
|
| 2563 |
+
},
|
| 2564 |
+
"redux": {
|
| 2565 |
+
"optional": true
|
| 2566 |
+
}
|
| 2567 |
+
}
|
| 2568 |
+
},
|
| 2569 |
+
"node_modules/recharts": {
|
| 2570 |
+
"version": "3.8.1",
|
| 2571 |
+
"resolved": "https://registry.npmjs.org/recharts/-/recharts-3.8.1.tgz",
|
| 2572 |
+
"integrity": "sha512-mwzmO1s9sFL0TduUpwndxCUNoXsBw3u3E/0+A+cLcrSfQitSG62L32N69GhqUrrT5qKcAE3pCGVINC6pqkBBQg==",
|
| 2573 |
+
"license": "MIT",
|
| 2574 |
+
"workspaces": [
|
| 2575 |
+
"www"
|
| 2576 |
+
],
|
| 2577 |
+
"dependencies": {
|
| 2578 |
+
"@reduxjs/toolkit": "^1.9.0 || 2.x.x",
|
| 2579 |
+
"clsx": "^2.1.1",
|
| 2580 |
+
"decimal.js-light": "^2.5.1",
|
| 2581 |
+
"es-toolkit": "^1.39.3",
|
| 2582 |
+
"eventemitter3": "^5.0.1",
|
| 2583 |
+
"immer": "^10.1.1",
|
| 2584 |
+
"react-redux": "8.x.x || 9.x.x",
|
| 2585 |
+
"reselect": "5.1.1",
|
| 2586 |
+
"tiny-invariant": "^1.3.3",
|
| 2587 |
+
"use-sync-external-store": "^1.2.2",
|
| 2588 |
+
"victory-vendor": "^37.0.2"
|
| 2589 |
+
},
|
| 2590 |
+
"engines": {
|
| 2591 |
+
"node": ">=18"
|
| 2592 |
+
},
|
| 2593 |
+
"peerDependencies": {
|
| 2594 |
+
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0",
|
| 2595 |
+
"react-dom": "^16.0.0 || ^17.0.0 || ^18.0.0 || ^19.0.0",
|
| 2596 |
+
"react-is": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
|
| 2597 |
+
}
|
| 2598 |
+
},
|
| 2599 |
+
"node_modules/redux": {
|
| 2600 |
+
"version": "5.0.1",
|
| 2601 |
+
"resolved": "https://registry.npmjs.org/redux/-/redux-5.0.1.tgz",
|
| 2602 |
+
"integrity": "sha512-M9/ELqF6fy8FwmkpnF0S3YKOqMyoWJ4+CS5Efg2ct3oY9daQvd/Pc71FpGZsVsbl3Cpb+IIcjBDUnnyBdQbq4w==",
|
| 2603 |
+
"license": "MIT",
|
| 2604 |
+
"peer": true
|
| 2605 |
+
},
|
| 2606 |
+
"node_modules/redux-thunk": {
|
| 2607 |
+
"version": "3.1.0",
|
| 2608 |
+
"resolved": "https://registry.npmjs.org/redux-thunk/-/redux-thunk-3.1.0.tgz",
|
| 2609 |
+
"integrity": "sha512-NW2r5T6ksUKXCabzhL9z+h206HQw/NJkcLm1GPImRQ8IzfXwRGqjVhKJGauHirT0DAuyy6hjdnMZaRoAcy0Klw==",
|
| 2610 |
+
"license": "MIT",
|
| 2611 |
+
"peerDependencies": {
|
| 2612 |
+
"redux": "^5.0.0"
|
| 2613 |
+
}
|
| 2614 |
+
},
|
| 2615 |
+
"node_modules/reselect": {
|
| 2616 |
+
"version": "5.1.1",
|
| 2617 |
+
"resolved": "https://registry.npmjs.org/reselect/-/reselect-5.1.1.tgz",
|
| 2618 |
+
"integrity": "sha512-K/BG6eIky/SBpzfHZv/dd+9JBFiS4SWV7FIujVyJRux6e45+73RaUHXLmIR1f7WOMaQ0U1km6qwklRQxpJJY0w==",
|
| 2619 |
+
"license": "MIT"
|
| 2620 |
+
},
|
| 2621 |
"node_modules/resolve-from": {
|
| 2622 |
"version": "4.0.0",
|
| 2623 |
"resolved": "https://registry.npmjs.org/resolve-from/-/resolve-from-4.0.0.tgz",
|
|
|
|
| 2744 |
"node": ">=8"
|
| 2745 |
}
|
| 2746 |
},
|
| 2747 |
+
"node_modules/tiny-invariant": {
|
| 2748 |
+
"version": "1.3.3",
|
| 2749 |
+
"resolved": "https://registry.npmjs.org/tiny-invariant/-/tiny-invariant-1.3.3.tgz",
|
| 2750 |
+
"integrity": "sha512-+FbBPE1o9QAYvviau/qC5SE3caw21q3xkvWKBtja5vgqOWIHHJ3ioaq1VPfn/Szqctz2bU/oYeKd9/z5BL+PVg==",
|
| 2751 |
+
"license": "MIT"
|
| 2752 |
+
},
|
| 2753 |
"node_modules/tinyglobby": {
|
| 2754 |
"version": "0.2.16",
|
| 2755 |
"resolved": "https://registry.npmjs.org/tinyglobby/-/tinyglobby-0.2.16.tgz",
|
|
|
|
| 2829 |
"punycode": "^2.1.0"
|
| 2830 |
}
|
| 2831 |
},
|
| 2832 |
+
"node_modules/use-sync-external-store": {
|
| 2833 |
+
"version": "1.6.0",
|
| 2834 |
+
"resolved": "https://registry.npmjs.org/use-sync-external-store/-/use-sync-external-store-1.6.0.tgz",
|
| 2835 |
+
"integrity": "sha512-Pp6GSwGP/NrPIrxVFAIkOQeyw8lFenOHijQWkUTrDvrF4ALqylP2C/KCkeS9dpUM3KvYRQhna5vt7IL95+ZQ9w==",
|
| 2836 |
+
"license": "MIT",
|
| 2837 |
+
"peerDependencies": {
|
| 2838 |
+
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
|
| 2839 |
+
}
|
| 2840 |
+
},
|
| 2841 |
+
"node_modules/victory-vendor": {
|
| 2842 |
+
"version": "37.3.6",
|
| 2843 |
+
"resolved": "https://registry.npmjs.org/victory-vendor/-/victory-vendor-37.3.6.tgz",
|
| 2844 |
+
"integrity": "sha512-SbPDPdDBYp+5MJHhBCAyI7wKM3d5ivekigc2Dk2s7pgbZ9wIgIBYGVw4zGHBml/qTFbexrofXW6Gu4noGxrOwQ==",
|
| 2845 |
+
"license": "MIT AND ISC",
|
| 2846 |
+
"dependencies": {
|
| 2847 |
+
"@types/d3-array": "^3.0.3",
|
| 2848 |
+
"@types/d3-ease": "^3.0.0",
|
| 2849 |
+
"@types/d3-interpolate": "^3.0.1",
|
| 2850 |
+
"@types/d3-scale": "^4.0.2",
|
| 2851 |
+
"@types/d3-shape": "^3.1.0",
|
| 2852 |
+
"@types/d3-time": "^3.0.0",
|
| 2853 |
+
"@types/d3-timer": "^3.0.0",
|
| 2854 |
+
"d3-array": "^3.1.6",
|
| 2855 |
+
"d3-ease": "^3.0.1",
|
| 2856 |
+
"d3-interpolate": "^3.0.1",
|
| 2857 |
+
"d3-scale": "^4.0.2",
|
| 2858 |
+
"d3-shape": "^3.1.0",
|
| 2859 |
+
"d3-time": "^3.0.0",
|
| 2860 |
+
"d3-timer": "^3.0.1"
|
| 2861 |
+
}
|
| 2862 |
+
},
|
| 2863 |
"node_modules/vite": {
|
| 2864 |
"version": "8.0.9",
|
| 2865 |
"resolved": "https://registry.npmjs.org/vite/-/vite-8.0.9.tgz",
|
frontend/package.json
CHANGED
|
@@ -11,7 +11,8 @@
|
|
| 11 |
},
|
| 12 |
"dependencies": {
|
| 13 |
"react": "^19.2.5",
|
| 14 |
-
"react-dom": "^19.2.5"
|
|
|
|
| 15 |
},
|
| 16 |
"devDependencies": {
|
| 17 |
"@eslint/js": "^9.39.4",
|
|
|
|
| 11 |
},
|
| 12 |
"dependencies": {
|
| 13 |
"react": "^19.2.5",
|
| 14 |
+
"react-dom": "^19.2.5",
|
| 15 |
+
"recharts": "^3.8.1"
|
| 16 |
},
|
| 17 |
"devDependencies": {
|
| 18 |
"@eslint/js": "^9.39.4",
|
frontend/src/CodeArenaRL.jsx
CHANGED
|
@@ -1,5 +1,5 @@
|
|
| 1 |
-
|
| 2 |
import React, { useState, useEffect, useRef, useCallback } from "react";
|
|
|
|
| 3 |
|
| 4 |
/* ─────────────────────────────────────────────
|
| 5 |
GOOGLE FONTS
|
|
@@ -129,6 +129,12 @@ const GlobalStyles = () => (
|
|
| 129 |
TASKS (mirrors server tasks — display only)
|
| 130 |
───────────────────────────────────────────── */
|
| 131 |
const TASKS = {
|
| 132 |
"easy-1": {
|
| 133 |
id: "easy-1", label: "Easy", name: "Fix average_list()", difficulty: "easy",
|
| 134 |
description: "Fix syntax errors: missing colon after def and uses length() instead of len().",
|
|
@@ -176,37 +182,26 @@ function AnsiLine({ text }) {
|
|
| 176 |
}
|
| 177 |
|
| 178 |
/* ─────────────────────────────────────────────
|
| 179 |
-
REWARD CHART
|
| 180 |
───────────────────────────────────────────── */
|
| 181 |
function RewardChart({ rewards }) {
|
| 182 |
-
const
|
| 183 |
-
|
| 184 |
-
|
| 185 |
-
|
| 186 |
-
r,
|
| 187 |
-
}));
|
| 188 |
-
const pathD = pts.length > 1 ? pts.reduce((a, p, i) => i === 0 ? `M${p.x},${p.y}` : a + ` L${p.x},${p.y}`, "") : "";
|
| 189 |
-
const areaD = pts.length > 1 ? `${pathD} L${pts[pts.length - 1].x},${H - PAD} L${pts[0].x},${H - PAD} Z` : "";
|
| 190 |
return (
|
| 191 |
-
<
|
| 192 |
-
<
|
| 193 |
-
<
|
| 194 |
-
<
|
| 195 |
-
<
|
| 196 |
-
|
| 197 |
-
|
| 198 |
-
|
| 199 |
-
|
| 200 |
-
|
| 201 |
-
|
| 202 |
-
|
| 203 |
-
<text key={s} x={PAD + ((s - 1) / 4) * (W - PAD * 2)} y={H - 4}
|
| 204 |
-
fill="#334155" fontSize="8" textAnchor="middle" fontFamily="JetBrains Mono">{s}</text>
|
| 205 |
-
))}
|
| 206 |
-
{areaD && <path d={areaD} fill="url(#rg)" />}
|
| 207 |
-
{pathD && <path d={pathD} fill="none" stroke="#00ff88" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round" />}
|
| 208 |
-
{pts.map((p, i) => <circle key={i} cx={p.x} cy={p.y} r="4" fill="#0a0e1a" stroke={rewardColor(p.r)} strokeWidth="2" />)}
|
| 209 |
-
</svg>
|
| 210 |
);
|
| 211 |
}
|
| 212 |
|
|
@@ -226,6 +221,9 @@ export default function CodeArenaRL() {
|
|
| 226 |
|
| 227 |
/* ── Task & episode state ── */
|
| 228 |
const [selectedTask, setSelectedTask] = useState("easy-1");
|
| 229 |
const [envState, setEnvState] = useState(null); // observation from server
|
| 230 |
const [uiMode, setUiMode] = useState("idle"); // idle|resetting|agent_thinking|executing|done
|
| 231 |
const [episodeLog, setEpisodeLog] = useState([]);
|
|
@@ -305,7 +303,7 @@ export default function CodeArenaRL() {
|
|
| 305 |
});
|
| 306 |
if (!res.ok) throw new Error(`/reset failed: ${res.status}`);
|
| 307 |
const data = await res.json();
|
| 308 |
-
return data
|
| 309 |
}, [envUrl]);
|
| 310 |
|
| 311 |
const envStep = useCallback(async (proposedFix) => {
|
|
@@ -463,6 +461,9 @@ export default function CodeArenaRL() {
|
|
| 463 |
setManualCode(""); setTokenEst(0);
|
| 464 |
setCollapsedEntries(new Set());
|
| 465 |
setErrorBanner("");
|
|
|
|
|
|
|
|
|
|
| 466 |
}, []);
|
| 467 |
|
| 468 |
/* ──────────────────────────────────────────
|
|
@@ -512,6 +513,7 @@ export default function CodeArenaRL() {
|
|
| 512 |
|
| 513 |
const { observation: newObs, reward, done } = stepResult;
|
| 514 |
const meta = stepResult.info?.execution_metadata || {};
|
|
|
|
| 515 |
const passed = meta.test_passed ?? 0;
|
| 516 |
const total = meta.test_total ?? task.hints.length + 1;
|
| 517 |
const newStep = currentStepCount + 1;
|
|
@@ -526,11 +528,19 @@ export default function CodeArenaRL() {
|
|
| 526 |
setStepCount(newStep);
|
| 527 |
setRewards(prev => [...prev, reward]);
|
| 528 |
setIsDone(done);
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 529 |
|
| 530 |
const logEntry = {
|
| 531 |
step: newStep,
|
| 532 |
code_submitted: fixedCode,
|
| 533 |
reward, done, passed, total,
|
|
|
|
|
|
|
|
|
|
| 534 |
error_log: newObs?.error_log || "",
|
| 535 |
test_results: newObs?.test_results || "",
|
| 536 |
timestamp: new Date().toISOString(),
|
|
@@ -570,9 +580,9 @@ export default function CodeArenaRL() {
|
|
| 570 |
runningRef.current = true;
|
| 571 |
setUiMode("resetting");
|
| 572 |
|
| 573 |
-
let
|
| 574 |
try {
|
| 575 |
-
|
| 576 |
} catch (err) {
|
| 577 |
setErrorBanner(`🌐 OpenEnv /reset Error: ${err.message}`);
|
| 578 |
setUiMode("idle");
|
|
@@ -580,6 +590,9 @@ export default function CodeArenaRL() {
|
|
| 580 |
return;
|
| 581 |
}
|
| 582 |
|
|
|
|
|
|
|
|
|
|
| 583 |
setEnvState(initialObs);
|
| 584 |
setTimeout(() => runStep(initialObs, 0), 400);
|
| 585 |
}, [ollamaStatus, envStatus, manualMode, resetEpisode, envReset, selectedTask, runStep]);
|
|
@@ -865,7 +878,14 @@ export default function CodeArenaRL() {
|
|
| 865 |
<div className="panel">
|
| 866 |
<div className="panel-header">
|
| 867 |
<span style={{ color: "#ff4455" }}>⚠</span> Buggy Code
|
| 868 |
-
<span style={{ marginLeft: "auto"
|
| 869 |
</div>
|
| 870 |
<div style={{ padding: 14 }}>
|
| 871 |
<pre className="code-block" style={{ color: "#f8c8c8", maxHeight: 170, overflowY: "auto" }}>
|
|
@@ -1009,6 +1029,30 @@ export default function CodeArenaRL() {
|
|
| 1009 |
</div>
|
| 1010 |
)}
|
| 1011 |
|
| 1012 |
{/* Episode Log */}
|
| 1013 |
<div className="panel" style={{ flex: 1, display: "flex", flexDirection: "column" }}>
|
| 1014 |
<div className="panel-header" style={{ justifyContent: "space-between" }}>
|
|
|
|
|
|
|
| 1 |
import React, { useState, useEffect, useRef, useCallback } from "react";
|
| 2 |
+
import { LineChart, Line, XAxis, YAxis, Tooltip, ResponsiveContainer, ReferenceLine } from "recharts";
|
| 3 |
|
| 4 |
/* ─────────────────────────────────────────────
|
| 5 |
GOOGLE FONTS
|
|
|
|
| 129 |
TASKS (mirrors server tasks — display only)
|
| 130 |
───────────────────────────────────────────── */
|
| 131 |
const TASKS = {
|
| 132 |
+
"auto": {
|
| 133 |
+
id: "auto", label: "Auto", name: "Adaptive Curriculum", difficulty: "info",
|
| 134 |
+
description: "Automatically selects difficulty based on recent performance history.",
|
| 135 |
+
hints: ["If avg < 0.4 -> Easy", "If avg < 0.75 -> Medium", "Else -> Hard"],
|
| 136 |
+
buggy_code: "# Click Start Episode to fetch task",
|
| 137 |
+
},
|
| 138 |
"easy-1": {
|
| 139 |
id: "easy-1", label: "Easy", name: "Fix average_list()", difficulty: "easy",
|
| 140 |
description: "Fix syntax errors: missing colon after def and uses length() instead of len().",
|
|
|
|
| 182 |
}
|
| 183 |
|
| 184 |
/* ─────────────────────────────────────────────
|
| 185 |
+
REWARD CHART (Recharts)
|
| 186 |
───────────────────────────────────────────── */
|
| 187 |
function RewardChart({ rewards }) {
|
| 188 |
+
const data = rewards.map((r, i) => ({ step: i + 1, reward: r }));
|
| 189 |
+
for (let i = data.length + 1; i <= 5; i++) {
|
| 190 |
+
data.push({ step: i, reward: null });
|
| 191 |
+
}
|
|
|
|
|
|
|
|
|
|
|
|
|
| 192 |
return (
|
| 193 |
+
<div style={{ width: "100%", height: 120 }}>
|
| 194 |
+
<ResponsiveContainer width="100%" height="100%">
|
| 195 |
+
<LineChart data={data} margin={{ top: 10, right: 10, left: -20, bottom: 0 }}>
|
| 196 |
+
<XAxis dataKey="step" stroke="#334155" tick={{ fill: "#334155", fontSize: 10, fontFamily: "'JetBrains Mono',monospace" }} />
|
| 197 |
+
<YAxis domain={[0, 1]} ticks={[0, 0.5, 1]} stroke="#334155" tick={{ fill: "#334155", fontSize: 10, fontFamily: "'JetBrains Mono',monospace" }} />
|
| 198 |
+
<ReferenceLine y={0.5} stroke="#334155" strokeDasharray="3 3" />
|
| 199 |
+
<ReferenceLine y={1.0} stroke="#334155" strokeDasharray="3 3" />
|
| 200 |
+
<Tooltip contentStyle={{ backgroundColor: "#0f172a", border: "1px solid #1e293b", borderRadius: 4, fontFamily: "'JetBrains Mono',monospace", fontSize: 10 }} itemStyle={{ color: "#00ff88" }} />
|
| 201 |
+
<Line type="monotone" dataKey="reward" stroke="#00ff88" strokeWidth={2} dot={{ fill: "#0a0e1a", stroke: "#00ff88", strokeWidth: 2, r: 4 }} isAnimationActive={true} />
|
| 202 |
+
</LineChart>
|
| 203 |
+
</ResponsiveContainer>
|
| 204 |
+
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 205 |
);
|
| 206 |
}
|
| 207 |
|
|
|
|
| 221 |
|
| 222 |
/* ── Task & episode state ── */
|
| 223 |
const [selectedTask, setSelectedTask] = useState("easy-1");
|
| 224 |
+
const [currentEnvTask, setCurrentEnvTask] = useState("");
|
| 225 |
+
const [currentEnvDifficulty, setCurrentEnvDifficulty] = useState("");
|
| 226 |
+
const [currentRewardComponents, setCurrentRewardComponents] = useState({ compile_score: 0, test_ratio: 0, efficiency_score: 0 });
|
| 227 |
const [envState, setEnvState] = useState(null); // observation from server
|
| 228 |
const [uiMode, setUiMode] = useState("idle"); // idle|resetting|agent_thinking|executing|done
|
| 229 |
const [episodeLog, setEpisodeLog] = useState([]);
|
|
|
|
| 303 |
});
|
| 304 |
if (!res.ok) throw new Error(`/reset failed: ${res.status}`);
|
| 305 |
const data = await res.json();
|
| 306 |
+
return data; // { observation, info }
|
| 307 |
}, [envUrl]);
|
| 308 |
|
| 309 |
const envStep = useCallback(async (proposedFix) => {
|
|
|
|
| 461 |
setManualCode(""); setTokenEst(0);
|
| 462 |
setCollapsedEntries(new Set());
|
| 463 |
setErrorBanner("");
|
| 464 |
+
setCurrentEnvTask("");
|
| 465 |
+
setCurrentEnvDifficulty("");
|
| 466 |
+
setCurrentRewardComponents({ compile_score: 0, test_ratio: 0, efficiency_score: 0 });
|
| 467 |
}, []);
|
| 468 |
|
| 469 |
/* ──────────────────────────────────────────
|
|
|
|
| 513 |
|
| 514 |
const { observation: newObs, reward, done } = stepResult;
|
| 515 |
const meta = stepResult.info?.execution_metadata || {};
|
| 516 |
+
const rc = stepResult.info?.reward_components || {};
|
| 517 |
const passed = meta.test_passed ?? 0;
|
| 518 |
const total = meta.test_total ?? task.hints.length + 1;
|
| 519 |
const newStep = currentStepCount + 1;
|
|
|
|
| 528 |
setStepCount(newStep);
|
| 529 |
setRewards(prev => [...prev, reward]);
|
| 530 |
setIsDone(done);
|
| 531 |
+
setCurrentRewardComponents({
|
| 532 |
+
compile_score: rc.compile_score || 0,
|
| 533 |
+
test_ratio: rc.test_ratio || 0,
|
| 534 |
+
efficiency_score: rc.efficiency || 0,
|
| 535 |
+
});
|
| 536 |
|
| 537 |
const logEntry = {
|
| 538 |
step: newStep,
|
| 539 |
code_submitted: fixedCode,
|
| 540 |
reward, done, passed, total,
|
| 541 |
+
compile_score: rc.compile_score || 0,
|
| 542 |
+
test_ratio: rc.test_ratio || 0,
|
| 543 |
+
efficiency_score: rc.efficiency || 0,
|
| 544 |
error_log: newObs?.error_log || "",
|
| 545 |
test_results: newObs?.test_results || "",
|
| 546 |
timestamp: new Date().toISOString(),
|
|
|
|
| 580 |
runningRef.current = true;
|
| 581 |
setUiMode("resetting");
|
| 582 |
|
| 583 |
+
let initialResp;
|
| 584 |
try {
|
| 585 |
+
initialResp = await envReset(selectedTask);
|
| 586 |
} catch (err) {
|
| 587 |
setErrorBanner(`🌐 OpenEnv /reset Error: ${err.message}`);
|
| 588 |
setUiMode("idle");
|
|
|
|
| 590 |
return;
|
| 591 |
}
|
| 592 |
|
| 593 |
+
const initialObs = initialResp.observation;
|
| 594 |
+
setCurrentEnvTask(initialResp.info?.task_id || selectedTask);
|
| 595 |
+
setCurrentEnvDifficulty(initialResp.info?.difficulty || "");
|
| 596 |
setEnvState(initialObs);
|
| 597 |
setTimeout(() => runStep(initialObs, 0), 400);
|
| 598 |
}, [ollamaStatus, envStatus, manualMode, resetEpisode, envReset, selectedTask, runStep]);
|
|
|
|
| 878 |
<div className="panel">
|
| 879 |
<div className="panel-header">
|
| 880 |
<span style={{ color: "#ff4455" }}>⚠</span> Buggy Code
|
| 881 |
+
<span style={{ marginLeft: "auto", display: "flex", gap: 6 }}>
|
| 882 |
+
{currentEnvDifficulty && (
|
| 883 |
+
<span className={`badge badge-${currentEnvDifficulty.toLowerCase()}`}>
|
| 884 |
+
{currentEnvDifficulty}
|
| 885 |
+
</span>
|
| 886 |
+
)}
|
| 887 |
+
<span className={`badge badge-${task.difficulty}`}>{currentEnvTask || task.id}</span>
|
| 888 |
+
</span>
|
| 889 |
</div>
|
| 890 |
<div style={{ padding: 14 }}>
|
| 891 |
<pre className="code-block" style={{ color: "#f8c8c8", maxHeight: 170, overflowY: "auto" }}>
|
|
|
|
| 1029 |
</div>
|
| 1030 |
)}
|
| 1031 |
|
| 1032 |
+
{/* Live Reward Components */}
|
| 1033 |
+
{stepCount > 0 && (
|
| 1034 |
+
<div className="panel fade-in">
|
| 1035 |
+
<div className="panel-header">🏅 Reward Components</div>
|
| 1036 |
+
<div style={{ padding: "12px 14px", display: "flex", flexDirection: "column", gap: 12 }}>
|
| 1037 |
+
{[
|
| 1038 |
+
{ label: "Compile Score", val: currentRewardComponents.compile_score },
|
| 1039 |
+
{ label: "Test Pass Ratio", val: currentRewardComponents.test_ratio },
|
| 1040 |
+
{ label: "Efficiency", val: currentRewardComponents.efficiency_score },
|
| 1041 |
+
].map(c => (
|
| 1042 |
+
<div key={c.label}>
|
| 1043 |
+
<div style={{ display: "flex", justifyContent: "space-between", fontSize: 10, fontFamily: "'JetBrains Mono',monospace", color: "#64748b", marginBottom: 4 }}>
|
| 1044 |
+
<span>{c.label}</span>
|
| 1045 |
+
<span style={{ color: rewardColor(c.val) }}>{c.val.toFixed(2)}</span>
|
| 1046 |
+
</div>
|
| 1047 |
+
<div className="reward-bar-outer" style={{ marginTop: 0, height: 4 }}>
|
| 1048 |
+
<div className="reward-bar-inner" style={{ width: `${c.val * 100}%`, background: `linear-gradient(90deg, ${rewardColor(0)}, ${rewardColor(c.val)})` }} />
|
| 1049 |
+
</div>
|
| 1050 |
+
</div>
|
| 1051 |
+
))}
|
| 1052 |
+
</div>
|
| 1053 |
+
</div>
|
| 1054 |
+
)}
|
| 1055 |
+
|
| 1056 |
{/* Episode Log */}
|
| 1057 |
<div className="panel" style={{ flex: 1, display: "flex", flexDirection: "column" }}>
|
| 1058 |
<div className="panel-header" style={{ justifyContent: "space-between" }}>
|
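The `auto` task entry added above describes the adaptive curriculum only through its hint strings ("If avg < 0.4 -> Easy", "If avg < 0.75 -> Medium", "Else -> Hard"). The following is a minimal sketch of that threshold rule, not the server's actual implementation: the function name `select_difficulty`, the five-reward window, and the choice to fall back to easy when no history exists are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean

# Thresholds taken from the hint strings in the "auto" task entry.
EASY_BELOW = 0.4
MEDIUM_BELOW = 0.75


def select_difficulty(recent_rewards, default="easy"):
    """Pick a difficulty tier from the average of recent rewards.

    Hypothetical helper: the real server-side selection logic is not
    shown in this diff, so the name and the empty-history default are
    assumptions.
    """
    if not recent_rewards:
        return default
    avg = mean(recent_rewards)
    if avg < EASY_BELOW:
        return "easy"
    if avg < MEDIUM_BELOW:
        return "medium"
    return "hard"


# Assumed rolling window of the five most recent episode rewards.
history = deque(maxlen=5)
for reward in (0.2, 0.9, 1.0):
    history.append(reward)
    print(f"avg={mean(history):.2f} -> {select_difficulty(history)}")
```

Because the comparisons are strict, an average of exactly 0.4 maps to medium and exactly 0.75 maps to hard, which matches the wording of the hints.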
inference.py
CHANGED
|
@@ -4,21 +4,27 @@ Rewritten for strict OpenEnv parsing.
|
|
| 4 |
"""
|
| 5 |
|
| 6 |
import os
|
|
|
|
| 7 |
import httpx
|
|
|
|
| 8 |
from openai import OpenAI
|
| 9 |
|
| 10 |
-
def run_task(task_id: str):
|
| 11 |
# Retrieve environment variables as instructed
|
| 12 |
base_url = os.environ.get("API_BASE_URL")
|
| 13 |
api_key = os.environ.get("HF_TOKEN") or os.environ.get("API_KEY")
|
| 14 |
model_name = os.environ.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")
|
| 15 |
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 22 |
|
| 23 |
# 1. Print the [START] line
|
| 24 |
print(f"[START] task={task_id} env=codearena-rl-benchmark model={model_name}")
|
|
@@ -58,14 +64,19 @@ def run_task(task_id: str):
|
|
| 58 |
|
| 59 |
# 3b/c. Call the LLM
|
| 60 |
try:
|
| 61 |
-
|
| 62 |
-
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
|
| 66 |
-
|
| 67 |
-
|
| 68 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 69 |
except Exception as e:
|
| 70 |
error_msg = str(e).replace("\n", " ").replace("\r", "")
|
| 71 |
# If the LLM call fails, use this fallback fix
|
|
@@ -106,6 +117,22 @@ def run_task(task_id: str):
|
|
| 106 |
print(f"[STEP] step={step} action={action_summary} reward={reward:.2f} done={done_str} error={error_msg}")
|
| 107 |
|
| 108 |
# 4. Print [END]
|
| 109 |
success = any(r > 0.5 for r in rewards)
|
| 110 |
success_str = "true" if success else "false"
|
| 111 |
rewards_str = ",".join([f"{r:.2f}" for r in rewards])
|
|
@@ -113,12 +140,16 @@ def run_task(task_id: str):
|
|
| 113 |
print(f"[END] success={success_str} steps={step} score={score:.2f} rewards={rewards_str}")
|
| 114 |
|
| 115 |
def main():
|
|
|
|
|
|
|
|
|
|
|
|
|
| 116 |
target_task = os.environ.get("CODEARENA_TASK")
|
| 117 |
if target_task:
|
| 118 |
-
run_task(target_task)
|
| 119 |
else:
|
| 120 |
for t in ["easy", "medium", "hard"]:
|
| 121 |
-
run_task(t)
|
| 122 |
|
| 123 |
if __name__ == "__main__":
|
| 124 |
main()
|
|
|
|
| 4 |
"""
|
| 5 |
|
| 6 |
import os
|
| 7 |
+
import argparse
|
| 8 |
import httpx
|
| 9 |
+
from datetime import datetime
|
| 10 |
from openai import OpenAI
|
| 11 |
|
| 12 |
+
def run_task(task_id: str, backend: str):
|
| 13 |
# Retrieve environment variables as instructed
|
| 14 |
     base_url = os.environ.get("API_BASE_URL")
     api_key = os.environ.get("HF_TOKEN") or os.environ.get("API_KEY")
     model_name = os.environ.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")

+    hf_pipeline = None
+    client = None
+    if backend == "hf":
+        from transformers import pipeline
+        hf_pipeline = pipeline("text-generation", model=model_name)
+    else:
+        client = OpenAI(
+            base_url=base_url,
+            api_key=api_key or "NO_KEY_PROVIDED"
+        )

     # 1. Print the [START] line
     print(f"[START] task={task_id} env=codearena-rl-benchmark model={model_name}")
[...]

         # 3b/c. Call the LLM
         try:
+            if backend == "hf":
+                prompt = f"{system_prompt}\n\n{user_prompt}"
+                output = hf_pipeline(prompt, max_new_tokens=512, return_full_text=False)
+                proposed_fix = output[0]["generated_text"]
+            else:
+                completion = client.chat.completions.create(
+                    model=model_name,
+                    messages=[
+                        {"role": "system", "content": system_prompt},
+                        {"role": "user", "content": user_prompt}
+                    ]
+                )
+                proposed_fix = completion.choices[0].message.content
         except Exception as e:
             error_msg = str(e).replace("\n", " ").replace("\r", "")
             # If the LLM call fails, use this fallback fix
[...]
         print(f"[STEP] step={step} action={action_summary} reward={reward:.2f} done={done_str} error={error_msg}")

     # 4. Print [END]
+    timestamp = datetime.now().isoformat()
+    compile_score, test_ratio, efficiency_score = 0.0, 0.0, 0.0
+    if "info" in obs_json and "reward_components" in obs_json["info"]:
+        rc = obs_json["info"]["reward_components"]
+        compile_score = rc.get("compile_score", 0.0)
+        test_ratio = rc.get("test_ratio", 0.0)
+        efficiency_score = rc.get("efficiency", 0.0)
+
+    final_reward = rewards[-1] if rewards else 0.0
+    csv_path = "rewards_log.csv"
+    write_headers = not os.path.exists(csv_path)
+    with open(csv_path, "a", encoding="utf-8") as f:
+        if write_headers:
+            f.write("timestamp,task_id,step,reward,compile_score,test_ratio,efficiency_score\n")
+        f.write(f"{timestamp},{task_id},{step},{final_reward},{compile_score},{test_ratio},{efficiency_score}\n")
+
     success = any(r > 0.5 for r in rewards)
     success_str = "true" if success else "false"
     rewards_str = ",".join([f"{r:.2f}" for r in rewards])
[...]
     print(f"[END] success={success_str} steps={step} score={score:.2f} rewards={rewards_str}")

 def main():
+    parser = argparse.ArgumentParser(description="CodeArena RL Inference")
+    parser.add_argument("--backend", type=str, choices=["openai", "hf"], default="openai", help="Backend to use for LLM generation.")
+    args = parser.parse_args()
+
     target_task = os.environ.get("CODEARENA_TASK")
     if target_task:
+        run_task(target_task, args.backend)
     else:
         for t in ["easy", "medium", "hard"]:
+            run_task(t, args.backend)

 if __name__ == "__main__":
     main()
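The reward logging added to `inference.py` above builds each CSV row with an f-string. A minimal sketch of the same append-with-header logic using the standard `csv` module, which would also quote a `task_id` that happens to contain a comma. The helper name `append_reward_row` and its signature are hypothetical and not part of the commit.

```python
import csv
import os
from datetime import datetime

# Column order copied from the header line written in inference.py.
FIELDS = ["timestamp", "task_id", "step", "reward",
          "compile_score", "test_ratio", "efficiency_score"]


def append_reward_row(csv_path, task_id, step, reward, components):
    """Append one episode summary row, writing the header only for a new file.

    Hypothetical helper: `components` is the `reward_components` dict from the
    last step's `info`, or an empty dict when it is missing.
    """
    write_header = not os.path.exists(csv_path)
    row = {
        "timestamp": datetime.now().isoformat(),
        "task_id": task_id,
        "step": step,
        "reward": reward,
        "compile_score": components.get("compile_score", 0.0),
        "test_ratio": components.get("test_ratio", 0.0),
        # The grader reports this component under the key "efficiency".
        "efficiency_score": components.get("efficiency", 0.0),
    }
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

Passing `newline=""` to `open` follows the `csv` module's guidance for files handed to a writer.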
openenv.yaml
CHANGED
@@ -1,7 +1,7 @@
 name: codearena-rl-benchmark
 description: "RL Benchmark for Autonomous Code Repair — iterative debugging with execution feedback"
 version: "1.0.0"
-entrypoint: server.
+entrypoint: server.app:CodeArenaEnv

 runtime:
   language: python
@@ -12,6 +12,19 @@ api:
   step: /step
   state: /state

+observation_space:
+  type: json
+  schema:
+    buggy_code: string
+    error_log: string
+    test_results: string
+    previous_attempts: list[string]
+
+action_space:
+  type: json
+  schema:
+    proposed_fix: string
+
 tasks:
   - id: easy
     path: tasks/easy.json
@@ -22,6 +35,12 @@ tasks:
   - id: hard
     path: tasks/hard.json
     grader: server.grader:grade
+  - id: type_errors
+    path: tasks/type_errors/type_error_1.json
+    grader: server.grader:grade
+  - id: security_bugs
+    path: tasks/security_bugs/security_bug_1.json
+    grader: server.grader:grade

 limits:
   step_timeout_seconds: 2
plot_rewards.py
ADDED
@@ -0,0 +1,53 @@
+import pandas as pd
+import matplotlib.pyplot as plt
+import os
+
+def main():
+    os.makedirs('results', exist_ok=True)
+
+    if not os.path.exists('rewards_log.csv'):
+        print("No rewards_log.csv found. Run inference first.")
+        return
+
+    try:
+        df = pd.read_csv('rewards_log.csv')
+    except Exception as e:
+        print(f"Error reading CSV: {e}")
+        return
+
+    if df.empty:
+        print("rewards_log.csv is empty.")
+        return
+
+    # Plot 1: Reward Curve over Training Steps (using index as training step)
+    plt.figure(figsize=(10, 6))
+    plt.plot(df.index, df['reward'], alpha=0.3, label='Episode Reward')
+
+    # 10-step rolling average
+    rolling_avg = df['reward'].rolling(window=10, min_periods=1).mean()
+    plt.plot(df.index, rolling_avg, color='red', linewidth=2, label='10-step Rolling Average')
+
+    plt.xlabel('Training Step')
+    plt.ylabel('Episode Reward (0-1)')
+    plt.title('Reward Curve')
+    plt.legend()
+    plt.grid(True, alpha=0.3)
+    plt.savefig('results/reward_curve.png')
+    plt.close()
+
+    # Plot 2: Average Reward per Task ID
+    plt.figure(figsize=(10, 6))
+    avg_per_task = df.groupby('task_id')['reward'].mean().sort_values()
+    avg_per_task.plot(kind='barh', color='skyblue')
+    plt.xlabel('Average Episode Reward (0-1)')
+    plt.ylabel('Task ID')
+    plt.title('Average Reward by Task ID')
+    plt.grid(axis='x', alpha=0.3)
+    plt.tight_layout()
+    plt.savefig('results/reward_by_task.png')
+    plt.close()
+
+    print("Plots saved to results/ directory.")
+
+if __name__ == "__main__":
+    main()
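The red curve in `plot_rewards.py` comes from `rolling(window=10, min_periods=1).mean()`. The sketch below shows, in plain Python, what that smoothing computes for finite numeric input: a trailing mean over at most `window` values that is already defined at the first point. The function `rolling_mean` is an illustrative re-implementation, not code from the commit.

```python
from collections import deque


def rolling_mean(values, window=10):
    """Trailing mean over at most `window` values, defined from the first value on.

    Intended to mirror Series.rolling(window, min_periods=1).mean() for input
    without NaNs; hypothetical helper for illustration only.
    """
    recent = deque(maxlen=window)  # the oldest value is dropped automatically
    smoothed = []
    for value in values:
        recent.append(value)
        smoothed.append(sum(recent) / len(recent))
    return smoothed
```

Because `min_periods=1`, the first few points average fewer than ten rewards, so the start of the curve is noisier than the rest.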
server/app.py
CHANGED
@@ -42,8 +42,21 @@ class CodeArenaEnv:
         self.is_done = False
         self.step_count = 0
         self.max_steps = 5
+        self.episode_rewards_history: list[float] = []

     def reset(self, task_id: str = "easy") -> CodeArenaObservation:
+        if task_id == "auto":
+            if not self.episode_rewards_history:
+                task_id = "easy"
+            else:
+                avg_reward = sum(self.episode_rewards_history) / len(self.episode_rewards_history)
+                if avg_reward < 0.4:
+                    task_id = "easy"
+                elif avg_reward <= 0.75:
+                    task_id = "medium"
+                else:
+                    task_id = "hard"
+
         # Priority: exact task_id match → difficulty match → random
         if task_id in TASK_ID_MAP:
             self.current_task = TASK_ID_MAP[task_id]
@@ -71,7 +84,13 @@ class CodeArenaEnv:
             timeout=max(self.current_task.optimal_time_seconds * 10, 2.0),
         )

-
+        base_reward, reward_components = calculate_reward(exec_result, self.current_task, action.proposed_fix)
+
+        step_penalty = 0.02 * self.step_count
+        novelty_penalty = 0.1 if action.proposed_fix in self.previous_attempts else 0.0
+
+        final_reward = base_reward - step_penalty - novelty_penalty
+        final_reward = max(0.001, min(0.999, float(final_reward)))

         self.previous_attempts.append(action.proposed_fix)
         self.last_error_log = exec_result.runtime_errors
@@ -79,14 +98,18 @@ class CodeArenaEnv:
             f"{exec_result.test_passed}/{exec_result.test_total} tests passed."
         )

-        if
+        if final_reward > 0.99 or self.step_count >= self.max_steps:
             self.is_done = True
+            self.episode_rewards_history.append(final_reward)
+            if len(self.episode_rewards_history) > 5:
+                self.episode_rewards_history.pop(0)

         info = {
             "execution_metadata": exec_result.model_dump(),
             "task_id": self.current_task.task_id,
+            "reward_components": reward_components
         }
-        return self._state(),
+        return self._state(), final_reward, self.is_done, info

     def _state(self) -> CodeArenaObservation:
         if not self.current_task:
@@ -129,6 +152,10 @@ def api_reset(body: ResetRequest = ResetRequest()):
             "status": "success",
             "message": "Environment reset successfully",
             "observation": obs.model_dump(),
+            "info": {
+                "task_id": _env.current_task.task_id if _env.current_task else "",
+                "difficulty": _env.current_task.difficulty if _env.current_task else ""
+            }
         }
     except Exception:
         traceback.print_exc()
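The `server/app.py` changes above add two pieces of logic: a difficulty choice for `task_id="auto"` based on the mean of recent episode rewards, and a shaped reward that subtracts a per-step penalty and a repeated-fix penalty before clamping. A minimal standalone sketch of both, using the thresholds and constants from the diff; the function names `pick_difficulty` and `shaped_reward` are hypothetical.

```python
def pick_difficulty(recent_rewards):
    """Map the mean of recent episode rewards to a difficulty tier.

    Thresholds 0.4 and 0.75 are the ones used by reset(task_id="auto").
    """
    if not recent_rewards:
        return "easy"
    avg = sum(recent_rewards) / len(recent_rewards)
    if avg < 0.4:
        return "easy"
    if avg <= 0.75:
        return "medium"
    return "hard"


def shaped_reward(base_reward, step_count, proposed_fix, previous_attempts):
    """Apply the step and repeated-fix penalties, then clamp to [0.001, 0.999]."""
    step_penalty = 0.02 * step_count
    novelty_penalty = 0.1 if proposed_fix in previous_attempts else 0.0
    final = base_reward - step_penalty - novelty_penalty
    return max(0.001, min(0.999, float(final)))
```

One consequence worth checking: if `step_count` is incremented before the penalty is computed (as it was in the deleted `server/env.py`; the hunks shown here do not confirm it for the new code), a base reward of at most 1.0 yields at most 0.98 on the first step, so the `final_reward > 0.99` termination branch would not fire and episodes would end only at `max_steps`.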
server/env.py
DELETED
@@ -1,116 +0,0 @@
-import random
-from fastapi import FastAPI, HTTPException
-from contextlib import asynccontextmanager
-
-from .models import CodeArenaObservation, CodeArenaAction, TaskInfo
-from .executor import run_code_with_tests
-from .grader import calculate_reward, safe_reward, force_valid_reward
-from tasks import ALL_TASKS
-
-class CodeArenaEnv:
-    def __init__(self):
-        self.tasks = ALL_TASKS
-        self.current_task: TaskInfo = None
-        self.previous_attempts = []
-        self.last_error_log = ""
-        self.last_test_results = ""
-        self.is_done = False
-        self.step_count = 0
-        self.max_steps = 5
-
-    def reset(self, task_id: str = None) -> CodeArenaObservation:
-        if task_id:
-            matched = [t for t in self.tasks if t.task_id == task_id]
-            self.current_task = matched[0] if matched else random.choice(self.tasks)
-        else:
-            self.current_task = random.choice(self.tasks)
-        self.previous_attempts = []
-        self.last_error_log = ""
-        self.last_test_results = ""
-        self.is_done = False
-        self.step_count = 0
-        return self.state()
-
-    def step(self, action: CodeArenaAction) -> tuple[CodeArenaObservation, float, bool, dict]:
-        if self.is_done:
-            raise ValueError("Environment is already done. Call reset().")
-
-        self.step_count += 1
-
-        # Execute the proposed fix with 10x optimal time as a hard timeout limit
-        exec_result = run_code_with_tests(
-            code=action.proposed_fix,
-            test_code=self.current_task.test_code,
-            timeout=max(self.current_task.optimal_time_seconds * 10, 2.0)
-        )
-
-        # Calculate Reward
-        reward = safe_reward(calculate_reward(exec_result, self.current_task))
-        reward = max(0.001, min(0.999, float(reward)))
-
-        # Update State
-        self.previous_attempts.append(action.proposed_fix)
-        self.last_error_log = exec_result.runtime_errors
-        self.last_test_results = f"{exec_result.test_passed}/{exec_result.test_total} tests passed."
-
-        # Check termination condition
-        if reward > 0.99 or self.step_count >= self.max_steps:
-            self.is_done = True
-
-        info = {
-            "execution_metadata": exec_result.model_dump(),
-            "task_id": self.current_task.task_id
-        }
-
-        return self.state(), reward, self.is_done, info
-
-    def state(self) -> CodeArenaObservation:
-        if not self.current_task:
-            raise ValueError("Environment not initialized. Call reset() first.")
-
-        return CodeArenaObservation(
-            buggy_code=self.current_task.buggy_code,
-            error_log=self.last_error_log,
-            test_results=self.last_test_results,
-            previous_attempts=self.previous_attempts,
-        )
-
-# Initialize a global environment instance for the FastAPI wrapper
-_env = CodeArenaEnv()
-
-@asynccontextmanager
-async def lifespan(app: FastAPI):
-    _env.reset()
-    yield
-
-app = FastAPI(lifespan=lifespan, title="CodeArena RL Environment")
-
-@app.post("/reset")
-def api_reset(body: dict = None):
-    task_id = (body or {}).get("task_id")
-    obs = _env.reset(task_id=task_id)
-    return {"message": "Environment reset successfully", "observation": obs.model_dump()}
-
-@app.post("/step")
-def api_step(action: CodeArenaAction):
-    try:
-        obs, reward, done, info = _env.step(action)
-        # Safety fallback before force_valid_reward
-        if reward is None:
-            reward = 0.5
-        return {
-            "observation": obs.model_dump(),
-            "reward": force_valid_reward(reward),
-            "done": done,
-            "info": info
-        }
-    except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
-
-@app.get("/state")
-def api_state():
-    try:
-        obs = _env.state()
-        return {"observation": obs.model_dump()}
-    except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
server/grader.py
CHANGED
@@ -1,6 +1,8 @@
+import os
+import json
+from openai import OpenAI
 from .models import ExecutionResult, TaskInfo

-
 def force_valid_reward(value) -> float:
     """Hard guarantee: reward is strictly in (0, 1) — never 0 or 1, no exceptions."""
     try:
@@ -16,30 +18,85 @@ def force_valid_reward(value) -> float:

     return r

-
 def safe_reward(reward) -> float:
     """Clamp reward to open interval (0, 1) via force_valid_reward."""
     if reward is None:
         reward = 0.5
     return force_valid_reward(reward)

-
 def normalize_reward(passed: int, total: int) -> float:
     if total == 0:
         return 0.5
     raw = passed / total
     return force_valid_reward(raw)

+_LLM_CACHE = {}

-def
-
-
+def get_llm_quality_score(proposed_fix: str) -> dict:
+    if proposed_fix in _LLM_CACHE:
+        return _LLM_CACHE[proposed_fix]
+
+    try:
+        client = OpenAI()
+        response = client.chat.completions.create(
+            model=os.environ.get("JUDGE_MODEL", "gpt-4o-mini"),
+            messages=[
+                {"role": "system", "content": "You are a code judge. Evaluate the provided Python code on a scale of 0.0 to 1.0 for three metrics: code_quality, security, and correctness. Respond with JSON format strictly matching: {\"code_quality\": 0.0, \"security\": 0.0, \"correctness\": 0.0}"},
+                {"role": "user", "content": proposed_fix}
+            ],
+            response_format={"type": "json_object"}
+        )
+        result = json.loads(response.choices[0].message.content)
+        _LLM_CACHE[proposed_fix] = result
+        return result
+    except Exception as e:
+        print(f"LLM judge error: {e}")
+        fallback = {"code_quality": 0.5, "security": 0.5, "correctness": 0.5}
+        _LLM_CACHE[proposed_fix] = fallback
+        return fallback
+
+def calculate_reward_components(exec_result: ExecutionResult, task_info: TaskInfo, proposed_fix: str) -> dict:
+    compile_score = 1.0 if not exec_result.runtime_errors else 0.0
+
+    test_ratio = 0.0
+    if exec_result.test_total > 0:
+        test_ratio = exec_result.test_passed / exec_result.test_total
+
+    efficiency = 0.0
+    if test_ratio == 1.0:
+        if exec_result.execution_time_seconds <= task_info.optimal_time_seconds:
+            efficiency = 1.0
+        else:
+            ratio = exec_result.execution_time_seconds / max(0.001, task_info.optimal_time_seconds)
+            efficiency = max(0.0, 1.0 - (ratio - 1.0) / 2.0)
+
+    llm_scores = get_llm_quality_score(proposed_fix)
+
+    return {
+        "compile_score": compile_score,
+        "test_ratio": test_ratio,
+        "efficiency": efficiency,
+        "llm_correctness": float(llm_scores.get("correctness", 0.5)),
+        "llm_security": float(llm_scores.get("security", 0.5)),
+        "llm_quality": float(llm_scores.get("code_quality", 0.5))
+    }

+def calculate_reward(exec_result: ExecutionResult, task_info: TaskInfo, proposed_fix: str) -> tuple[float, dict]:
+    comps = calculate_reward_components(exec_result, task_info, proposed_fix)
+    base_reward = (
+        0.25 * comps["compile_score"] +
+        0.30 * comps["test_ratio"] +
+        0.15 * comps["efficiency"] +
+        0.15 * comps["llm_correctness"] +
+        0.10 * comps["llm_security"] +
+        0.05 * comps["llm_quality"]
+    )
+    return base_reward, comps

 def grade(*args, **kwargs) -> float:
     try:
-        if len(args) ==
-            return calculate_reward(args[0], args[1])
+        if len(args) == 3:
+            return calculate_reward(args[0], args[1], args[2])[0]
         return 0.5
     except Exception:
         return 0.5
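The new `calculate_reward` in `server/grader.py` is a weighted sum of six components, and the efficiency component decays linearly once execution time exceeds the task's optimal time. A minimal sketch of that arithmetic, with the weights and the decay formula copied from the diff; the names `WEIGHTS`, `efficiency_score`, and `combine` are hypothetical, and the sketch leaves out the grader's extra rule that efficiency stays 0.0 unless every test passes.

```python
# Component weights as they appear in calculate_reward.
WEIGHTS = {
    "compile_score": 0.25,
    "test_ratio": 0.30,
    "efficiency": 0.15,
    "llm_correctness": 0.15,
    "llm_security": 0.10,
    "llm_quality": 0.05,
}


def efficiency_score(elapsed_seconds, optimal_seconds):
    """1.0 at or under the optimal time, falling linearly to 0.0 at 3x optimal."""
    if elapsed_seconds <= optimal_seconds:
        return 1.0
    ratio = elapsed_seconds / max(0.001, optimal_seconds)
    return max(0.0, 1.0 - (ratio - 1.0) / 2.0)


def combine(components):
    """Weighted sum of the six components."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
```

The weights sum to 1.0, so a fix that compiles, passes every test within the optimal time, and receives perfect judge scores has a base reward of 1.0. When the judge call fails and all three LLM scores fall back to 0.5, the same fix gets 0.25 + 0.30 + 0.15 + 0.075 + 0.05 + 0.025 = 0.85 before any step or novelty penalty.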
tasks/__init__.py
CHANGED
@@ -1,5 +1,11 @@
 from .easy import EASY_TASK
 from .medium import MEDIUM_TASK
 from .hard import HARD_TASK
+from .type_errors.type_error_1 import TASK as TE1
+from .type_errors.type_error_2 import TASK as TE2
+from .type_errors.type_error_3 import TASK as TE3
+from .security_bugs.security_bug_1 import TASK as SB1
+from .security_bugs.security_bug_2 import TASK as SB2
+from .security_bugs.security_bug_3 import TASK as SB3

-ALL_TASKS = [EASY_TASK, MEDIUM_TASK, HARD_TASK]
+ALL_TASKS = [EASY_TASK, MEDIUM_TASK, HARD_TASK, TE1, TE2, TE3, SB1, SB2, SB3]
tasks/security_bugs/security_bug_1.json
ADDED
@@ -0,0 +1,8 @@
+{
+  "task_id": "security_bugs-1",
+  "difficulty": "security_bugs",
+  "description": "Fix the function to parse JSON safely without using eval().",
+  "buggy_code": "import json\ndef parse_user_data(data_string):\n    return eval(data_string)",
+  "test_code": "\nimport unittest\nimport inspect\nclass TestSecurity1(unittest.TestCase):\n    def test_normal(self):\n        self.assertEqual(parse_user_data('{\"name\": \"alice\"}'), {\"name\": \"alice\"})\n    def test_security(self):\n        source = inspect.getsource(parse_user_data)\n        self.assertNotIn(\"eval(\", source)\n",
+  "optimal_time_seconds": 0.05
+}
tasks/security_bugs/security_bug_1.py
ADDED
@@ -0,0 +1,21 @@
+from server.models import TaskInfo
+
+TASK = TaskInfo(
+    task_id="security_bugs-1",
+    difficulty="security_bugs",
+    description="Fix the function to parse JSON safely without using eval().",
+    buggy_code="""import json
+def parse_user_data(data_string):
+    return eval(data_string)""",
+    test_code="""
+import unittest
+import inspect
+class TestSecurity1(unittest.TestCase):
+    def test_normal(self):
+        self.assertEqual(parse_user_data('{"name": "alice"}'), {"name": "alice"})
+    def test_security(self):
+        source = inspect.getsource(parse_user_data)
+        self.assertNotIn("eval(", source)
+""",
+    optimal_time_seconds=0.05
+)
tasks/security_bugs/security_bug_2.json
ADDED
@@ -0,0 +1,8 @@
+{
+  "task_id": "security_bugs-2",
+  "difficulty": "security_bugs",
+  "description": "Remove the hardcoded secret token and load it from the os.environ dictionary as 'API_TOKEN'.",
+  "buggy_code": "import os\ndef get_api_token():\n    token = \"secret_12345\"\n    return token",
+  "test_code": "\nimport unittest\nimport inspect\nimport os\nclass TestSecurity2(unittest.TestCase):\n    def test_normal(self):\n        os.environ['API_TOKEN'] = 'my_secure_token'\n        self.assertEqual(get_api_token(), 'my_secure_token')\n    def test_security(self):\n        source = inspect.getsource(get_api_token)\n        self.assertNotIn(\"secret_12345\", source)\n",
+  "optimal_time_seconds": 0.05
+}
tasks/security_bugs/security_bug_2.py
ADDED
@@ -0,0 +1,24 @@
+from server.models import TaskInfo
+
+TASK = TaskInfo(
+    task_id="security_bugs-2",
+    difficulty="security_bugs",
+    description="Remove the hardcoded secret token and load it from the os.environ dictionary as 'API_TOKEN'.",
+    buggy_code="""import os
+def get_api_token():
+    token = "secret_12345"
+    return token""",
+    test_code="""
+import unittest
+import inspect
+import os
+class TestSecurity2(unittest.TestCase):
+    def test_normal(self):
+        os.environ['API_TOKEN'] = 'my_secure_token'
+        self.assertEqual(get_api_token(), 'my_secure_token')
+    def test_security(self):
+        source = inspect.getsource(get_api_token)
+        self.assertNotIn("secret_12345", source)
+""",
+    optimal_time_seconds=0.05
+)
tasks/security_bugs/security_bug_3.json
ADDED
@@ -0,0 +1,8 @@
+{
+  "task_id": "security_bugs-3",
+  "difficulty": "security_bugs",
+  "description": "Fix the ping command to avoid shell injection. Use a list of arguments and shell=False.",
+  "buggy_code": "import subprocess\ndef ping_host(host):\n    return subprocess.check_output(f\"ping -c 1 {host}\", shell=True)",
+  "test_code": "\nimport unittest\nimport inspect\nclass TestSecurity3(unittest.TestCase):\n    def test_security(self):\n        source = inspect.getsource(ping_host)\n        self.assertNotIn(\"shell=True\", source.replace(\" \", \"\"))\n        self.assertIn(\"[\", source)\n",
+  "optimal_time_seconds": 0.05
+}
tasks/security_bugs/security_bug_3.py
ADDED
@@ -0,0 +1,20 @@
+from server.models import TaskInfo
+
+TASK = TaskInfo(
+    task_id="security_bugs-3",
+    difficulty="security_bugs",
+    description="Fix the ping command to avoid shell injection. Use a list of arguments and shell=False.",
+    buggy_code="""import subprocess
+def ping_host(host):
+    return subprocess.check_output(f"ping -c 1 {host}", shell=True)""",
+    test_code="""
+import unittest
+import inspect
+class TestSecurity3(unittest.TestCase):
+    def test_security(self):
+        source = inspect.getsource(ping_host)
+        self.assertNotIn("shell=True", source.replace(" ", ""))
+        self.assertIn("[", source)
+""",
+    optimal_time_seconds=0.05
+)
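For orientation, here is a hedged sketch of fixes an agent could propose for the three `security_bugs` tasks above. These are one plausible set of answers written against the task descriptions, not reference solutions shipped with the commit.

```python
import json
import os
import subprocess


def parse_user_data(data_string):
    # json.loads only parses data; it never runs code from the input string.
    return json.loads(data_string)


def get_api_token():
    # Read the token from the environment instead of embedding it in source.
    return os.environ["API_TOKEN"]


def ping_host(host):
    # An argument list with the default shell=False passes `host` to ping as a
    # single argv entry, so shell metacharacters in it are not interpreted.
    return subprocess.check_output(["ping", "-c", "1", host])
```

Note that the tests for these tasks inspect the function source text (for example, checking that a given substring is absent), so they constrain how the fix is spelled as well as how it behaves. The `-c` flag follows the task's original command and assumes a Unix-style `ping`.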
tasks/type_errors/type_error_1.json
ADDED
@@ -0,0 +1,8 @@
+{
+  "task_id": "type_errors-1",
+  "difficulty": "type_errors",
+  "description": "Fix the function to sum a list of numbers that might be passed as strings. It currently tries to add int and str.",
+  "buggy_code": "def sum_all(items):\n    total = 0\n    for item in items:\n        total = total + item\n    return total",
+  "test_code": "\nimport unittest\nclass TestTypeError1(unittest.TestCase):\n    def test_normal(self):\n        self.assertEqual(sum_all([1, 2, 3]), 6)\n    def test_strings(self):\n        self.assertEqual(sum_all(['1', '2', '3']), 6)\n    def test_mixed(self):\n        self.assertEqual(sum_all([1, '2', 3]), 6)\n",
+  "optimal_time_seconds": 0.05
+}
tasks/type_errors/type_error_1.py
ADDED
@@ -0,0 +1,23 @@
+from server.models import TaskInfo
+
+TASK = TaskInfo(
+    task_id="type_errors-1",
+    difficulty="type_errors",
+    description="Fix the function to sum a list of numbers that might be passed as strings. It currently tries to add int and str.",
+    buggy_code="""def sum_all(items):
+    total = 0
+    for item in items:
+        total = total + item
+    return total""",
+    test_code="""
+import unittest
+class TestTypeError1(unittest.TestCase):
+    def test_normal(self):
+        self.assertEqual(sum_all([1, 2, 3]), 6)
+    def test_strings(self):
+        self.assertEqual(sum_all(['1', '2', '3']), 6)
+    def test_mixed(self):
+        self.assertEqual(sum_all([1, '2', 3]), 6)
+""",
+    optimal_time_seconds=0.05
+)
tasks/type_errors/type_error_2.json
ADDED
@@ -0,0 +1,8 @@
+{
+  "task_id": "type_errors-2",
+  "difficulty": "type_errors",
+  "description": "Fix the function to count frequencies. It incorrectly calls .append() on a dict.",
+  "buggy_code": "def count_frequencies(words):\n    counts = {}\n    for word in words:\n        if word not in counts:\n            counts.append({word: 1})\n        else:\n            counts[word] += 1\n    return counts",
+  "test_code": "\nimport unittest\nclass TestTypeError2(unittest.TestCase):\n    def test_normal(self):\n        self.assertEqual(count_frequencies(['apple', 'banana', 'apple']), {'apple': 2, 'banana': 1})\n    def test_empty(self):\n        self.assertEqual(count_frequencies([]), {})\n",
+  "optimal_time_seconds": 0.05
+}
tasks/type_errors/type_error_2.py
ADDED
@@ -0,0 +1,24 @@
+from server.models import TaskInfo
+
+TASK = TaskInfo(
+    task_id="type_errors-2",
+    difficulty="type_errors",
+    description="Fix the function to count frequencies. It incorrectly calls .append() on a dict.",
+    buggy_code="""def count_frequencies(words):
+    counts = {}
+    for word in words:
+        if word not in counts:
+            counts.append({word: 1})
+        else:
+            counts[word] += 1
+    return counts""",
+    test_code="""
+import unittest
+class TestTypeError2(unittest.TestCase):
+    def test_normal(self):
+        self.assertEqual(count_frequencies(['apple', 'banana', 'apple']), {'apple': 2, 'banana': 1})
+    def test_empty(self):
+        self.assertEqual(count_frequencies([]), {})
+""",
+    optimal_time_seconds=0.05
+)
tasks/type_errors/type_error_3.json
ADDED
@@ -0,0 +1,8 @@
+{
+  "task_id": "type_errors-3",
+  "difficulty": "type_errors",
+  "description": "Fix the function to format names. It incorrectly calls .upper() on an int ID.",
+  "buggy_code": "def format_records(records):\n    formatted = []\n    for user_id, name in records:\n        formatted.append(f\"{user_id.upper()} - {name.upper()}\")\n    return formatted",
+  "test_code": "\nimport unittest\nclass TestTypeError3(unittest.TestCase):\n    def test_normal(self):\n        self.assertEqual(format_records([(1, 'alice'), (2, 'bob')]), ['1 - ALICE', '2 - BOB'])\n",
+  "optimal_time_seconds": 0.05
+}
tasks/type_errors/type_error_3.py
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from server.models import TaskInfo
|
| 2 |
+
|
| 3 |
+
TASK = TaskInfo(
|
| 4 |
+
task_id="type_errors-3",
|
| 5 |
+
difficulty="type_errors",
|
| 6 |
+
description="Fix the function to format names. It incorrectly calls .upper() on an int ID.",
|
| 7 |
+
buggy_code="""def format_records(records):
|
| 8 |
+
formatted = []
|
| 9 |
+
for user_id, name in records:
|
| 10 |
+
formatted.append(f"{user_id.upper()} - {name.upper()}")
|
| 11 |
+
return formatted""",
|
| 12 |
+
test_code="""
|
| 13 |
+
import unittest
|
| 14 |
+
class TestTypeError3(unittest.TestCase):
|
| 15 |
+
def test_normal(self):
|
| 16 |
+
self.assertEqual(format_records([(1, 'alice'), (2, 'bob')]), ['1 - ALICE', '2 - BOB'])
|
| 17 |
+
""",
|
| 18 |
+
optimal_time_seconds=0.05
|
| 19 |
+
)
|
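A minimal sketch of a fix that would satisfy the test in this task, assuming `user_id` is always an `int` as the description states. This is an illustration, not the repository's reference answer.

```python
def format_records(records):
    formatted = []
    for user_id, name in records:
        # ints have no .upper(); the f-string converts user_id with str() implicitly.
        formatted.append(f"{user_id} - {name.upper()}")
    return formatted
```

Only the name is upper-cased; the ID is formatted as-is, which matches the expected `'1 - ALICE'` output in `test_normal`.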
train_grpo.ipynb
ADDED
|
@@ -0,0 +1,138 @@
|
| 1 |
+
{
|
| 2 |
+
"cells": [
|
| 3 |
+
{
|
| 4 |
+
"cell_type": "markdown",
|
| 5 |
+
"metadata": {},
|
| 6 |
+
"source": [
|
| 7 |
+
"# GRPO Training with CodeArena RL Benchmark\n",
|
| 8 |
+
"\n",
|
| 9 |
+
"This notebook demonstrates how to connect our custom `codearena-rl-benchmark` environment to HuggingFace's `trl.GRPOTrainer`."
|
| 10 |
+
]
|
| 11 |
+
},
|
| 12 |
+
{
|
| 13 |
+
"cell_type": "code",
|
| 14 |
+
"execution_count": null,
|
| 15 |
+
"metadata": {},
|
| 16 |
+
"outputs": [],
|
| 17 |
+
"source": [
|
| 18 |
+
"!pip install trl transformers datasets openenv-py httpx\n",
|
| 19 |
+
"!git clone https://github.com/havinashpatil/meta.git\n",
|
| 20 |
+
"!cd meta && pip install -r requirements.txt"
|
| 21 |
+
]
|
| 22 |
+
},
|
| 23 |
+
{
|
| 24 |
+
"cell_type": "code",
|
| 25 |
+
"execution_count": null,
|
| 26 |
+
"metadata": {},
|
| 27 |
+
"outputs": [],
|
| 28 |
+
"source": [
|
| 29 |
+
"import torch\n",
|
| 30 |
+
"from datasets import Dataset\n",
|
| 31 |
+
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
|
| 32 |
+
"from trl import GRPOConfig, GRPOTrainer\n",
|
| 33 |
+
"import httpx\n",
|
| 34 |
+
"\n",
|
| 35 |
+
"# Start the backend server in the background (Colab trick)\n",
|
| 36 |
+
"import subprocess\n",
|
| 37 |
+
"import time\n",
|
| 38 |
+
"subprocess.Popen([\"uvicorn\", \"server.app:app\", \"--port\", \"7860\", \"--app-dir\", \"meta\"])\n",
|
| 39 |
+
"time.sleep(5) # Wait for server to start"
|
| 40 |
+
]
|
| 41 |
+
},
|
| 42 |
+
{
|
| 43 |
+
"cell_type": "code",
|
| 44 |
+
"execution_count": null,
|
| 45 |
+
"metadata": {},
|
| 46 |
+
"outputs": [],
|
| 47 |
+
"source": [
|
| 48 |
+
"def codearena_reward_func(completions, prompts=None, **kwargs):\n",
|
| 49 |
+
" \"\"\"\n",
|
| 50 |
+
" Reward function that queries the CodeArena OpenEnv server.\n",
|
| 51 |
+
" For each proposed fix in `completions`, we step the environment.\n",
|
| 52 |
+
" \"\"\"\n",
|
| 53 |
+
" rewards = []\n",
|
| 54 |
+
" for completion in completions:\n",
|
| 55 |
+
" # Clean the generated code\n",
|
| 56 |
+
"        proposed_fix = (completion if isinstance(completion, str) else completion[0].get('content', '')).strip()\n",
|
| 57 |
+
" if proposed_fix.startswith('```python'):\n",
|
| 58 |
+
" proposed_fix = proposed_fix[9:].replace('```', '').strip()\n",
|
| 59 |
+
" \n",
|
| 60 |
+
" try:\n",
|
| 61 |
+
" # Step the environment\n",
|
| 62 |
+
" res = httpx.post(\n",
|
| 63 |
+
" \"http://localhost:7860/step\",\n",
|
| 64 |
+
" json={\"proposed_fix\": proposed_fix},\n",
|
| 65 |
+
" timeout=10.0\n",
|
| 66 |
+
" )\n",
|
| 67 |
+
" res.raise_for_status()\n",
|
| 68 |
+
" reward = res.json().get('reward', 0.0)\n",
|
| 69 |
+
" rewards.append(reward)\n",
|
| 70 |
+
" except Exception as e:\n",
|
| 71 |
+
" print(f\"Env Error: {e}\")\n",
|
| 72 |
+
" rewards.append(0.0)\n",
|
| 73 |
+
" \n",
|
| 74 |
+
" return rewards"
|
| 75 |
+
]
|
| 76 |
+
},
|
| 77 |
+
{
|
| 78 |
+
"cell_type": "code",
|
| 79 |
+
"execution_count": null,
|
| 80 |
+
"metadata": {},
|
| 81 |
+
"outputs": [],
|
| 82 |
+
"source": [
|
| 83 |
+
"# Load Model\n",
|
| 84 |
+
"model_name = \"Qwen/Qwen2.5-Coder-1.5B\"\n",
|
| 85 |
+
"model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map=\"auto\")\n",
|
| 86 |
+
"tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
|
| 87 |
+
"tokenizer.pad_token = tokenizer.eos_token\n",
|
| 88 |
+
"\n",
|
| 89 |
+
"# Sample training dataset (prompts extracted from tasks)\n",
|
| 90 |
+
"# In a real setup, you'd reset the env for each prompt to get the initial buggy_code.\n",
|
| 91 |
+
"dataset = Dataset.from_dict({\n",
|
| 92 |
+
" \"prompt\": [\n",
|
| 93 |
+
" \"Fix this Python code:\\ndef average_list(numbers)\\n if length(numbers) == 0:\\n return 0\\n return sum(numbers) / length(numbers)\"\n",
|
| 94 |
+
" ]\n",
|
| 95 |
+
"})\n",
|
| 96 |
+
"\n",
|
| 97 |
+
"# Initialize GRPO Trainer\n",
|
| 98 |
+
"training_args = GRPOConfig(\n",
|
| 99 |
+
" output_dir=\"./codearena-grpo\",\n",
|
| 100 |
+
" learning_rate=1e-5,\n",
|
| 101 |
+
" max_steps=50,\n",
|
| 102 |
+
" per_device_train_batch_size=2,\n",
|
| 103 |
+
" gradient_accumulation_steps=2,\n",
|
| 104 |
+
")\n",
|
| 105 |
+
"\n",
|
| 106 |
+
"trainer = GRPOTrainer(\n",
|
| 107 |
+
" model=model,\n",
|
| 108 |
+
" reward_funcs=codearena_reward_func,\n",
|
| 109 |
+
" args=training_args,\n",
|
| 110 |
+
" train_dataset=dataset,\n",
|
| 111 |
+
")\n",
|
| 112 |
+
"\n",
|
| 113 |
+
"trainer.train()"
|
| 114 |
+
]
|
| 115 |
+
}
|
| 116 |
+
],
|
| 117 |
+
"metadata": {
|
| 118 |
+
"kernelspec": {
|
| 119 |
+
"display_name": "Python 3",
|
| 120 |
+
"language": "python",
|
| 121 |
+
"name": "python3"
|
| 122 |
+
},
|
| 123 |
+
"language_info": {
|
| 124 |
+
"codemirror_mode": {
|
| 125 |
+
"name": "ipython",
|
| 126 |
+
"version": 3
|
| 127 |
+
},
|
| 128 |
+
"file_extension": ".py",
|
| 129 |
+
"mimetype": "text/x-python",
|
| 130 |
+
"name": "python",
|
| 131 |
+
"nbconvert_exporter": "python",
|
| 132 |
+
"pygments_lexer": "ipython3",
|
| 133 |
+
"version": "3.10.12"
|
| 134 |
+
}
|
| 135 |
+
},
|
| 136 |
+
"nbformat": 4,
|
| 137 |
+
"nbformat_minor": 4
|
| 138 |
+
}
|
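The reward function in the notebook can be exercised without a running server by separating code extraction from the environment call. The sketch below is a hedged illustration: `extract_code`, `make_reward_func`, and `step_fn` are hypothetical names that do not appear in the commit, and `step_fn` stands in for the `httpx.post(".../step")` request. It assumes completions arrive either as plain strings (when prompts are plain strings) or as chat-style message lists.

```python
import re

# Matches an optional ```python / ``` fence wrapped around the generated code.
_FENCE = re.compile(r"^```(?:python)?\s*\n(.*?)\n?```\s*$", re.DOTALL)


def extract_code(completion):
    """Return the code text from a plain-string or chat-style completion."""
    if isinstance(completion, str):
        text = completion
    else:
        # Chat-style: a list of {"role": ..., "content": ...} messages.
        text = completion[0].get("content", "")
    text = text.strip()
    match = _FENCE.match(text)
    return match.group(1).strip() if match else text


def make_reward_func(step_fn):
    """Build a reward function; step_fn(code) is assumed to return a numeric reward."""
    def reward_func(completions, **kwargs):
        rewards = []
        for completion in completions:
            try:
                rewards.append(float(step_fn(extract_code(completion))))
            except Exception as exc:
                # Mirror the notebook: log the failure and fall back to zero reward.
                print(f"Env error: {exc}")
                rewards.append(0.0)
        return rewards
    return reward_func
```

In the notebook setting, `step_fn` would wrap the POST to `http://localhost:7860/step` and return the `reward` field of the JSON response; in tests it can be any local callable, which keeps the extraction logic checkable offline.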