---
title: OpenEnv Code Debugger
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
app_port: 7860
---
# Code Debug OpenEnv
An **OpenEnv-compatible environment** for the Meta x PyTorch Hackathon where an AI agent debugs broken Python code.
## Overview
The agent receives buggy Python code and test descriptions, submits fixes, and is rewarded by the fraction of tests passing (0.0 to 1.0). The episode ends when all tests pass or the step limit is reached.
## Tasks
| Task | Difficulty | Bug Type |
|------|-----------|----------|
| task_001_off_by_one | Easy | Fibonacci returns wrong variable |
| task_002_wrong_operator | Easy | `<` instead of `>` in find_max |
| task_003_mutable_default | Medium | Mutable default argument in list builder |
| task_004_scope_bug | Medium | Closure captures loop variable by reference |
| task_005_binary_search | Hard | Binary search boundary bugs |
| task_006_graph_cycle | Hard | DFS cycle detection missing recursion stack |
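
The task files are not reproduced in this README, so the snippet below is only an illustration of two of the bug classes listed above (mutable default argument and loop-variable closure capture). All function names are hypothetical and are not taken from the actual task definitions.

```python
def append_item_buggy(item, items=[]):
    # The default list is created once, at definition time, and is
    # shared by every call that omits `items`.
    items.append(item)
    return items


def append_item_fixed(item, items=None):
    # A fresh list is created on each call when none is supplied.
    if items is None:
        items = []
    items.append(item)
    return items


def make_multipliers_buggy():
    # Each lambda looks up `i` when it is called, after the loop has
    # finished, so all of them see the final value of `i`.
    return [lambda x: x * i for i in range(3)]


def make_multipliers_fixed():
    # Binding `i` as a default argument captures its value per iteration.
    return [lambda x, i=i: x * i for i in range(3)]


print([f(10) for f in make_multipliers_buggy()])  # [20, 20, 20]
print([f(10) for f in make_multipliers_fixed()])  # [0, 10, 20]
print(append_item_fixed(1), append_item_fixed(2))  # [1] [2]
```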
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Health check |
| GET | `/tasks` | List all available tasks |
| POST | `/reset` | Start a new episode |
| POST | `/step/{episode_id}` | Submit fixed code |
| GET | `/state/{episode_id}` | Get episode metadata |
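
A minimal sketch of how a client might drive one episode through `/reset` and `/step/{episode_id}`. The request and response schemas are not documented here, so the JSON field names (`task_id`, `episode_id`, `code`, `reward`, `done`) and the helpers `post_json` and `run_episode` are assumptions made for illustration; check `models.py` for the real fields.

```python
import json
from typing import Callable, List
from urllib import request


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response (standard library only)."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_episode(
    base_url: str,
    task_id: str,
    propose_fix: Callable[[dict], str],
    post: Callable[[str, dict], dict] = post_json,
    max_steps: int = 5,
) -> List[float]:
    """Reset, then submit fixes until the server reports done or steps run out."""
    # Assumed field names: task_id, episode_id, code, reward, done.
    obs = post(f"{base_url}/reset", {"task_id": task_id})
    episode_id = obs["episode_id"]
    rewards = []
    for _ in range(max_steps):
        obs = post(f"{base_url}/step/{episode_id}", {"code": propose_fix(obs)})
        rewards.append(obs["reward"])
        if obs["done"]:
            break
    return rewards
```

`propose_fix` is where a model call would go; passing a custom `post` callable makes the loop easy to exercise without a running server.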
## Reward
```
reward = tests_passed / total_tests  # range: 0.0 to 1.0
done = reward == 1.0 or step_count >= max_steps
```
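The rule above can be written as two small Python functions. This is a sketch of the formula only, not the code in `environment.py`; the function names and the guard against an empty test suite are assumptions.

```python
def compute_reward(tests_passed: int, total_tests: int) -> float:
    """Fraction of tests passing, in the range 0.0 to 1.0."""
    if total_tests <= 0:
        # Assumed behaviour: a task without tests is treated as an error.
        raise ValueError("total_tests must be positive")
    return tests_passed / total_tests


def is_done(reward: float, step_count: int, max_steps: int) -> bool:
    """The episode ends on a perfect score or when the step limit is hit."""
    return reward == 1.0 or step_count >= max_steps


print(compute_reward(3, 4))      # 0.75
print(is_done(0.75, 2, 5))       # False
print(is_done(1.0, 1, 5))        # True
```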
## Setup & Run
### Local (development)
```bash
pip install fastapi uvicorn pydantic httpx openai
cd Desktop/Meta
uvicorn code_debug_env.server.app:app --host 0.0.0.0 --port 7860 --reload
```
### Docker
```bash
cd Desktop/Meta/code_debug_env
docker build -t code-debug-env -f server/Dockerfile ..
docker run -p 7860:7860 code-debug-env
```
## Inference Script
```bash
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export HF_TOKEN="your_token"
export ENV_URL="http://localhost:7860" # or HF Space URL
python inference.py
```
### Expected output format
```
[START] task=task_001_off_by_one env=http://localhost:7860 model=Qwen/Qwen2.5-72B-Instruct
[STEP] step=1 action='def fib...' reward=1.00 done=true error=null
[END] success=true steps=1 score=1.000 rewards=1.00
```
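As a rough sketch, a `[STEP]` line in this shape could be produced by a helper like the one below. The function name, the 7-character action preview, and the rendering of a missing error as `null` are inferred from the sample output and may not match `inference.py`.

```python
from typing import Optional


def format_step(
    step: int,
    action: str,
    reward: float,
    done: bool,
    error: Optional[str] = None,
    preview_len: int = 7,  # assumed truncation length
) -> str:
    """Render one [STEP] log line in the format shown above."""
    preview = action[:preview_len] + "..." if len(action) > preview_len else action
    return (
        f"[STEP] step={step} action={preview!r} reward={reward:.2f} "
        f"done={str(done).lower()} error={error if error is not None else 'null'}"
    )


print(format_step(1, "def fib(n):\n    return n", 1.0, True))
```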
## Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `API_BASE_URL` | Yes | LLM API endpoint |
| `MODEL_NAME` | Yes | Model identifier |
| `HF_TOKEN` | Yes | Hugging Face / API key |
| `ENV_URL` | No | OpenEnv server URL (default: `http://localhost:7860`) |
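
One way a script could read these variables is sketched below, failing early when a required one is missing. `InferenceConfig` and `load_config` are hypothetical names; the actual handling in `inference.py` may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("API_BASE_URL", "MODEL_NAME", "HF_TOKEN")


@dataclass(frozen=True)
class InferenceConfig:
    api_base_url: str
    model_name: str
    hf_token: str
    env_url: str


def load_config(environ: Mapping[str, str] = os.environ) -> InferenceConfig:
    """Read the variables from the table above; ENV_URL falls back to its default."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return InferenceConfig(
        api_base_url=environ["API_BASE_URL"],
        model_name=environ["MODEL_NAME"],
        hf_token=environ["HF_TOKEN"],
        env_url=environ.get("ENV_URL", "http://localhost:7860"),
    )
```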
## Project Structure
```
code_debug_env/
├── models.py            # Pydantic models (DebugAction, DebugObservation, DebugState)
├── client.py            # HTTP client wrapper
├── openenv.yaml         # Environment manifest
├── pyproject.toml       # Package metadata
├── tasks/               # Task definitions (JSON)
│   ├── easy/
│   ├── medium/
│   └── hard/
└── server/
    ├── environment.py   # Core logic (reset/step/state)
    ├── executor.py      # Safe subprocess code runner
    ├── app.py           # FastAPI server
    └── Dockerfile
inference.py             # Root-level inference script
```
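The contents of `executor.py` are not shown here. As a rough idea of what a subprocess-based runner can look like, the sketch below runs submitted code in a separate interpreter with a timeout. `RunResult` and `run_code` are hypothetical names, and a timeout plus isolated mode (`-I`) is an assumption about the approach; on its own it is not a full sandbox.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    stdout: str
    stderr: str
    returncode: Optional[int]  # None when the process timed out
    timed_out: bool


def run_code(code: str, timeout: float = 5.0) -> RunResult:
    """Run `code` in a child Python process and capture its output."""
    try:
        proc = subprocess.run(
            # -I: isolated mode, ignores PYTHON* env vars and the user site dir.
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return RunResult(stdout="", stderr="", returncode=None, timed_out=True)
    return RunResult(proc.stdout, proc.stderr, proc.returncode, timed_out=False)


result = run_code("print(1 + 1)")
print(result.stdout.strip(), result.returncode)
```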