---
title: Code Debugging Challenge
emoji: 🐛
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
tags:
  - openenv
  - reinforcement-learning
  - code-debugging
  - agentic-ai
---

# 🐛 Code Debugging Challenge - OpenEnv Environment

A production-ready OpenEnv environment where AI agents learn to debug Python code.

## 🎯 Overview

This environment challenges AI agents to identify and fix bugs in Python code snippets using the official **OpenEnv framework** from Meta-PyTorch and Hugging Face.

**Key Features:**
- ✅ Built with official OpenEnv library
- ✅ WebSocket-based client-server architecture
- ✅ Docker containerized for isolation
- ✅ Compatible with TRL, Torchforge, and other RL frameworks
- ✅ Production-ready with proper session management

## 🏗️ Environment Details

- **Action Space**: 4 discrete actions (analyze, fix, test, submit)
- **Observation Space**: Structured observations with code, errors, and feedback
- **Reward Structure**: 
  - +1.0 for successful fix
  - -0.2 to -0.5 for failed attempts
  - +0.1 for analysis actions
  - -1.0 for premature submission
- **Episode Length**: Max 5 attempts per bug
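
The reward structure above can be sketched as a small function. This is an illustration only, not the environment's actual implementation: the names `ActionType`, `MAX_ATTEMPTS`, and `sketch_reward` are hypothetical, and the linear scaling of the failure penalty between -0.2 and -0.5 is an assumption, since the list only gives the range.

```python
from enum import Enum


class ActionType(str, Enum):
    """The four discrete actions listed above."""
    ANALYZE = "analyze"
    FIX = "fix"
    TEST = "test"
    SUBMIT = "submit"


MAX_ATTEMPTS = 5  # "Max 5 attempts per bug"


def sketch_reward(action: ActionType, tests_pass: bool, attempt: int) -> float:
    """Hypothetical reward rule that mirrors the documented values."""
    if action is ActionType.ANALYZE:
        return 0.1  # small bonus for analysis actions
    if action is ActionType.SUBMIT and not tests_pass:
        return -1.0  # premature submission
    if tests_pass:
        return 1.0  # successful fix
    # Assumption: the penalty grows linearly from -0.2 on the first
    # attempt to -0.5 on the last one.
    step = (0.5 - 0.2) / (MAX_ATTEMPTS - 1)
    return -(0.2 + step * (min(attempt, MAX_ATTEMPTS) - 1))
```

Under these assumptions, a failed `fix` on the first attempt costs 0.2, while the same failure on the fifth attempt costs 0.5.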

## 🐞 Bug Types Included

1. **Argument Count Errors** - Wrong number of function arguments
2. **Logic Errors** - Incorrect loop variables and conditions
3. **Exception Handling** - Missing error handling for edge cases
4. **Index Errors** - List/string index out of bounds
5. **Infinite Recursion** - Recursive calls that never progress toward the base case
6. **Type Errors** - String/integer concatenation issues
7. **Key Errors** - Missing dictionary keys
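
To make a few of these categories concrete, here is a minimal sketch of buggy/fixed pairs. The snippets are hypothetical illustrations written for this README; they are not taken from the environment's challenge set.

```python
# Type error: concatenating str and int raises TypeError.
def label_buggy(count):
    return "Items: " + count


def label_fixed(count):
    return "Items: " + str(count)


# Key error: indexing a missing key raises KeyError.
def get_age_buggy(user):
    return user["age"]


def get_age_fixed(user, default=None):
    return user.get("age", default)


# Infinite recursion: the argument never moves toward the base case.
def factorial_buggy(n):
    if n == 0:
        return 1
    return n * factorial_buggy(n)  # bug: should recurse on n - 1


def factorial_fixed(n):
    if n == 0:
        return 1
    return n * factorial_fixed(n - 1)
```

In each pair the fix is a one-line change, which is roughly the scale of edit an agent would be expected to produce.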

## 🚀 Quick Start

### Using Docker (Recommended)

```python
from code_debug_env.client import DebugEnv
from code_debug_env.models import DebugAction

# Automatically starts Docker container and connects
env = DebugEnv.from_hub("openenv/code-debug-env")

# Reset to get the first challenge
result = env.reset()
print(result.observation.buggy_code)
print(f"Expected output: {result.observation.expected_output}")

# Take an action: one of "analyze", "fix", "test", "submit"
action = DebugAction(action_type="test")
result = env.step(action)
print(f"Reward: {result.reward}")

# Cleanup
env.close()
```
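
The single step above extends naturally to a full episode. The following is a minimal sketch under stated assumptions: `run_episode` and `choose_action` are hypothetical names, and the `done` field on the step result is assumed rather than confirmed by this README, so it is read defensively with `getattr`. The default of 5 steps follows the episode length given under Environment Details.

```python
def run_episode(env, choose_action, max_steps=5):
    """Roll out one episode and return the total reward.

    `env` is any object with reset() and step(action) whose results carry
    `.observation` and `.reward`. `.done` is an assumed field.
    `choose_action` is a hypothetical callable: observation -> action.
    """
    result = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = choose_action(result.observation)
        result = env.step(action)
        total += result.reward or 0.0  # treat a missing reward as 0
        if getattr(result, "done", False):
            break
    return total
```

A policy can then be plugged in as `run_episode(env, my_policy)`, where `my_policy` returns a `DebugAction` for each observation.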

## 🔧 Integration with RL Frameworks

### With TRL (Transformer Reinforcement Learning)

```python
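from trl import OnlineDPOConfig, OnlineDPOTrainer
from code_debug_env.client import DebugEnv

# Schematic only: the keyword arguments below are illustrative and may not
# match the trainer signature in your installed TRL version.
config = OnlineDPOConfig(...)
trainer = OnlineDPOTrainer(
    config=config,
    env=DebugEnv.from_hub("openenv/code-debug-env"),
    # ... other args
)
trainer.train()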
```

## 🏆 OpenEnv Challenge Submission

This environment is submitted to the **OpenEnv Challenge: SOTA Environments to Drive General Intelligence** (UC Berkeley AgentBeats Competition).

## 📜 License

Apache 2.0