---
title: Code Debugging Challenge
emoji: 🐍
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
tags:
  - openenv
  - reinforcement-learning
  - code-debugging
  - agentic-ai
---
# Code Debugging Challenge - OpenEnv Environment

A production-ready OpenEnv environment where AI agents learn to debug Python code.
## Overview

This environment challenges AI agents to identify and fix bugs in Python code snippets using the official OpenEnv framework from Meta-PyTorch and Hugging Face.
**Key Features:**

- Built with the official OpenEnv library
- WebSocket-based client-server architecture
- Docker-containerized for isolation
- Compatible with TRL, Torchforge, and other RL frameworks
- Production-ready with proper session management
## Environment Details

- **Action Space:** 4 discrete actions (`analyze`, `fix`, `test`, `submit`)
- **Observation Space:** structured observations with code, errors, and feedback
- **Reward Structure:**
  - +1.0 for a successful fix
  - -0.2 to -0.5 for failed attempts
  - +0.1 for analysis actions
  - -1.0 for premature submission
- **Episode Length:** at most 5 attempts per bug
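The reward structure above can be sketched as a shaping function. This is a minimal illustration only; `compute_reward` and its signature are hypothetical, not the environment's actual API, and the exact scaling of failure penalties across attempts is an assumption:

```python
def compute_reward(action_type: str, fix_passed: bool,
                   attempt: int, max_attempts: int = 5) -> float:
    """Illustrative reward shaping for the structure described above.

    Assumption: failed fix/test attempts scale linearly from -0.2
    (first attempt) down to -0.5 (last attempt).
    """
    if action_type == "submit":
        # Submitting without a working fix is penalized hardest.
        return 1.0 if fix_passed else -1.0
    if action_type == "analyze":
        return 0.1
    if action_type in ("fix", "test"):
        if fix_passed:
            return 1.0
        return -0.2 - 0.3 * (attempt - 1) / (max_attempts - 1)
    return 0.0
```

A schedule like this rewards cheap information-gathering (`analyze`) while making blind submissions costlier than honest failed attempts.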
## Bug Types Included

- **Argument Count Errors** - wrong number of function arguments
- **Logic Errors** - incorrect loop variables and conditions
- **Exception Handling** - missing error handling for edge cases
- **Index Errors** - array/string index out of bounds
- **Infinite Recursion** - recursive calls that never reduce toward a base case
- **Type Errors** - string/integer concatenation issues
- **Key Errors** - missing dictionary keys
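As an illustration of the Type Errors category, a challenge might pair a buggy snippet with its fix. The functions below are hypothetical examples, not actual challenges shipped with the environment:

```python
# Buggy: concatenating a str with an int raises TypeError.
def describe_age_buggy(age):
    return "Age: " + age

# Fixed: convert the int explicitly before concatenation.
def describe_age_fixed(age):
    return "Age: " + str(age)
```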
## Quick Start

### Using Docker (Recommended)

```python
from code_debug_env.client import DebugEnv
from code_debug_env.models import DebugAction

# Automatically starts the Docker container and connects
env = DebugEnv.from_hub("openenv/code-debug-env")

# Reset to get the first challenge
result = env.reset()
print(result.observation.buggy_code)
print(f"Expected output: {result.observation.expected_output}")

# Take an action
action = DebugAction(action_type="test")
result = env.step(action)
print(f"Reward: {result.reward}")

# Clean up
env.close()
```
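The `reset`/`step`/`close` calls above follow the standard agent-environment loop. The sketch below shows that loop pattern with a self-contained stub standing in for the real `DebugEnv`; the stub's behavior (solving after a fixed number of attempts) is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class StepResult:
    """Minimal stand-in for the result object returned by the env."""
    reward: float
    done: bool

class StubEnv:
    """Hypothetical stub, NOT the real DebugEnv: solves after N attempts."""
    def __init__(self, attempts_needed: int = 3, max_attempts: int = 5):
        self.attempts_needed = attempts_needed
        self.max_attempts = max_attempts
        self.attempts = 0

    def reset(self) -> StepResult:
        self.attempts = 0
        return StepResult(reward=0.0, done=False)

    def step(self, action_type: str) -> StepResult:
        self.attempts += 1
        solved = self.attempts >= self.attempts_needed
        return StepResult(
            reward=1.0 if solved else -0.2,
            done=solved or self.attempts >= self.max_attempts,
        )

def run_episode(env) -> float:
    """Drive one episode to completion and return the total reward."""
    env.reset()
    total, done = 0.0, False
    while not done:
        result = env.step("fix")
        total += result.reward
        done = result.done
    return total
```

Swapping `StubEnv` for `DebugEnv.from_hub(...)` (and a real policy for the hard-coded `"fix"` action) gives the same loop against the live environment.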
## Integration with RL Frameworks

### With TRL (Transformer Reinforcement Learning)

```python
from trl import OnlineDPOConfig, OnlineDPOTrainer

from code_debug_env.client import DebugEnv

config = OnlineDPOConfig(...)
trainer = OnlineDPOTrainer(
    config=config,
    env=DebugEnv.from_hub("openenv/code-debug-env"),
    # ... other args
)
trainer.train()
```
## OpenEnv Challenge Submission

This environment is submitted to the OpenEnv Challenge: SOTA Environments to Drive General Intelligence (UC Berkeley AgentBeats Competition).
## License

Apache 2.0