---
title: Code Debugging Challenge
emoji: 🐛
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
tags:
  - openenv
  - reinforcement-learning
  - code-debugging
  - agentic-ai
---

# 🐛 Code Debugging Challenge - OpenEnv Environment

A production-ready OpenEnv environment where AI agents learn to debug Python code.

## 🎯 Overview

This environment challenges AI agents to identify and fix bugs in Python code snippets using the official **OpenEnv framework** from Meta-PyTorch and Hugging Face.

**Key Features:**

- ✅ Built with official OpenEnv library
- ✅ WebSocket-based client-server architecture
- ✅ Docker containerized for isolation
- ✅ Compatible with TRL, Torchforge, and other RL frameworks
- ✅ Production-ready with proper session management

## 🏗️ Environment Details

- **Action Space**: 4 discrete actions (analyze, fix, test, submit)
- **Observation Space**: Structured observations with code, errors, and feedback
- **Reward Structure**:
  - +1.0 for successful fix
  - -0.2 to -0.5 for failed attempts
  - +0.1 for analysis actions
  - -1.0 for premature submission
- **Episode Length**: Max 5 attempts per bug

## 🐞 Bug Types Included

1. **Argument Count Errors** - Wrong number of function arguments
2. **Logic Errors** - Incorrect loop variables and conditions
3. **Exception Handling** - Missing error handling for edge cases
4. **Index Errors** - Array/string index out of bounds
5. **Infinite Recursion** - Recursive calls without base case reduction
6. **Type Errors** - String/integer concatenation issues
7. **Key Errors** - Missing dictionary keys

## 🚀 Quick Start

### Using Docker (Recommended)

```python
from code_debug_env.client import DebugEnv

# Automatically starts Docker container and connects
env = DebugEnv.from_hub("openenv/code-debug-env")

# Reset to get first challenge
result = env.reset()
print(result.observation.buggy_code)
print(f"Expected output: {result.observation.expected_output}")

# Take action
from code_debug_env.models import DebugAction

action = DebugAction(action_type="test")
result = env.step(action)
print(f"Reward: {result.reward}")

# Cleanup
env.close()
```

## 🔧 Integration with RL Frameworks

### With TRL (Transformer Reinforcement Learning)

```python
from trl import OnlineDPOConfig, OnlineDPOTrainer
from code_debug_env.client import DebugEnv

config = OnlineDPOConfig(...)
trainer = OnlineDPOTrainer(
    config=config,
    env=DebugEnv.from_hub("openenv/code-debug-env"),
    # ... other args
)
trainer.train()
```

## 🏆 OpenEnv Challenge Submission

This environment is submitted to the **OpenEnv Challenge: SOTA Environments to Drive General Intelligence** (UC Berkeley AgentBeats Competition).

## 📜 License

Apache 2.0
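
## 🧮 Reward Structure Sketch

The reward values listed under Environment Details can be sketched as a small standalone function. This is an illustration only, not the environment's actual implementation: `EpisodeState`, `failed_fix_penalty`, and `reward_for` are hypothetical names, the linear ramp from -0.2 to -0.5 across failed attempts is an assumption (the README gives only the range), and treating `test` as reward-neutral and "premature" as "submitted before any fix passed" are likewise assumptions.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 5  # "Max 5 attempts per bug" from Environment Details


@dataclass
class EpisodeState:
    """Hypothetical per-episode bookkeeping (not the real environment state)."""
    attempts: int = 0    # number of fix attempts used so far
    fixed: bool = False  # True once a fix attempt has passed


def failed_fix_penalty(attempt: int) -> float:
    """Assumed linear ramp: -0.2 on the first failure, -0.5 on the fifth."""
    attempt = max(1, min(attempt, MAX_ATTEMPTS))
    step = (0.5 - 0.2) / (MAX_ATTEMPTS - 1)
    return -(0.2 + step * (attempt - 1))


def reward_for(action_type: str, state: EpisodeState, fix_passes: bool = False) -> float:
    """Return the reward for one action and update the episode state."""
    if action_type == "analyze":
        return 0.1  # small bonus for analysis actions
    if action_type == "fix":
        state.attempts += 1
        if fix_passes:
            state.fixed = True
            return 1.0  # successful fix
        return failed_fix_penalty(state.attempts)
    if action_type == "submit":
        # Assumption: submitting before any fix has passed is "premature".
        return 0.0 if state.fixed else -1.0
    if action_type == "test":
        return 0.0  # assumption: running tests is reward-neutral
    raise ValueError(f"unknown action_type: {action_type!r}")
```

For example, an episode of `analyze`, a failed `fix`, then a passing `fix` would accumulate 0.1 - 0.2 + 1.0 under these assumptions, while submitting before any fix passes costs 1.0.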