Sneha Rudra
title: Code Debugging Challenge
emoji: 🐛
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
tags:
  - openenv
  - reinforcement-learning
  - code-debugging
  - agentic-ai

πŸ› Code Debugging Challenge - OpenEnv Environment

A production-ready OpenEnv environment where AI agents learn to debug Python code.

🎯 Overview

This environment challenges AI agents to identify and fix bugs in Python code snippets using the official OpenEnv framework from Meta-PyTorch and Hugging Face.

Key Features:

  • ✅ Built with the official OpenEnv library
  • ✅ WebSocket-based client-server architecture
  • ✅ Docker containerized for isolation
  • ✅ Compatible with TRL, Torchforge, and other RL frameworks
  • ✅ Production-ready with proper session management

πŸ—οΈ Environment Details

  • Action Space: 4 discrete actions (analyze, fix, test, submit)
  • Observation Space: Structured observations with code, errors, and feedback
  • Reward Structure:
    • +1.0 for successful fix
    • -0.2 to -0.5 for failed attempts
    • +0.1 for analysis actions
    • -1.0 for premature submission
  • Episode Length: Max 5 attempts per bug
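
The reward structure above can be sketched as a small lookup. This is a minimal illustration, not the environment's actual implementation: the function names `reward_for` and `failed_attempt_penalty`, the outcome labels, and the linear slide of the failed-attempt penalty from -0.2 to -0.5 across the five attempts are all assumptions, since the list above only gives the range.

```python
MAX_ATTEMPTS = 5  # episode cap stated in the list above


def failed_attempt_penalty(attempt: int) -> float:
    """Assumed schedule: -0.2 on the first failure, sliding linearly to -0.5 on the fifth."""
    attempt = min(max(attempt, 1), MAX_ATTEMPTS)  # clamp to 1..MAX_ATTEMPTS
    fraction = (attempt - 1) / (MAX_ATTEMPTS - 1)
    return round(-0.2 - 0.3 * fraction, 3)


def reward_for(outcome: str, attempt: int = 1) -> float:
    """Map a (hypothetical) outcome label to the reward listed above."""
    if outcome == "successful_fix":
        return 1.0
    if outcome == "analysis":
        return 0.1
    if outcome == "premature_submission":
        return -1.0
    if outcome == "failed_attempt":
        return failed_attempt_penalty(attempt)
    raise ValueError(f"unknown outcome: {outcome!r}")
```

Under this assumed schedule, later failures cost more than earlier ones, so an agent is nudged toward analyzing before it spends attempts on fixes.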

🐞 Bug Types Included

  1. Argument Count Errors - Wrong number of function arguments
  2. Logic Errors - Incorrect loop variables and conditions
  3. Exception Handling - Missing error handling for edge cases
  4. Index Errors - List/string index out of bounds
  5. Infinite Recursion - Recursive calls that never progress toward the base case
  6. Type Errors - String/integer concatenation issues
  7. Key Errors - Missing dictionary keys
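
To make two of these categories concrete, here is a small illustrative pair of bugs with their fixes. These snippets are invented for illustration and are not taken from the environment's challenge set; all function names are hypothetical.

```python
def describe_buggy(name: str, age: int) -> str:
    # Type error: concatenating str and int raises TypeError
    return "Name: " + name + ", age: " + age


def describe_fixed(name: str, age: int) -> str:
    # Fix: let an f-string handle the int-to-str conversion
    return f"Name: {name}, age: {age}"


def stock_buggy(inventory: dict, item: str) -> int:
    # Key error: indexing a missing key raises KeyError
    return inventory[item]


def stock_fixed(inventory: dict, item: str) -> int:
    # Fix: fall back to 0 when the item is not in the inventory
    return inventory.get(item, 0)


def raises(exc_type, fn, *args) -> bool:
    """Return True if calling fn(*args) raises exc_type."""
    try:
        fn(*args)
    except exc_type:
        return True
    return False
```

An agent facing a challenge like these would be expected to turn the buggy variant into something equivalent to the fixed one.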

🚀 Quick Start

Using Docker (Recommended)

from code_debug_env.client import DebugEnv
from code_debug_env.models import DebugAction

# Automatically starts the Docker container and connects
env = DebugEnv.from_hub("openenv/code-debug-env")

try:
    # Reset to get the first challenge
    result = env.reset()
    print(result.observation.buggy_code)
    print(f"Expected output: {result.observation.expected_output}")

    # Take an action
    action = DebugAction(action_type="test")
    result = env.step(action)
    print(f"Reward: {result.reward}")
finally:
    # Clean up even if a step fails
    env.close()
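
The quick-start snippet takes a single step. A full episode would typically loop until the environment reports completion or the attempt cap is reached. The sketch below shows that loop under several assumptions: that step results expose a `done` flag alongside `observation` and `reward`, that the five-attempt cap can be treated as a cap on steps, and that actions can be represented as plain strings. `StubEnv`, `StubResult`, `run_episode`, and `scripted_policy` are hypothetical names; the stub and its reward values exist only so the block runs without Docker.

```python
from dataclasses import dataclass


@dataclass
class StubResult:
    observation: str
    reward: float
    done: bool


class StubEnv:
    """Offline stand-in for DebugEnv; the episode ends when "submit" is received."""

    def reset(self) -> StubResult:
        return StubResult("buggy code", 0.0, False)

    def step(self, action: str) -> StubResult:
        if action == "submit":
            return StubResult("submitted", 1.0, True)
        return StubResult(f"after {action}", 0.1, False)  # arbitrary stub reward


def scripted_policy(observation: str, step_index: int) -> str:
    """Fixed order: analyze, fix, test, then submit."""
    order = ["analyze", "fix", "test", "submit"]
    return order[min(step_index, len(order) - 1)]


def run_episode(env, choose_action, max_steps: int = 5) -> float:
    """Step until the env reports done or max_steps is reached; return total reward."""
    total = 0.0
    result = env.reset()
    for step_index in range(max_steps):
        action = choose_action(result.observation, step_index)
        result = env.step(action)
        total += result.reward or 0.0
        if result.done:
            break
    return total
```

Against the real environment, the policy would return a `DebugAction(action_type=...)` rather than a string, and the call to `run_episode` would sit inside a `try`/`finally` that closes the environment.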

🔧 Integration with RL Frameworks

With TRL (Transformer Reinforcement Learning)

from trl import OnlineDPOConfig, OnlineDPOTrainer
from code_debug_env.client import DebugEnv

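# Schematic: `config=` and `env=` are shown as placeholders; confirm the trainer's
# actual keyword arguments against your installed TRL version.
config = OnlineDPOConfig(...)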
trainer = OnlineDPOTrainer(
    config=config,
    env=DebugEnv.from_hub("openenv/code-debug-env"),
    # ... other args
)
trainer.train()

πŸ† OpenEnv Challenge Submission

This environment is submitted to the OpenEnv Challenge: SOTA Environments to Drive General Intelligence (UC Berkeley AgentBeats Competition).

📜 License

Apache 2.0