---
title: ACRE - Autonomous Code Refactoring Environment
colorFrom: blue
colorTo: green
sdk: docker
app_file: server.py
app_port: 7860
pinned: false
license: mit
tags:
  - openenv
---

# 🚀 ACRE - Autonomous Code Refactoring Environment

> OpenEnv-powered AI system for real-world code optimization, refactoring, and evaluation.

![Status](https://img.shields.io/badge/Status-Running-success) ![OpenEnv](https://img.shields.io/badge/OpenEnv-Compatible-blue) ![Docker](https://img.shields.io/badge/Docker-Ready-green)

---

## 🔥 Overview

ACRE is an OpenEnv-compliant environment designed to simulate real-world software engineering workflows such as code cleanup, optimization, and refactoring using AI agents.

It enables agents to iteratively improve code through structured actions while receiving dense, step-wise reward feedback.

## Environment Overview and Motivation

ACRE models a realistic developer workflow where an agent incrementally improves Python code quality under a fixed action budget. The environment is designed for OpenEnv Round 1 requirements: typed APIs, deterministic grading, multi-difficulty tasks, and reproducible inference behavior.

---

## 💡 Why This Matters

Modern software systems require automated code optimization and intelligent tooling. ACRE enables:

- 🤖 AI coding assistants
- 🔍 Automated code review systems
- ⚡ Reinforcement learning-based optimization agents
- 🧠 Learning real developer workflows

---

## 🔄 How It Works

Code → Action → Refactor → Reward → Repeat

1. Load messy code
2. Apply transformation
3. Evaluate using grader
4. Compute reward
5. Iterate until optimal

---

## 🧠 Key Features

- ✅ Autonomous code refactoring
- ⚡ Step-wise reward feedback
- 🧪 OpenEnv compliant interface
- 📊 Deterministic grading system
- 🔁 Reproducible inference pipeline
- 🐳 Fully containerized (Docker + Hugging Face Spaces)

---

## 📂 Tasks

| Task ID | Difficulty | Objective |
|--------|----------|----------|
| `rename_variables` | Easy | Replace generic variable names |
| `remove_dead_code` | Medium | Remove unreachable logic |
| `full_refactor` | Hard | Combine multiple optimizations |

Each task uses AST-based transformations and deterministic grading.

## Task Descriptions with Expected Difficulty Levels

- Easy (`rename_variables`): rename generic names like `x`, `tmp`, `i` into descriptive identifiers.
- Medium (`remove_dead_code`): remove unreachable branches and unused assignments while preserving behavior.
- Hard (`full_refactor`): combine renaming, dead-code elimination, loop simplification, condition cleanup, and helper inlining.
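
As a rough illustration of what an AST-based transformation for the medium task could look like, the sketch below drops statements that follow a `return`, `raise`, `break`, or `continue` in the same block. The names `DeadCodeRemover` and `remove_dead_code` are hypothetical and are not taken from ACRE's code base; the real task also covers unused assignments and unreachable branches, which this sketch does not attempt. It assumes Python 3.9+ for `ast.unparse`.

```python
import ast

# Statements after any of these in the same block can never execute.
TERMINATORS = (ast.Return, ast.Raise, ast.Break, ast.Continue)


class DeadCodeRemover(ast.NodeTransformer):
    """Hypothetical transformer: truncates each statement block at its first terminator."""

    @staticmethod
    def _truncate(statements):
        for index, statement in enumerate(statements):
            if isinstance(statement, TERMINATORS):
                return statements[: index + 1]
        return statements

    def generic_visit(self, node):
        # Only statement blocks are lists; expression fields such as
        # Lambda.body are single nodes and are left untouched.
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if isinstance(block, list):
                setattr(node, field, self._truncate(block))
        # Recurse into the remaining children (nested functions, loops, handlers).
        return super().generic_visit(node)


def remove_dead_code(source: str) -> str:
    """Parse the source, prune unreachable statements, and return new source text."""
    tree = DeadCodeRemover().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


messy = '''
def total(values):
    result = sum(values)
    return result
    print("never runs")
'''

print(remove_dead_code(messy))
```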

---

## 🎯 Reward System

Rewards are computed at every step:

- ✅ Valid executable code → positive reward
- 📉 Reduced complexity → reward
- ⚡ Improved performance → reward
- ❌ Errors or invalid code → penalty
- 🔁 No progress → penalty

**Normalization:** `(raw_reward + 32) / 52 → [0, 1]`, so a raw reward of -32 maps to 0 and a raw reward of 20 maps to 1.

---

## 📊 Example Execution

```text
[START] task=rename_variables
[STEP] action=0
[END] task=rename_variables score=1.00

[START] task=remove_dead_code
[STEP] action=1
[END] task=remove_dead_code score=0.25

[START] task=full_refactor
[STEP] action=3
[END] task=full_refactor score=0.71

Final Score: 0.65
```

---

## 🏗️ Architecture

- `server/app.py` → FastAPI entry point used by OpenEnv + Docker
- `server.py` → legacy local runner / UI helper
- `openenv_interface.py` → OpenEnv wrapper
- `acre/env/` → Core environment logic
- `acre/tasks/` → Task definitions
- `acre/utils/` → Metrics and helpers
- `inference.py` → Evaluation pipeline

---

## ⚙️ OpenEnv Interface

```python
observation = env.reset()
observation, reward, done, info = env.step(action)
state = env.state()
```

Uses Pydantic models:

- `ObservationModel`
- `ActionModel`
- `RewardModel`

## Definitions of Action and Observation Spaces

- Observation space: Box(4) with fields `code_length`, `complexity_score`, `runtime_s`, `error_flag`.
- Action space: Discrete(5) with actions `rename_variable`, `remove_dead_code`, `simplify_loop`, `optimize_condition`, `inline_function`.
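
To make the space definitions above more concrete, here is a minimal sketch using only the standard library. The names `Action`, `Observation`, and `normalize_reward` are hypothetical, the action index order is an assumption taken from the order of the list above, and the field types are guesses; ACRE's own `ActionModel` and `ObservationModel` are Pydantic models and may be shaped differently. `normalize_reward` simply applies the normalization formula from the Reward System section.

```python
from dataclasses import astuple, dataclass
from enum import IntEnum


class Action(IntEnum):
    """Discrete(5) action space; the index order is an assumption."""

    RENAME_VARIABLE = 0
    REMOVE_DEAD_CODE = 1
    SIMPLIFY_LOOP = 2
    OPTIMIZE_CONDITION = 3
    INLINE_FUNCTION = 4


@dataclass(frozen=True)
class Observation:
    """Box(4) observation; field types are assumptions."""

    code_length: float
    complexity_score: float
    runtime_s: float
    error_flag: float

    def as_vector(self) -> list[float]:
        # Flatten the four fields into the order listed in the README.
        return [float(value) for value in astuple(self)]


def normalize_reward(raw_reward: float) -> float:
    """Apply the README formula (raw_reward + 32) / 52."""
    return (raw_reward + 32) / 52


obs = Observation(code_length=120, complexity_score=4, runtime_s=0.002, error_flag=0)
print(Action(3).name, obs.as_vector(), normalize_reward(-6.0))
```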

---

## 🌐 HTTP API

| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Health check |
| GET | `/health` | Compatibility check |
| POST | `/reset` | Reset environment |
| POST | `/step` | Execute action |
| GET | `/state` | Get state |
| GET | `/tasks` | List tasks |
| POST | `/tasks/{task_id}/grade` | Grade code |

---

## 🚀 Run Locally

## Setup and Usage Instructions

```bash
pip install -r requirements.txt
uvicorn server.app:app --host 0.0.0.0 --port 7860
```

---

## 🐳 Docker / Hugging Face Spaces

```bash
docker build -t acre .
docker run -p 7860:7860 \
  -e API_BASE_URL=https://api.openai.com/v1 \
  -e MODEL_NAME=gpt-4o-mini \
  -e API_KEY=your_key \
  -e ENV_URL=http://localhost:7860 \
  acre
```

---

## 🧪 Inference

Set environment variables:

```bash
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o-mini
export API_KEY=your_key
export ENV_URL=http://localhost:7860
```

Run:

```bash
python inference.py
```

Expected output:

```text
Easy: 1.00
Medium: 0.25
Hard: 0.71
Final: 0.65
```

---

## 📌 OpenEnv Compliance

- ✔ `step()` implemented
- ✔ `reset()` implemented
- ✔ `state()` implemented
- ✔ reward shaping
- ✔ deterministic grading
- ✔ structured logs

---

## 🧪 Validation

```bash
python validate.py --url http://localhost:7860
```

Or:

```bash
openenv validate
```

---

## 🌐 Live Demo

👉 Running on Hugging Face Spaces

---

## 📊 Baseline Performance

## Baseline Performance Scores

| Task | Score |
|---|---|
| `rename_variables` | 1.0000 |
| `remove_dead_code` | 0.2500 |
| `full_refactor` | 0.7143 |
| Average | 0.6548 |

---

## 🏆 Use Cases

- AI-powered code optimization
- Automated refactoring tools
- Reinforcement learning environments
- Developer productivity systems

---

## 📜 License

MIT License