---
title: ACRE - Autonomous Code Refactoring Environment
colorFrom: blue
colorTo: green
sdk: docker
app_file: server.py
app_port: 7860
pinned: false
license: mit
tags:
- openenv
---
# 🚀 ACRE - Autonomous Code Refactoring Environment
> OpenEnv-powered AI system for real-world code optimization, refactoring, and evaluation.
![Status](https://img.shields.io/badge/Status-Running-success)
![OpenEnv](https://img.shields.io/badge/OpenEnv-Compatible-blue)
![Docker](https://img.shields.io/badge/Docker-Ready-green)
---
## 🔥 Overview
ACRE is an OpenEnv-compliant environment designed to simulate real-world software engineering workflows such as code cleanup, optimization, and refactoring using AI agents.
It enables agents to iteratively improve code through structured actions while receiving dense, step-wise reward feedback.
## Environment Overview and Motivation
ACRE models a realistic developer workflow where an agent incrementally improves Python code quality under a fixed action budget.
The environment is designed for OpenEnv Round 1 requirements: typed APIs, deterministic grading, multi-difficulty tasks, and reproducible inference behavior.
---
## 💡 Why This Matters
Modern software systems require automated code optimization and intelligent tooling.
ACRE provides a testbed for building and evaluating:
- 🤖 AI coding assistants
- 🔍 Automated code review systems
- ⚡ Reinforcement learning-based optimization agents
- 🧠 Learning real developer workflows
---
## 🔄 How It Works
Code → Action → Refactor → Reward → Repeat
1. Load messy code
2. Apply transformation
3. Evaluate using grader
4. Compute reward
5. Iterate until optimal
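
The loop above can be sketched in a few lines of Python. This is only an illustration: `ToyRefactorEnv` is a hypothetical stand-in, not the real ACRE environment, and its reward values and "issue counter" are assumptions made so the example runs on its own. The only part taken from this README is the `reset()` / `step()` call shape.

```python
class ToyRefactorEnv:
    """Hypothetical stand-in for ACRE; it only mimics the reset/step contract."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps  # fixed action budget
        self.steps = 0
        self.issues = 3  # pretend the messy code has three fixable issues

    def reset(self):
        self.steps = 0
        self.issues = 3
        return {"issues": self.issues}

    def step(self, action):
        self.steps += 1
        # Assumption for this toy: action 0 fixes one issue, anything else does nothing.
        if action == 0 and self.issues > 0:
            self.issues -= 1
            reward = 1.0
        else:
            reward = -0.1  # small penalty when no progress is made
        done = self.issues == 0 or self.steps >= self.max_steps
        return {"issues": self.issues}, reward, done, {}


def run_episode(env, policy):
    """Load code, apply actions, collect rewards, and stop when the env says done."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward, observation


total_reward, final_observation = run_episode(ToyRefactorEnv(), lambda obs: 0)
```

With a real agent, `policy` would be the model choosing the next transformation from the current observation.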
---
## 🧠 Key Features
- ✅ Autonomous code refactoring
- ⚡ Step-wise reward feedback
- 🧪 OpenEnv compliant interface
- 📊 Deterministic grading system
- 🔁 Reproducible inference pipeline
- 🐳 Fully containerized (Docker + Hugging Face Spaces)
---
## 📂 Tasks
| Task ID | Difficulty | Objective |
|--------|----------|----------|
| `rename_variables` | Easy | Replace generic variable names |
| `remove_dead_code` | Medium | Remove unreachable logic |
| `full_refactor` | Hard | Combine multiple optimizations |
Each task uses AST-based transformations and deterministic grading.
## Task Descriptions with Expected Difficulty Levels
- Easy (`rename_variables`): rename generic names like `x`, `tmp`, `i` into descriptive identifiers.
- Medium (`remove_dead_code`): remove unreachable branches and unused assignments while preserving behavior.
- Hard (`full_refactor`): combine renaming, dead-code elimination, loop simplification, condition cleanup, and helper inlining.
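
To make "AST-based transformation" concrete, here is a minimal sketch of a variable rename built on Python's standard `ast` module. It is not ACRE's actual implementation: the `RenameVariables` class, the `rename_variables` helper, and the name mapping are all hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Hypothetical transformer: rewrites identifiers according to a mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Covers both reads and writes of a variable.
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node

    def visit_arg(self, node):
        # Covers function parameters.
        if node.arg in self.mapping:
            node.arg = self.mapping[node.arg]
        return node


def rename_variables(source, mapping):
    """Parse the source, rename identifiers, and return the regenerated code."""
    tree = ast.parse(source)
    tree = RenameVariables(mapping).visit(tree)
    return ast.unparse(tree)


messy = "def f(x):\n    tmp = x * 2\n    return tmp\n"
cleaned = rename_variables(messy, {"x": "value", "tmp": "doubled"})
```

Because the rewrite happens on the syntax tree instead of on raw text, a name such as `x` inside a string literal or a longer identifier is left alone. This sketch ignores scoping, so a production version would need to handle shadowing and attribute names separately.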
---
## 🎯 Reward System
Rewards are computed at every step:
- ✅ Valid executable code → positive reward
- 📉 Reduced complexity → reward
- ⚡ Improved performance → reward
- ❌ Errors or invalid code → penalty
- 🔁 No progress → penalty
**Normalization:**
`normalized = (raw_reward + 32) / 52`, which maps a raw reward in `[-32, 20]` onto `[0, 1]`
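
The normalization can be written as a small helper. The offset and divisor come from the formula above. The raw bounds of -32 and 20 are inferred from that formula, not stated elsewhere in this README, and the clamping step is an added safety assumption.

```python
RAW_MIN = -32.0  # assumed lowest raw reward, implied by the +32 offset
RAW_SPAN = 52.0  # divisor from the formula, so the implied maximum is 20


def normalize_reward(raw_reward):
    """Map a raw reward onto [0, 1] using (raw_reward + 32) / 52."""
    scaled = (raw_reward - RAW_MIN) / RAW_SPAN
    # Clamp in case a raw value falls outside the assumed range.
    return min(1.0, max(0.0, scaled))
```

For example, a raw reward of -6 sits exactly halfway through the assumed range and normalizes to 0.5.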
---
## 📊 Example Execution
```text
[START] task=rename_variables
[STEP] action=0
[END] task=rename_variables score=1.00
[START] task=remove_dead_code
[STEP] action=1
[END] task=remove_dead_code score=0.25
[START] task=full_refactor
[STEP] action=3
[END] task=full_refactor score=0.71
Final Score: 0.65
```
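
The structured log is easy to post-process. The sketch below parses the `[END]` lines and recomputes the final score, assuming it is the unweighted mean of the per-task scores (which matches the sample numbers above). The `parse_scores` helper and the regular expression are illustrative, not part of ACRE.

```python
import re
from statistics import mean

# Matches lines such as "[END] task=rename_variables score=1.00".
END_LINE = re.compile(r"^\[END\] task=(\S+) score=([0-9.]+)$")


def parse_scores(log_text):
    """Return a {task_id: score} dict built from the [END] lines of a run log."""
    scores = {}
    for line in log_text.splitlines():
        match = END_LINE.match(line.strip())
        if match:
            scores[match.group(1)] = float(match.group(2))
    return scores


sample_log = """\
[START] task=rename_variables
[STEP] action=0
[END] task=rename_variables score=1.00
[START] task=remove_dead_code
[STEP] action=1
[END] task=remove_dead_code score=0.25
[START] task=full_refactor
[STEP] action=3
[END] task=full_refactor score=0.71
"""

scores = parse_scores(sample_log)
final_score = round(mean(list(scores.values())), 2)
```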
---
## 🏗️ Architecture
- `server/app.py` → FastAPI entry point used by OpenEnv + Docker
- `server.py` → legacy local runner / UI helper
- `openenv_interface.py` → OpenEnv wrapper
- `acre/env/` → Core environment logic
- `acre/tasks/` → Task definitions
- `acre/utils/` → Metrics and helpers
- `inference.py` → Evaluation pipeline
---
## ⚙️ OpenEnv Interface
```python
observation = env.reset()
observation, reward, done, info = env.step(action)
state = env.state()
```
Observations, actions, and rewards are typed with Pydantic models:
- `ObservationModel`
- `ActionModel`
- `RewardModel`
## Definitions of Action and Observation Spaces
- Observation space: Box(4) with fields `code_length`, `complexity_score`, `runtime_s`, `error_flag`.
- Action space: Discrete(5) with actions `rename_variable`, `remove_dead_code`, `simplify_loop`, `optimize_condition`, `inline_function`.
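
One way to picture the two spaces in plain Python is an `IntEnum` for the five discrete actions and a dataclass for the four observation fields. This is a sketch using only the standard library, not the project's Pydantic models. The index assigned to each action is an assumption based on the order in which they are listed above, and all field types are assumed to be floats.

```python
from dataclasses import astuple, dataclass
from enum import IntEnum


class Action(IntEnum):
    """Discrete(5) action space; the index order is assumed from the README list."""

    RENAME_VARIABLE = 0
    REMOVE_DEAD_CODE = 1
    SIMPLIFY_LOOP = 2
    OPTIMIZE_CONDITION = 3
    INLINE_FUNCTION = 4


@dataclass(frozen=True)
class Observation:
    """Box(4) observation; field names are taken from the README."""

    code_length: float
    complexity_score: float
    runtime_s: float
    error_flag: float

    def as_vector(self):
        # Flatten to the 4-element vector an agent would consume.
        return list(astuple(self))


example = Observation(code_length=120.0, complexity_score=7.0, runtime_s=0.002, error_flag=0.0)
```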
---
## 🌐 HTTP API
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Health check |
| GET | `/health` | Compatibility check |
| POST | `/reset` | Reset environment |
| POST | `/step` | Execute action |
| GET | `/state` | Get state |
| GET | `/tasks` | List tasks |
| POST | `/tasks/{task_id}/grade` | Grade code |
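
A client can reach these endpoints with nothing but the standard library. In the sketch below the endpoint paths come from the table above, but the JSON bodies (`{}` for `/reset`, `{"action": 0}` for `/step`) are assumptions about the request schema, so check the server's models before relying on them. `build_request` and `call` are hypothetical helper names.

```python
import json
import urllib.request


def build_request(base_url, method, path, payload=None):
    """Build (but do not send) an HTTP request, with a JSON body if a payload is given."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request):
    """Send a request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


BASE_URL = "http://localhost:7860"
reset_request = build_request(BASE_URL, "POST", "/reset", {})  # assumed empty body
step_request = build_request(BASE_URL, "POST", "/step", {"action": 0})  # assumed schema
state_request = build_request(BASE_URL, "GET", "/state")
```

With the server running locally, `call(reset_request)` followed by `call(step_request)` would perform one reset and one step.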
---
## 🚀 Run Locally
## Setup and Usage Instructions
```bash
pip install -r requirements.txt
uvicorn server.app:app --host 0.0.0.0 --port 7860
```
---
## 🐳 Docker / Hugging Face Spaces
```bash
docker build -t acre .
docker run -p 7860:7860 \
-e API_BASE_URL=https://api.openai.com/v1 \
-e MODEL_NAME=gpt-4o-mini \
-e API_KEY=your_key \
-e ENV_URL=http://localhost:7860 \
acre
```
---
## 🧪 Inference
Set environment variables:
```bash
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o-mini
export API_KEY=your_key
export ENV_URL=http://localhost:7860
```
Run:
```bash
python inference.py
```
Expected output:
```text
Easy: 1.00
Medium: 0.25
Hard: 0.71
Final: 0.65
```
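
A script that depends on the four variables above can fail fast when one is missing. The following is a sketch of that check, not the actual logic in `inference.py`, which may apply defaults or different validation. The `load_config` helper is hypothetical.

```python
import os

# Variable names are taken from the export commands in this README.
REQUIRED_VARS = ("API_BASE_URL", "MODEL_NAME", "API_KEY", "ENV_URL")


def load_config(environ=os.environ):
    """Return the required settings as a dict, or raise if any are unset or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.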
---
## 📌 OpenEnv Compliance
- ✔ `step()` implemented
- ✔ `reset()` implemented
- ✔ `state()` implemented
- ✔ reward shaping
- ✔ deterministic grading
- ✔ structured logs
---
## 🧪 Validation
```bash
python validate.py --url http://localhost:7860
```
Or:
```bash
openenv validate
```
---
## 🌐 Live Demo
👉 Running on Hugging Face Spaces
---
## 📊 Baseline Performance
## Baseline Performance Scores
| Task | Score |
|---|---|
| `rename_variables` | 1.0000 |
| `remove_dead_code` | 0.2500 |
| `full_refactor` | 0.7143 |
| Average | 0.6548 |
---
## 🏆 Use Cases
- AI-powered code optimization
- Automated refactoring tools
- Reinforcement learning environments
- Developer productivity systems
---
## 📜 License
MIT License