---
title: ACRE - Autonomous Code Refactoring Environment
colorFrom: blue
colorTo: green
sdk: docker
app_file: server.py
app_port: 7860
pinned: false
license: mit
tags:
- openenv
---
ACRE - Autonomous Code Refactoring Environment
OpenEnv-powered AI system for real-world code optimization, refactoring, and evaluation.
Overview
ACRE is an OpenEnv-compliant environment designed to simulate real-world software engineering workflows such as code cleanup, optimization, and refactoring using AI agents.
It enables agents to iteratively improve code through structured actions while receiving dense, step-wise reward feedback.
Environment Overview and Motivation
ACRE models a realistic developer workflow in which an agent incrementally improves Python code quality under a fixed action budget. The environment targets the OpenEnv Round 1 requirements: typed APIs, deterministic grading, multi-difficulty tasks, and reproducible inference behavior.
Why This Matters
Modern software systems require automated code optimization and intelligent tooling.
ACRE enables:
- AI coding assistants
- Automated code review systems
- Reinforcement learning-based optimization agents
- Learning real developer workflows
How It Works
Code → Action → Refactor → Reward → Repeat
- Load messy code
- Apply transformation
- Evaluate using grader
- Compute reward
- Iterate until optimal
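The loop above can be sketched against a toy stand-in environment. `ToyRefactorEnv`, its issue counter, and its reward values are hypothetical and exist only for illustration; only the `reset()` / `step()` / `state()` call shapes come from this README.

```python
class ToyRefactorEnv:
    """Hypothetical stand-in for ACRE: each useful action removes one issue."""

    def __init__(self, issues=3, max_steps=10):
        self.initial_issues = issues
        self.max_steps = max_steps  # fixed action budget
        self.reset()

    def reset(self):
        self.issues = self.initial_issues
        self.steps = 0
        return self.state()

    def state(self):
        return {"issues": self.issues, "steps": self.steps}

    def step(self, action):
        self.steps += 1
        # Assumption: action 0 is a useful refactor, anything else is a no-op.
        if action == 0 and self.issues > 0:
            self.issues -= 1
            reward = 1.0
        else:
            reward = -0.1  # small penalty for making no progress
        done = self.issues == 0 or self.steps >= self.max_steps
        return self.state(), reward, done, {}


def run_episode(env, policy):
    """Run one episode and return (total reward, final observation)."""
    observation = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total += reward
    return total, observation
```

A policy here is any callable that maps an observation to an action index, for example `run_episode(ToyRefactorEnv(), lambda obs: 0)`.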
Key Features
- Autonomous code refactoring
- Step-wise reward feedback
- OpenEnv compliant interface
- Deterministic grading system
- Reproducible inference pipeline
- Fully containerized (Docker + Hugging Face Spaces)
Tasks
| Task ID | Difficulty | Objective |
|---|---|---|
| `rename_variables` | Easy | Replace generic variable names |
| `remove_dead_code` | Medium | Remove unreachable logic |
| `full_refactor` | Hard | Combine multiple optimizations |
Each task uses AST-based transformations and deterministic grading.
Task Descriptions with Expected Difficulty Levels
- Easy (`rename_variables`): rename generic names like `x`, `tmp`, `i` into descriptive identifiers.
- Medium (`remove_dead_code`): remove unreachable branches and unused assignments while preserving behavior.
- Hard (`full_refactor`): combine renaming, dead-code elimination, loop simplification, condition cleanup, and helper inlining.
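To make "AST-based transformation" concrete, here is a minimal sketch of one narrow kind of dead-code removal: dropping statements that follow a `return`, `raise`, `break`, or `continue` in the same block. It is not ACRE's actual transformer. The names `DropUnreachable` and `remove_dead_code` are hypothetical, unused assignments are not handled, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


class DropUnreachable(ast.NodeTransformer):
    """Remove statements that follow a terminating statement in a block."""

    TERMINATORS = (ast.Return, ast.Raise, ast.Break, ast.Continue)

    def _trim(self, body):
        # Keep statements up to and including the first terminator.
        trimmed = []
        for stmt in body:
            trimmed.append(stmt)
            if isinstance(stmt, self.TERMINATORS):
                break
        return trimmed

    def generic_visit(self, node):
        # Transform children first, then trim this node's statement lists.
        super().generic_visit(node)
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            # Expression nodes such as IfExp also have `body`; skip non-lists.
            if isinstance(block, list) and block:
                setattr(node, field, self._trim(block))
        return node


def remove_dead_code(source):
    """Return `source` with statements after a terminator removed."""
    tree = DropUnreachable().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))
```

Because the transformation works on the syntax tree rather than on text, the same input always yields the same output, which is what makes deterministic grading straightforward.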
Reward System
Rewards are computed at every step:
- Valid executable code → positive reward
- Reduced complexity → reward
- Improved performance → reward
- Errors or invalid code → penalty
- No progress → penalty
Normalization:
(raw_reward + 32) / 52 ∈ [0, 1]
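The normalization formula implies a raw reward range of [-32, 20], since -32 maps to 0 and 20 maps to 1. A small sketch of that mapping follows. The function name is hypothetical, and clamping out-of-range values is an assumption rather than documented behavior.

```python
RAW_OFFSET = 32.0  # shifts the implied minimum raw reward (-32) to 0
RAW_SPAN = 52.0    # width of the implied raw range [-32, 20]


def normalize_reward(raw_reward):
    """Map a raw reward onto [0, 1] using (raw_reward + 32) / 52."""
    scaled = (raw_reward + RAW_OFFSET) / RAW_SPAN
    # Assumption: values outside the implied range are clamped.
    return min(1.0, max(0.0, scaled))
```

For instance, a raw reward of -6 sits at the midpoint of the range and normalizes to 0.5.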
Example Execution
[START] task=rename_variables
[STEP] action=0
[END] task=rename_variables score=1.00
[START] task=remove_dead_code
[STEP] action=1
[END] task=remove_dead_code score=0.25
[START] task=full_refactor
[STEP] action=3
[END] task=full_refactor score=0.71
Final Score: 0.65
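The final score in the log above is consistent with the mean of the three per-task scores: (1.00 + 0.25 + 0.71) / 3 ≈ 0.65. A sketch of how such structured logs could be parsed follows. The helper names are hypothetical, and the regular expression assumes the exact `[END] task=... score=...` shape shown above.

```python
import re

# Matches lines such as "[END] task=rename_variables score=1.00".
END_LINE = re.compile(r"^\[END\] task=(\S+) score=([0-9.]+)$")


def parse_scores(log_text):
    """Collect the per-task score from every [END] line."""
    scores = {}
    for line in log_text.splitlines():
        match = END_LINE.match(line.strip())
        if match:
            scores[match.group(1)] = float(match.group(2))
    return scores


def final_score(scores):
    """Unweighted mean of the task scores (assumed aggregation)."""
    return sum(scores.values()) / len(scores)
```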
Architecture
- `server/app.py`: FastAPI entry point used by OpenEnv + Docker
- `server.py`: legacy local runner / UI helper
- `openenv_interface.py`: OpenEnv wrapper
- `acre/env/`: Core environment logic
- `acre/tasks/`: Task definitions
- `acre/utils/`: Metrics and helpers
- `inference.py`: Evaluation pipeline
OpenEnv Interface
observation = env.reset()
observation, reward, done, info = env.step(action)
state = env.state()
Uses Pydantic models:
- `ObservationModel`
- `ActionModel`
- `RewardModel`
Definitions of Action and Observation Spaces
- Observation space: Box(4) with fields `code_length`, `complexity_score`, `runtime_s`, `error_flag`.
- Action space: Discrete(5) with actions `rename_variable`, `remove_dead_code`, `simplify_loop`, `optimize_condition`, `inline_function`.
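The two spaces can be pictured with standard-library types. This is an illustrative sketch, not the project's Pydantic models. The `Action` and `Observation` names are hypothetical, and the index order is assumed to follow the order in which the actions are listed above.

```python
from dataclasses import dataclass, astuple
from enum import IntEnum


class Action(IntEnum):
    """Discrete(5): one integer per refactoring action (assumed order)."""
    RENAME_VARIABLE = 0
    REMOVE_DEAD_CODE = 1
    SIMPLIFY_LOOP = 2
    OPTIMIZE_CONDITION = 3
    INLINE_FUNCTION = 4


@dataclass(frozen=True)
class Observation:
    """Box(4): four numeric features describing the current code."""
    code_length: float
    complexity_score: float
    runtime_s: float
    error_flag: float

    def as_vector(self):
        # Flatten to the 4-element vector an agent would consume.
        return list(astuple(self))
```

With this layout, `Action(1).name` gives `REMOVE_DEAD_CODE`, and an observation flattens to a list in the field order listed above.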
HTTP API
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Health check |
| GET | `/health` | Compatibility check |
| POST | `/reset` | Reset environment |
| POST | `/step` | Execute action |
| GET | `/state` | Get state |
| GET | `/tasks` | List tasks |
| POST | `/tasks/{task_id}/grade` | Grade code |
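A client for these endpoints needs only the standard library. The sketch below builds and sends JSON requests. The helper names are hypothetical, and the request body shape (for example `{"action": 0}` for `/step`) is an assumption, since the payload schemas are not documented here.

```python
import json
import urllib.request


def build_request(base_url, method, path, payload=None):
    """Create a Request for an endpoint, with an optional JSON body."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request, timeout=10.0):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example against a locally running server (payload shape is assumed):
# call(build_request("http://localhost:7860", "POST", "/reset"))
# call(build_request("http://localhost:7860", "POST", "/step", {"action": 0}))
```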
Run Locally
Setup and Usage Instructions
pip install -r requirements.txt
uvicorn server.app:app --host 0.0.0.0 --port 7860
Docker / Hugging Face Spaces
docker build -t acre .
docker run -p 7860:7860 \
-e API_BASE_URL=https://api.openai.com/v1 \
-e MODEL_NAME=gpt-4o-mini \
-e API_KEY=your_key \
-e ENV_URL=http://localhost:7860 \
acre
Inference
Set environment variables:
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o-mini
export API_KEY=your_key
export ENV_URL=http://localhost:7860
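A script that depends on these four variables can fail fast when one is missing. The following is a sketch of such a check. `load_settings` is a hypothetical helper, not necessarily how `inference.py` reads its configuration.

```python
import os

# The four variables named above.
REQUIRED_VARS = ("API_BASE_URL", "MODEL_NAME", "API_KEY", "ENV_URL")


def load_settings(environ=None):
    """Return the required settings, raising if any is unset or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests.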
Run:
python inference.py
Expected output:
Easy: 1.00
Medium: 0.25
Hard: 0.71
Final: 0.65
OpenEnv Compliance
- `step()` implemented
- `reset()` implemented
- `state()` implemented
- reward shaping
- deterministic grading
- structured logs
Validation
python validate.py --url http://localhost:7860
Or:
openenv validate
Live Demo
Running on Hugging Face Spaces
Baseline Performance
Baseline Performance Scores
| Task | Score |
|---|---|
| `rename_variables` | 1.0000 |
| `remove_dead_code` | 0.2500 |
| `full_refactor` | 0.7143 |
| Average | 0.6548 |
Use Cases
- AI-powered code optimization
- Automated refactoring tools
- Reinforcement learning environments
- Developer productivity systems
License
MIT License