CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
This is an OpenEnv-based reinforcement learning environment for training AI agents to optimize energy consumption and RAM usage in simulated computer systems. It deploys as a Docker-based Hugging Face Space using FastAPI.
Commands
```bash
# Install dependencies (uv is the package manager)
uv sync

# Run the server locally
uv run server
# Or directly with uvicorn:
uvicorn server.app:app --reload --host 0.0.0.0 --port 8000

# Run with custom port
uv run server --port 8001

# Build Docker image
docker build -t energy-optimization-rl .

# Run Docker container
docker run --rm -p 8000:8000 energy-optimization-rl

# Verify graders are discoverable
python check_graders.py          # Basic verification
python check_graders.py verify   # Import & callable check
python check_graders.py json     # JSON manifest output
```
There are no tests in this project. The pyproject.toml lists pytest as a dev dependency but no test directory exists.
Architecture
Core Flow
- `openenv.yaml`: Declarative config that defines 5 tasks with their grader references (`graders:grade_*`). This is what the OpenEnv validator reads first.
- `server/app.py`: FastAPI app created via `openenv.core.env_server.http_server.create_app()`, which wires up the `/reset`, `/step`, `/state`, `/schema`, and `/ws` endpoints. Additional `/graders`, `/tasks`, and `/validate` endpoints are added for validator tool detection.
- `server/he_demo_environment.py`: The `EnergyOptimizationEnvironment` class implementing `Environment` from openenv-core. Manages state transitions, reward calculation, and task progression.
- `models.py`: Pydantic models (`EnergyOptimizationAction`, `EnergyOptimizationObservation`, `Task`, `TaskSummary`). Also contains legacy grader functions that duplicate logic from `graders.py`.
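The `graders:grade_*` references in `openenv.yaml` have a `module:function` shape. How the OpenEnv validator actually resolves them is not shown in this repository; the sketch below is one plausible approach using `importlib`. `resolve_grader_ref` is a hypothetical helper name, not part of openenv-core.

```python
import importlib
from typing import Callable


def resolve_grader_ref(ref: str) -> Callable:
    """Resolve a 'module:function' reference to a callable.

    Hypothetical helper for illustration; the real validator's lookup
    may differ.
    """
    module_name, sep, func_name = ref.partition(":")
    if not sep or not module_name or not func_name:
        raise ValueError(f"expected 'module:function', got {ref!r}")
    # import_module requires the module to be importable from sys.path,
    # which is why the module location matters for grader discovery.
    module = importlib.import_module(module_name)
    func = getattr(module, func_name)
    if not callable(func):
        raise TypeError(f"{ref!r} does not point at a callable")
    return func
```

Run from the repository root, `resolve_grader_ref("graders:grade_basic_ram_reduction")` would be expected to return the grader function, provided `graders.py` imports cleanly.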
Dual Grader System
There are two separate grader implementations that must be kept in sync:
- `graders.py`: Self-contained, no external dependencies (uses only `getattr` on observations). This is what `openenv.yaml` references (`graders:grade_basic_ram_reduction`, etc.). Must remain dependency-free because the OpenEnv validator imports it in a minimal environment.
- `task_graders.py`: Imports from `models.py`, has richer scoring with weighted formulas. Used by the server endpoints and `task_registry.py`.
When modifying grader logic or adding tasks, update both graders.py and task_graders.py, plus the __all__ lists.
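To illustrate what "dependency-free" means for `graders.py`, here is a minimal sketch of a grader that reads the observation only through `getattr`. The attribute name `ram_usage`, the 70.0 target, and the linear scoring are assumptions made for illustration; only the 80% starting RAM comes from this document. The real thresholds live in `graders.py`.

```python
from types import SimpleNamespace


def grade_ram_reduction_sketch(observation, baseline=80.0, target=70.0):
    """Score in [0, 1] for how far RAM usage dropped from baseline to target.

    Assumptions (not taken from the repo): the observation exposes
    `ram_usage` as a percentage, and 70.0 is an illustrative target.
    """
    if baseline <= target:
        raise ValueError("baseline must be greater than target")
    # getattr with a default keeps the grader importable and callable
    # without models.py or any Pydantic types installed.
    ram = getattr(observation, "ram_usage", baseline)
    progress = (baseline - ram) / (baseline - target)
    return max(0.0, min(1.0, progress))


# Any object with the right attribute works; no model import is needed.
halfway = grade_ram_reduction_sketch(SimpleNamespace(ram_usage=75.0))
```

Because the function never imports `models.py`, it can be loaded in a minimal environment and still accepts the Pydantic observation objects used by the server.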
Grader Discovery Modules
There are multiple redundant discovery/manifest modules that the server endpoints import:
- `grader_manifest.py`: Lightweight manifest dict for the `/graders/manifest` endpoint.
- `graders_manifest.py`: Detailed manifest with scoring methodology, performance examples, and validation checklist.
- `grader_discovery.py`: Discovery manifest with import paths and openenv references.
- `task_registry.py`: Registry mapping task names to grader functions and metadata.
Other Root Modules
- `client.py`: `EnergyOptimizationEnv` subclassing `EnvClient` from openenv-core. WebSocket-based client for connecting to the server.
- `inference.py`: LLM inference script with benchmarking (Random vs Heuristic vs LLM). Uses the HF API. Configured via env vars (`API_BASE_URL`, `MODEL_NAME`, `HF_TOKEN`).
- `evaluate_inference.py`: Evaluation script for LLM performance across tasks.
- `gym_wrapper.py`: `EnergyOptimizationGymEnv` wrapping the environment in a Gymnasium interface for SB3 training.
- `openenv-energy-rl/`: Standalone sub-project with a simpler `EnergyEnv` (gym-based, not OpenEnv).
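The three environment variables that configure `inference.py` could be read as in the sketch below. `InferenceConfig` and `load_inference_config` are hypothetical names, and treating all three variables as required is an assumption; the actual script may supply defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class InferenceConfig:
    api_base_url: str
    model_name: str
    hf_token: str


def load_inference_config(env: Mapping[str, str] = os.environ) -> InferenceConfig:
    """Read API_BASE_URL, MODEL_NAME and HF_TOKEN, failing loudly if any is unset.

    Assumption: all three are required. inference.py may behave differently.
    """
    required = ("API_BASE_URL", "MODEL_NAME", "HF_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return InferenceConfig(
        api_base_url=env["API_BASE_URL"],
        model_name=env["MODEL_NAME"],
        hf_token=env["HF_TOKEN"],
    )
```

Passing the mapping as a parameter keeps the loader easy to exercise without touching the real process environment.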
Key Design Points
- The environment starts at RAM 80%, energy 8.0 kWh, system load 0.7. Actions reduce these deterministically (no randomness — system dynamics are disabled).
- Reward is `intensity * 0.1` clamped to [0, 1], with task completion bonuses of `difficulty * 0.5`.
- Episodes end at 100 steps or when all 5 tasks are completed.
- The Dockerfile is multi-stage (`ghcr.io/meta-pytorch/openenv-base:latest` base), installs via `uv sync`, and runs `uvicorn` from `/app/env`.
- `PYTHONPATH` in Docker includes both `/app/env` and `/app` for grader discovery.
- HF Space config: SDK docker, port 8000, base path `/web`.
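The reward rule from the list above can be written out as a short sketch. The function names are hypothetical, and adding the completion bonus after clamping (so that a step's total can exceed 1) is an assumption; check `server/he_demo_environment.py` for the actual order of operations.

```python
def step_reward(intensity: float) -> float:
    """Per-step reward: intensity * 0.1, clamped to [0, 1]."""
    return max(0.0, min(1.0, intensity * 0.1))


def completion_bonus(difficulty: int) -> float:
    """Bonus granted when a task completes: difficulty * 0.5."""
    return difficulty * 0.5


def total_reward(intensity: float, completed_difficulties=()) -> float:
    """Step reward plus bonuses for tasks completed on this step.

    Assumption: bonuses are added after clamping, so the total is not
    itself limited to [0, 1].
    """
    bonus = sum(completion_bonus(d) for d in completed_difficulties)
    return step_reward(intensity) + bonus
```

Under these assumptions, completing the difficulty-5 task adds a bonus of 2.5 on top of that step's clamped reward.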
Task Definitions
All 5 tasks are defined in three places that must stay consistent:
- `openenv.yaml` (task names, descriptions, grader references, max_steps)
- `_create_tasks()` in `server/he_demo_environment.py` (Task objects with targets)
- Grader functions in `graders.py` and `task_graders.py` (scoring thresholds)
Tasks: basic_ram_reduction → energy_optimization → balanced_optimization → advanced_efficiency → expert_optimization (difficulty 1–5).
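Since the task list lives in three places, a small check can flag drift between them. The sketch below only compares sets of task names. How the names are extracted from each file is left to the caller, and the `grade_<task>` naming for every grader is an assumption based on the `graders:grade_basic_ram_reduction` reference. `find_task_mismatches` is a hypothetical helper.

```python
TASK_ORDER = [
    "basic_ram_reduction",
    "energy_optimization",
    "balanced_optimization",
    "advanced_efficiency",
    "expert_optimization",
]

# Difficulty 1-5 follows the order of the task chain.
DIFFICULTY = {name: level for level, name in enumerate(TASK_ORDER, start=1)}


def find_task_mismatches(sources):
    """Return {source: (missing, unexpected)} for sources that disagree with TASK_ORDER."""
    expected = set(TASK_ORDER)
    problems = {}
    for source_name, names in sources.items():
        found = set(names)
        missing = sorted(expected - found)
        unexpected = sorted(found - expected)
        if missing or unexpected:
            problems[source_name] = (missing, unexpected)
    return problems


# Example: a graders module that lacks the last grader function.
grader_funcs = ["grade_" + name for name in TASK_ORDER[:-1]]
report = find_task_mismatches({
    "openenv.yaml": TASK_ORDER,
    "graders.py": [f.removeprefix("grade_") for f in grader_funcs],
})
print(report)  # {'graders.py': (['expert_optimization'], [])}
```

`str.removeprefix` requires Python 3.9 or newer.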