metadata
title: Openenv
emoji: ☁️
colorFrom: blue
colorTo: green
sdk: docker
sdk_version: 4.44.0
app_file: app.py
pinned: false

OpenEnv Hackathon Submission

Environment Architecture (OpenEnv Contract)

This project uses an explicit OpenEnv contract layer in code:

  • Core environment logic: cloud_arena/llm_environment.py -> AWSCostEnv
  • OpenEnv interface adapter: cloud_arena/llm_environment.py -> OpenEnvAdapter
  • Gym bridge used by training: cloud_arena/llm_environment.py -> SB3Adapter

Action space:

  • 0: NOOP
  • 1: CHECK_DEPENDENCIES
  • 2: RESIZE
  • 3: STOP
  • 4: DELETE
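
One way to keep these indices readable in code is an `IntEnum`. This is a sketch only: the `Action` enum and the `decode_action` helper are hypothetical names, and falling back to NOOP for out-of-range values is an assumption, not documented behavior of the environment.

```python
from enum import IntEnum


class Action(IntEnum):
    """Discrete action indices as listed in the README."""
    NOOP = 0
    CHECK_DEPENDENCIES = 1
    RESIZE = 2
    STOP = 3
    DELETE = 4


def decode_action(index: int) -> Action:
    """Map a raw policy output to an Action.

    Assumption: out-of-range indices fall back to NOOP rather than raising.
    """
    try:
        return Action(index)
    except ValueError:
        return Action.NOOP
```

For example, `decode_action(2)` yields `Action.RESIZE`, while an invalid index such as `9` yields `Action.NOOP` under the fallback assumed here.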

The shaped reward combines six components: cost delta, risk, reliability, action quality, anti-loop penalties, and terminal outcome.
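
A composite reward of this kind is often a weighted sum of its terms. The sketch below assumes exactly that; the `RewardTerms` fields, the sign conventions, and every weight in `WEIGHTS` are hypothetical placeholders, not the values used by the environment.

```python
from dataclasses import dataclass


@dataclass
class RewardTerms:
    cost_delta: float      # cost saved by the action (positive = cheaper)
    risk: float            # risk magnitude, >= 0, subtracted
    reliability: float     # reliability bonus
    action_quality: float  # bonus for a sensible action in context
    loop_penalty: float    # anti-loop penalty magnitude, >= 0, subtracted
    terminal: float        # terminal outcome bonus or penalty


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {
    "cost_delta": 1.0,
    "risk": 0.5,
    "reliability": 0.5,
    "action_quality": 0.25,
    "loop_penalty": 1.0,
    "terminal": 1.0,
}


def shaped_reward(t: RewardTerms) -> float:
    """Weighted sum of the six components (assumed form)."""
    return (
        WEIGHTS["cost_delta"] * t.cost_delta
        - WEIGHTS["risk"] * t.risk
        + WEIGHTS["reliability"] * t.reliability
        + WEIGHTS["action_quality"] * t.action_quality
        - WEIGHTS["loop_penalty"] * t.loop_penalty
        + WEIGHTS["terminal"] * t.terminal
    )
```

Keeping each term as a named field makes it easy to log the components separately and see which one dominates a given step.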

Training Framework (Unsloth + GRPO)

The LLM training path in cloud_arena/llm_training.py calls Unsloth APIs directly:

  • from unsloth import FastLanguageModel
  • model loading via FastLanguageModel.from_pretrained(...)
  • LoRA wrapping via FastLanguageModel.get_peft_model(...)

The policy optimizer is a custom GRPO (Group Relative Policy Optimization) loop. For each state it:

  • generates K samples
  • computes a group-normalized advantage for each sample, (reward - mean) / std, where mean and std are taken over the K rewards
  • backpropagates the loss across all K samples
  • steps the real environment with the top-reward sample only
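
The advantage computation and the top-sample selection can be sketched in a few lines of standard-library Python. The function names are hypothetical, and two details are assumptions rather than facts about cloud_arena/llm_training.py: the use of the population standard deviation, and the small `eps` added to avoid division by zero when all K rewards are equal.

```python
import statistics


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Return (reward - mean) / (std + eps) for each of the K rewards.

    Assumptions: population std, and eps guards the all-equal case.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def pick_top_sample(rewards: list[float]) -> int:
    """Index of the highest-reward sample (first one wins on ties)."""
    return max(range(len(rewards)), key=rewards.__getitem__)
```

With rewards `[1.0, 2.0, 3.0]`, the advantages are symmetric around zero and the sample at index 2 is the one that would be used to step the environment. All K advantages still feed the loss; only the environment transition is restricted to the top sample.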

Results and Evidence

Temporary public evidence links (replace with final experiment images before final leaderboard review):

Artifact Links

Compliance Evidence Map

  • OpenEnv structure: cloud_arena/llm_environment.py
  • Unsloth integration: cloud_arena/llm_training.py
  • Training UI and runtime controls: app.py
  • Evidence/report document: README.md

Built for the OpenEnv Reinforcement Learning Hackathon.