---
title: DiskPanic OpenEnv
emoji: 🔥
colorFrom: red
colorTo: yellow
sdk: docker
app_port: 8000
pinned: false
license: apache-2.0
---

# DiskPanic: SRE Incident Response OpenEnv

A real-world RL environment where an LLM agent plays an on-call Site Reliability
Engineer responding to a production incident: **the root filesystem is full and
app.service has crashed.** The agent must free space, restart the service, and
preserve business-critical audit logs. The wrong `rm -rf` tanks the reward.

> Built for the OpenEnv Round 1 Hackathon by **Yash Pravin Pawar's team**.

## Why this env

Every SRE has lived this exact 3am nightmare. The env tests three skills:

1. **Diagnosis**: finding the bloated file with `du` / `ls` / `find`
2. **Surgical deletion**: removing the right thing without touching protected dirs
3. **Recovery**: restarting services and (on hard) dropping a logrotate config to
   stop a runaway writer

The reward signal is dense: the agent sees its score climb as disk usage drops,
gets a bonus for restoring the service, and is penalized if the SHA-256 of
`/var/log/audit/` changes.
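
The audit-integrity check can be pictured as a single digest over every file under `/var/log/audit/`. The sketch below is an illustration only: it assumes the virtual filesystem is a flat dict mapping absolute paths to bytes, and `audit_digest` is a hypothetical name, not necessarily what `graders.py` uses.

```python
import hashlib

AUDIT_DIR = "/var/log/audit/"


def audit_digest(vfs: dict[str, bytes]) -> str:
    """Return one SHA-256 hex digest covering every file under AUDIT_DIR."""
    h = hashlib.sha256()
    # Sort paths so the digest does not depend on dict insertion order.
    for path in sorted(p for p in vfs if p.startswith(AUDIT_DIR)):
        h.update(path.encode())
        h.update(b"\0")  # separator between path and content
        h.update(vfs[path])
        h.update(b"\0")
    return h.hexdigest()


# Illustrative filesystem (paths and contents are made up for the example).
vfs = {
    "/var/log/audit/audit.log": b"login ok\n",
    "/var/log/nginx/access.log.1": b"x" * 1024,
}
baseline = audit_digest(vfs)

del vfs["/var/log/nginx/access.log.1"]        # deleting outside the audit dir
unchanged = audit_digest(vfs) == baseline     # digest is unaffected

del vfs["/var/log/audit/audit.log"]           # deleting inside the audit dir
tampered = audit_digest(vfs) != baseline      # digest changes
```

Comparing the digest at the end of an episode with the one taken at reset is enough to detect deletion, truncation, or modification of any audit file.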

## Tasks

| ID | Scenario | Graded on |
|----|----------|-----------|
| `easy` | One 8.7 GiB rotated nginx log is filling the disk. | Disk usage < 80% + audit dir untouched |
| `medium` | Disk full + `app.service` has failed. | disk(0.4) + service(0.4) + audit(0.2) |
| `hard` | Same + a runaway writer grows `/var/log/app/runaway.log` by 100 MiB every tick. | disk(0.3) + service(0.3) + audit(0.2) + logrotate config(0.2) |

All graders return a scalar in `[0.0, 1.0]`.
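
The weights in the table combine as a plain weighted sum. The following is a minimal sketch under stated assumptions: each component score is already a float in `[0.0, 1.0]`, and the names `WEIGHTS` and `grade` are hypothetical. How each component is computed, and how the `easy` task combines its two criteria, is not specified above, so neither is modeled here.

```python
# Component weights copied from the task table (medium and hard only).
WEIGHTS = {
    "medium": {"disk": 0.4, "service": 0.4, "audit": 0.2},
    "hard": {"disk": 0.3, "service": 0.3, "audit": 0.2, "logrotate": 0.2},
}


def clamp01(x: float) -> float:
    """Clamp a value into the closed interval [0.0, 1.0]."""
    return min(max(x, 0.0), 1.0)


def grade(task_id: str, components: dict[str, float]) -> float:
    """Weighted sum of component scores; missing components count as 0."""
    weights = WEIGHTS[task_id]
    total = sum(w * clamp01(components.get(name, 0.0)) for name, w in weights.items())
    return clamp01(total)


# Hard task where the agent freed the disk and restarted the service,
# wrote the logrotate config, but damaged the audit directory.
score = grade("hard", {"disk": 1.0, "service": 1.0, "audit": 0.0, "logrotate": 1.0})
```

Because every weight set sums to 1 and each component is clamped, the result stays inside `[0.0, 1.0]`.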

## Action space

`DiskPanicAction(command: str)`: a single bash-lite command per step. Supported:

```
df                        ls <path>                 du <path>
cat <path>                find <path>               sha256sum <path>
rm [-rf] <path>           systemctl is-active|status|start|restart <svc>
echo "content" > <path>   (for writing files like logrotate configs)
```
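
To illustrate how a command string might be validated against this list, here is a small sketch that tokenizes with `shlex` and rejects anything outside the whitelist. `parse_command` and the two sets are hypothetical names, and the logrotate path in the usage lines is made up for the example; the actual parser in `vfs.py` may differ.

```python
import shlex

# Commands and systemctl verbs taken from the supported list above.
ALLOWED = {"df", "ls", "du", "cat", "find", "sha256sum", "rm", "systemctl", "echo"}
SYSTEMCTL_VERBS = {"is-active", "status", "start", "restart"}


def parse_command(command: str) -> list[str]:
    """Tokenize one command and reject anything outside the whitelist."""
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. unbalanced quotes
        raise ValueError(f"unparseable command: {exc}") from exc
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in ALLOWED:
        raise ValueError(f"unsupported command: {argv[0]}")
    if argv[0] == "systemctl" and (len(argv) != 3 or argv[1] not in SYSTEMCTL_VERBS):
        raise ValueError("usage: systemctl is-active|status|start|restart <svc>")
    return argv


rm_argv = parse_command("rm -rf /var/log/nginx/access.log.1")
# The redirect is just another token; nothing is handed to a real shell.
echo_argv = parse_command('echo "content" > /etc/logrotate.d/app')
```

Since the tokens are only ever matched against known operations, characters such as `;`, `|`, or `$(...)` have no special meaning and cannot chain extra commands.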

## Observation space

`DiskPanicObservation`:

- `stdout: str`: output of the last command
- `df_output: str`: current simulated `df -h /`
- `service_status: str`: `active` / `inactive` / `failed`
- `task_id: str`: current task (`easy` | `medium` | `hard`)
- `step: int`
- `last_error: Optional[str]`
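
For orientation, the fields above can be mirrored in a plain dataclass. This is a stdlib stand-in for illustration only: the project's real model is a Pydantic class in `models.py`, and `ObservationSketch`, `needs_restart`, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObservationSketch:
    """Stand-in with the same field names as DiskPanicObservation."""
    stdout: str
    df_output: str
    service_status: str  # "active" | "inactive" | "failed"
    task_id: str         # "easy" | "medium" | "hard"
    step: int
    last_error: Optional[str] = None


def needs_restart(obs: ObservationSketch) -> bool:
    """An agent-side helper: the service needs attention unless it is active."""
    return obs.service_status != "active"


# Sample values are invented for the example, not real env output.
obs = ObservationSketch(
    stdout="",
    df_output="(simulated df -h / output)",
    service_status="failed",
    task_id="medium",
    step=0,
)
restart_required = needs_restart(obs)
```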

## Safety & sandbox

The env does not touch the real filesystem. Everything is a Python dict
representing a virtual filesystem. Commands are parsed via `shlex` and
dispatched to whitelisted operations: no `subprocess` calls and no shell
expansion, so there is no shell to escape into. This keeps the env
deterministic, safe, and fast (runs easily on 2 vCPU / 8 GB RAM).
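
A dict-backed filesystem of this kind can be very small. The sketch below assumes a mapping from path to size in bytes and a fixed 10 GiB capacity; `TinyVFS`, the capacity, and the audit file size are assumptions for illustration, not the contents of `vfs.py`. The 8.7 GiB nginx log comes from the `easy` task.

```python
GIB = 1024 ** 3


class TinyVFS:
    """In-memory filesystem: path -> size in bytes, with a fixed capacity."""

    def __init__(self, capacity: int, files: dict[str, int]):
        self.capacity = capacity
        self.files = dict(files)

    def used_percent(self) -> float:
        """Percentage of capacity currently occupied."""
        return 100.0 * sum(self.files.values()) / self.capacity

    def rm(self, path: str, recursive: bool = False) -> int:
        """Remove a file, or a whole subtree when recursive; return bytes freed."""
        prefix = path.rstrip("/") + "/"
        victims = [
            p for p in self.files
            if p == path or (recursive and p.startswith(prefix))
        ]
        if not victims:
            raise FileNotFoundError(path)
        return sum(self.files.pop(p) for p in victims)


fs = TinyVFS(
    capacity=10 * GIB,  # assumed disk size
    files={
        "/var/log/nginx/access.log.1": int(8.7 * GIB),
        "/var/log/audit/audit.log": GIB // 10,  # assumed size
    },
)
before = fs.used_percent()
freed = fs.rm("/var/log/nginx/access.log.1")
after = fs.used_percent()
```

All state lives in one dict, so resetting an episode is just rebuilding that dict, which is what makes runs reproducible.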

## Running locally

```bash
# 1. Install
pip install -r requirements.txt

# 2. Build the Docker image
docker build -t disk-panic:latest .

# 3. Set env vars
export HF_TOKEN="<your-key>"   # Groq key or HF token
export API_BASE_URL=https://api.groq.com/openai/v1
export MODEL_NAME=llama-3.3-70b-versatile
export IMAGE_NAME=disk-panic:latest

# 4. Run inference (all 3 tasks)
python inference.py
```
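
A script driven by these four variables benefits from failing early when one is missing. The following is a sketch of such a check, not a description of what `inference.py` actually does; `load_settings` and `REQUIRED_VARS` are hypothetical names.

```python
import os
from typing import Mapping

# The four variables exported in step 3 above.
REQUIRED_VARS = ("HF_TOKEN", "API_BASE_URL", "MODEL_NAME", "IMAGE_NAME")


def load_settings(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the required variables, raising if any is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


# Passing an explicit mapping keeps the example independent of the real shell.
settings = load_settings({
    "HF_TOKEN": "dummy-token",
    "API_BASE_URL": "https://api.groq.com/openai/v1",
    "MODEL_NAME": "llama-3.3-70b-versatile",
    "IMAGE_NAME": "disk-panic:latest",
})
```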

## Deployment

The env is deployed as a Hugging Face Space (Docker SDK). The FastAPI server
is wired by `openenv.core.create_fastapi_app` and exposes the standard
OpenEnv endpoints: `/reset`, `/step`, `/state`, `/schema`, `/health`, `/ws`,
`/metadata`, `/web`.

## Layout

```
8-DiskPanic/
├── inference.py          # required at root per hackathon spec
├── Dockerfile
├── openenv.yaml
├── requirements.txt
├── README.md
└── disk_panic/
    ├── __init__.py       # exports DiskPanicEnv, DiskPanicAction, DiskPanicObservation
    ├── models.py         # Pydantic Action + Observation
    ├── client.py         # EnvClient subclass
    └── server/
        ├── app.py            # FastAPI app via create_fastapi_app
        ├── environment.py    # DiskPanicEnvironment
        ├── scenarios.py      # the 3 task builders
        ├── graders.py        # deterministic reward functions
        └── vfs.py            # in-memory virtual FS + command parser
```