	---
	title: DiskPanic OpenEnv
	emoji: 💥
	colorFrom: red
	colorTo: yellow
	sdk: docker
	app_port: 8000
	pinned: false
	license: apache-2.0
	---

	# DiskPanic — SRE Incident Response OpenEnv

	A real-world RL environment where an LLM agent plays an on-call Site Reliability
	Engineer responding to a production incident: **the root filesystem is full and
	app.service has crashed.** The agent must free space, restart the service, and
	preserve business-critical audit logs — the wrong `rm -rf` tanks the reward.

	> Built for the OpenEnv Round 1 Hackathon by Yash Pravin Pawar's team.

	## Why this env

	Every SRE has lived this exact 3am nightmare. The env tests three skills:
	1. Diagnosis — finding the bloated file with `du` / `ls` / `find`
	2. Surgical deletion — removing the right thing without touching protected dirs
	3. Recovery — restarting services and (on hard) dropping a logrotate config to
	stop a runaway writer

	The reward signal is dense: the agent sees its score climb as disk usage drops,
	gets a bonus for restoring the service, and is penalized if the SHA-256 of
	`/var/log/audit/` changes.
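
The audit check can be pictured as a single fingerprint taken over every file under the protected directory. The following is a minimal sketch, assuming the virtual filesystem is a flat dict mapping absolute paths to bytes; `audit_fingerprint` and that dict layout are illustrative and not taken from `graders.py` or `vfs.py`.

```python
import hashlib


def audit_fingerprint(vfs: dict, prefix: str = "/var/log/audit/") -> str:
    """Return one SHA-256 hex digest covering every file under `prefix`.

    Assumption: `vfs` maps absolute paths to bytes. Paths are visited in
    sorted order so the digest does not depend on dict insertion order.
    """
    digest = hashlib.sha256()
    for path in sorted(vfs):
        if path.startswith(prefix):
            digest.update(path.encode())
            # Hash each file separately so path and content cannot blur together.
            digest.update(hashlib.sha256(vfs[path]).digest())
    return digest.hexdigest()


# Hypothetical filesystem contents, for illustration only.
vfs = {
    "/var/log/audit/audit.log": b"login ok\n",
    "/var/log/nginx/access.log.1": b"x" * 1024,
}
baseline = audit_fingerprint(vfs)

# Deleting an unrelated log leaves the fingerprint alone.
del vfs["/var/log/nginx/access.log.1"]
unchanged = audit_fingerprint(vfs) == baseline

# Touching the audit directory changes it, which is what the penalty keys on.
del vfs["/var/log/audit/audit.log"]
changed = audit_fingerprint(vfs) != baseline
```

Comparing a digest taken at reset with one taken at grading time is enough to detect deletion, truncation, or edits without storing a copy of the audit files.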

	## Tasks

	| ID | Scenario | Graded on |
	|----|----------|-----------|
	| `easy` | One 8.7 GiB rotated nginx log is filling the disk. | Disk usage < 80% + audit dir untouched |
	| `medium` | Disk full + `app.service` has failed. | disk(0.4) + service(0.4) + audit(0.2) |
	| `hard` | Same + a runaway writer grows `/var/log/app/runaway.log` by 100 MiB every tick. | disk(0.3) + service(0.3) + audit(0.2) + logrotate config(0.2) |

	All graders return a scalar in `[0.0, 1.0]`.
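
The weights in the table combine as a plain weighted sum. Below is a minimal sketch of that arithmetic for `medium` and `hard`, assuming each component has already been scored in `[0.0, 1.0]`; the names `WEIGHTS` and `grade` are hypothetical, and how the real graders score each component (for example, partial credit for disk usage) is not shown here. The `easy` task is left out because the table gives no weights for it.

```python
# Component weights copied from the task table.
WEIGHTS = {
    "medium": {"disk": 0.4, "service": 0.4, "audit": 0.2},
    "hard": {"disk": 0.3, "service": 0.3, "audit": 0.2, "logrotate": 0.2},
}


def clamp(value: float) -> float:
    """Clamp a score into [0.0, 1.0]."""
    return min(1.0, max(0.0, value))


def grade(task_id: str, components: dict) -> float:
    """Weighted sum of per-component scores; missing components count as 0."""
    weights = WEIGHTS[task_id]
    total = sum(w * clamp(components.get(name, 0.0)) for name, w in weights.items())
    return clamp(total)


# An agent that freed the disk and kept the audit logs intact, but never
# restarted the service or wrote a logrotate config.
partial = grade("hard", {"disk": 1.0, "service": 0.0, "audit": 1.0, "logrotate": 0.0})
perfect = grade("medium", {"disk": 1.0, "service": 1.0, "audit": 1.0})
```

Because the weights in each row sum to 1.0, a run that satisfies every component reaches the top of the range, and skipping any one of them caps the score below it.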

	## Action space

	`DiskPanicAction(command: str)` — a single bash-lite command per step. Supported:

	```
	df                         ls <path>          du <path>
	cat <path>                 find <path>        sha256sum <path>
	rm [-rf] <path>            systemctl is-active|status|start|restart <svc>
	echo "content" > <path>    (for writing files like logrotate configs)
	```

	## Observation space

	`DiskPanicObservation`:

	- `stdout: str` — output of the last command
	- `df_output: str` — current simulated `df -h /`
	- `service_status: str` — `active` / `inactive` / `failed`
	- `task_id: str` — current task (`easy` | `medium` | `hard`)
	- `step: int`
	- `last_error: Optional[str]`
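
For readers who want the field list as a type, here is a rough stand-in built on the standard library's `dataclasses`. The project's `models.py` uses Pydantic, so this is only an approximation of the shape; `ObservationSketch`, the validation in `__post_init__`, and the sample `df` text are all illustrative assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Optional

SERVICE_STATES = {"active", "inactive", "failed"}
TASK_IDS = {"easy", "medium", "hard"}


@dataclass
class ObservationSketch:
    """Stdlib stand-in mirroring the documented observation fields."""

    stdout: str
    df_output: str
    service_status: str
    task_id: str
    step: int
    last_error: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject values outside the documented enumerations.
        if self.service_status not in SERVICE_STATES:
            raise ValueError(f"unknown service_status: {self.service_status!r}")
        if self.task_id not in TASK_IDS:
            raise ValueError(f"unknown task_id: {self.task_id!r}")


# Made-up sample values; the real `df_output` format may differ.
obs = ObservationSketch(
    stdout="",
    df_output="Filesystem  Size  Used  Avail  Use%  Mounted on\n/dev/sda1   10G   10G   0      100%  /",
    service_status="failed",
    task_id="medium",
    step=0,
)
payload = asdict(obs)
```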

	## Safety & sandbox

	The env does not touch the real filesystem. All state lives in a Python dict
	representing a virtual filesystem. Commands are parsed via `shlex` and
	dispatched to whitelisted operations: no `subprocess` and no shell expansion,
	so agent input never reaches a host shell. This keeps the env deterministic,
	safe, and fast (runs easily on 2 vCPU / 8 GB RAM).
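
The parse-then-dispatch pattern described above might look roughly like the sketch below. It covers only `ls`, `cat`, and `rm`, and assumes a flat dict of path to content; the handler names, the `(stdout, error)` return convention, and the error strings are hypothetical and not taken from `vfs.py`.

```python
import shlex


def cmd_ls(vfs, args):
    """List every stored path under the given directory."""
    prefix = (args[0] if args else "/").rstrip("/") + "/"
    return "\n".join(sorted(p for p in vfs if p.startswith(prefix))), None


def cmd_cat(vfs, args):
    """Return the content of a single file."""
    if len(args) != 1 or args[0] not in vfs:
        return "", "cat: no such file"
    return vfs[args[0]], None


def cmd_rm(vfs, args):
    """Delete files; directories need a flag containing `r`, as with `rm -rf`."""
    recursive = any(a.startswith("-") and "r" in a for a in args)
    for target in (a for a in args if not a.startswith("-")):
        if target in vfs:
            del vfs[target]
            continue
        children = [p for p in vfs if p.startswith(target.rstrip("/") + "/")]
        if not (recursive and children):
            return "", f"rm: cannot remove {target}"
        for path in children:
            del vfs[path]
    return "", None


# The allow-list: anything not named here is rejected before it can run.
HANDLERS = {"ls": cmd_ls, "cat": cmd_cat, "rm": cmd_rm}


def run(vfs, command):
    """Tokenize with shlex and dispatch to an allow-listed handler."""
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. an unterminated quote
        return "", f"parse error: {exc}"
    if not argv:
        return "", "empty command"
    handler = HANDLERS.get(argv[0])
    if handler is None:
        return "", f"command not supported: {argv[0]}"
    return handler(vfs, argv[1:])
```

Since `shlex.split` only tokenizes and never executes anything, substitutions such as `$(...)` or backticks arrive as inert strings, and an unknown command name simply fails the dict lookup.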

	## Running locally

	```bash
	# 1. Install
	pip install -r requirements.txt

	# 2. Build the Docker image
	docker build -t disk-panic:latest .

	# 3. Set env vars
	export HF_TOKEN="<your-key>"   # Groq key or HF token
	export API_BASE_URL=https://api.groq.com/openai/v1
	export MODEL_NAME=llama-3.3-70b-versatile
	export IMAGE_NAME=disk-panic:latest

	# 4. Run inference (all 3 tasks)
	python inference.py
	```

	## Deployment

	The env is deployed as a Hugging Face Space (Docker SDK). The FastAPI server
	is wired by `openenv.core.create_fastapi_app` and exposes the standard
	OpenEnv endpoints: `/reset`, `/step`, `/state`, `/schema`, `/health`, `/ws`,
	`/metadata`, `/web`.

	## Layout

	```
	8-DiskPanic/
	├── inference.py # required at root per hackathon spec
	├── Dockerfile
	├── openenv.yaml
	├── requirements.txt
	├── README.md
	└── disk_panic/
	    ├── __init__.py      # exports DiskPanicEnv, DiskPanicAction, DiskPanicObservation
	    ├── models.py        # Pydantic Action + Observation
	    ├── client.py        # EnvClient subclass
	    └── server/
	        ├── app.py           # FastAPI app via create_fastapi_app
	        ├── environment.py   # DiskPanicEnvironment
	        ├── scenarios.py     # the 3 task builders
	        ├── graders.py       # deterministic reward functions
	        └── vfs.py           # in-memory virtual FS + command parser
	```