---
title: DiskPanic OpenEnv
emoji: 💥
colorFrom: red
colorTo: yellow
sdk: docker
app_port: 8000
pinned: false
license: apache-2.0
---

# DiskPanic: SRE Incident Response OpenEnv

A real-world RL environment where an LLM agent plays an on-call Site Reliability
Engineer responding to a production incident: **the root filesystem is full and
app.service has crashed.** The agent must free space, restart the service, and
preserve business-critical audit logs. The wrong `rm -rf` tanks the reward.

> Built for the OpenEnv Round 1 Hackathon by **Yash Pravin Pawar's team**.

## Why this env

Every SRE has lived this exact 3am nightmare. The env tests three skills:
1. **Diagnosis**: finding the bloated file with `du` / `ls` / `find`
2. **Surgical deletion**: removing the right thing without touching protected dirs
3. **Recovery**: restarting services and (on hard) dropping a logrotate config to
   stop a runaway writer

The reward signal is dense: the agent sees its score climb as disk usage drops,
gets a bonus for restoring the service, and is penalized if the SHA-256 of
`/var/log/audit/` changes.
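
The audit check described above can be sketched as a single digest over every file under the protected directory. This is an illustrative sketch, not the project's grader: the function name `audit_digest`, the flat `path -> bytes` dict layout, and the sample file contents are all assumptions.

```python
import hashlib


def audit_digest(vfs: dict[str, bytes], prefix: str = "/var/log/audit/") -> str:
    """Return one SHA-256 hex digest covering every file under `prefix`.

    Hypothetical helper: assumes the virtual FS is a flat dict of path -> bytes.
    """
    h = hashlib.sha256()
    # Sort paths so the digest does not depend on dict insertion order.
    for path in sorted(p for p in vfs if p.startswith(prefix)):
        h.update(path.encode())
        h.update(b"\0")
        h.update(vfs[path])
        h.update(b"\0")
    return h.hexdigest()


# Made-up sample data for illustration only.
vfs = {
    "/var/log/audit/audit.log": b"login ok\n",
    "/var/log/nginx/access.log.1": b"x" * 1024,
}
baseline = audit_digest(vfs)

# Deleting a file outside the audit dir leaves the digest unchanged.
del vfs["/var/log/nginx/access.log.1"]
unchanged = audit_digest(vfs) == baseline

# Deleting an audit file changes the digest, which a grader could penalize.
del vfs["/var/log/audit/audit.log"]
changed = audit_digest(vfs) != baseline
```

Hashing the path along with the contents means renaming or removing an audit file is detected as well as editing one.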

## Tasks

| ID | Scenario | Graded on |
|----|----------|-----------|
| `easy` | One 8.7 GiB rotated nginx log is filling the disk. | Disk usage < 80% + audit dir untouched |
| `medium` | Disk full + `app.service` has failed. | disk(0.4) + service(0.4) + audit(0.2) |
| `hard` | Same + a runaway writer grows `/var/log/app/runaway.log` by 100 MiB every tick. | disk(0.3) + service(0.3) + audit(0.2) + logrotate config(0.2) |

All graders return a scalar in `[0.0, 1.0]`.
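
As a rough illustration of how the weights in the table combine, the sketch below computes a weighted sum for `medium` and `hard`. The names `WEIGHTS` and `grade` are hypothetical, and it assumes each component is itself a score in `[0.0, 1.0]`; the `easy` task is left out because the table gives no weights for it.

```python
# Weights copied from the task table; everything else here is an assumption.
WEIGHTS = {
    "medium": {"disk": 0.4, "service": 0.4, "audit": 0.2},
    "hard": {"disk": 0.3, "service": 0.3, "audit": 0.2, "logrotate": 0.2},
}


def clamp(x: float) -> float:
    """Clamp a value into [0.0, 1.0]."""
    return max(0.0, min(1.0, x))


def grade(task_id: str, components: dict[str, float]) -> float:
    """Weighted sum of per-component scores; missing components count as 0."""
    weights = WEIGHTS[task_id]
    total = sum(w * clamp(components.get(name, 0.0)) for name, w in weights.items())
    return round(clamp(total), 6)


# Everything fixed on medium: full score.
medium_full = grade("medium", {"disk": 1.0, "service": 1.0, "audit": 1.0})

# Hard task solved except for the logrotate config: 0.3 + 0.3 + 0.2.
hard_no_logrotate = grade("hard", {"disk": 1.0, "service": 1.0, "audit": 1.0})
```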

## Action space

`DiskPanicAction(command: str)`: a single bash-lite command per step. Supported:

```
df                         ls <path>        du <path>
cat <path>                 find <path>      sha256sum <path>
rm [-rf] <path>            systemctl is-active|status|start|restart <svc>
echo "content" > <path>    (for writing files like logrotate configs)
```

## Observation space

`DiskPanicObservation`:

- `stdout: str`: output of the last command
- `df_output: str`: current simulated `df -h /`
- `service_status: str`: `active` / `inactive` / `failed`
- `task_id: str`: current task (`easy` | `medium` | `hard`)
- `step: int`
- `last_error: Optional[str]`
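
To make the shape of the two types concrete, here is a stand-in using stdlib dataclasses. The real models are Pydantic classes in `disk_panic/models.py`; the dataclass form, the `None` default for `last_error`, and the sample values are assumptions made so the sketch runs without extra dependencies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskPanicAction:
    command: str  # one bash-lite command, e.g. "du /var/log"


@dataclass
class DiskPanicObservation:
    stdout: str            # output of the last command
    df_output: str         # current simulated `df -h /`
    service_status: str    # "active" / "inactive" / "failed"
    task_id: str           # "easy" | "medium" | "hard"
    step: int
    last_error: Optional[str] = None  # default is an assumption


# Made-up values, only to show how the fields fit together.
action = DiskPanicAction(command="systemctl is-active app.service")
obs = DiskPanicObservation(
    stdout="failed",
    df_output="(simulated df -h / output)",
    service_status="failed",
    task_id="medium",
    step=1,
)
```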

## Safety & sandbox

The env does not touch the real filesystem. Everything is a Python dict
representing a virtual filesystem. Commands are parsed via `shlex` and
dispatched to whitelisted operations: no `subprocess`, no shell expansion,
no escape surface. This keeps the env deterministic, safe, and fast
(runs easily on 2 vCPU / 8 GB RAM).
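
The parse-then-dispatch pattern described above might look roughly like this. It is a minimal sketch under stated assumptions, not the contents of `vfs.py`: the handler names, the `ALLOWED` table, the `(stdout, error)` return shape, and the flat `path -> size` dict are all hypothetical, and only `ls` and `rm` are shown.

```python
import shlex
from typing import Optional


def cmd_ls(vfs: dict, args: list) -> str:
    """List every virtual path under the given directory (full paths, for brevity)."""
    prefix = (args[0].rstrip("/") + "/") if args else "/"
    return "\n".join(sorted(p for p in vfs if p.startswith(prefix)))


def cmd_rm(vfs: dict, args: list) -> str:
    """Remove a file, or everything under a directory; flags such as -rf are ignored."""
    for target in (a for a in args if not a.startswith("-")):
        dir_prefix = target.rstrip("/") + "/"
        for path in [p for p in vfs if p == target or p.startswith(dir_prefix)]:
            del vfs[path]
    return ""


# Only commands listed here can ever run; nothing is passed to a real shell.
ALLOWED = {"ls": cmd_ls, "rm": cmd_rm}


def run(vfs: dict, command: str) -> tuple[str, Optional[str]]:
    """Return (stdout, error) for one command string."""
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. an unbalanced quote
        return "", f"parse error: {exc}"
    if not argv:
        return "", "empty command"
    handler = ALLOWED.get(argv[0])
    if handler is None:
        return "", f"command not allowed: {argv[0]}"
    return handler(vfs, argv[1:]), None


# Made-up virtual filesystem: path -> size in bytes.
vfs = {"/var/log/nginx/access.log.1": 100, "/var/log/audit/audit.log": 5}
rm_out, rm_err = run(vfs, "rm -rf /var/log/nginx")
ls_out, ls_err = run(vfs, "ls /var/log")
bad_out, bad_err = run(vfs, "curl http://example.com")
```

Because `shlex.split` only tokenizes, characters such as `;`, `|`, or `$(...)` arrive as plain arguments and are never interpreted.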

## Running locally

```bash
# 1. Install
pip install -r requirements.txt

# 2. Build the Docker image
docker build -t disk-panic:latest .

# 3. Set env vars
export HF_TOKEN=<your-key>                   # Groq key or HF token
export API_BASE_URL=https://api.groq.com/openai/v1
export MODEL_NAME=llama-3.3-70b-versatile
export IMAGE_NAME=disk-panic:latest

# 4. Run inference (all 3 tasks)
python inference.py
```

## Deployment

The env is deployed as a Hugging Face Space (Docker SDK). The FastAPI server
is wired by `openenv.core.create_fastapi_app` and exposes the standard
OpenEnv endpoints: `/reset`, `/step`, `/state`, `/schema`, `/health`, `/ws`,
`/metadata`, `/web`.

## Layout

```
8-DiskPanic/
├── inference.py             # required at root per hackathon spec
├── Dockerfile
├── openenv.yaml
├── requirements.txt
├── README.md
└── disk_panic/
    ├── __init__.py          # exports DiskPanicEnv, DiskPanicAction, DiskPanicObservation
    ├── models.py            # Pydantic Action + Observation
    ├── client.py            # EnvClient subclass
    └── server/
        ├── app.py           # FastAPI app via create_fastapi_app
        ├── environment.py   # DiskPanicEnvironment
        ├── scenarios.py     # the 3 task builders
        ├── graders.py       # deterministic reward functions
        └── vfs.py           # in-memory virtual FS + command parser
```