huggingmenfordays committed · Commit f28440f · verified · 1 Parent(s): 4732653

Upload folder using huggingface_hub

README.md CHANGED
@@ -235,6 +235,8 @@ the root `Dockerfile` installs from `pyproject.toml` and `uv.lock` via `uv sync
 
 for `openenv push`, the CLI stages the repository and promotes `server/Dockerfile` to the deployment root. `server/Dockerfile` mirrors the same metadata driven build path so local docker builds and pushed builds stay aligned.
 
+the FastAPI server also includes a minimal OpenEnv web compatibility shim so pushed Spaces serve [`/web`](sysadmin_env/server.py) successfully. the shim keeps the existing [`/reset`](sysadmin_env/server.py), [`/step`](sysadmin_env/server.py), [`/state`](sysadmin_env/server.py), and [`/ws`](sysadmin_env/server.py) interfaces unchanged while adding helper routes at [`/web`](sysadmin_env/server.py), [`/web/metadata`](sysadmin_env/server.py), [`/web/reset`](sysadmin_env/server.py), [`/web/step`](sysadmin_env/server.py), and [`/web/state`](sysadmin_env/server.py).
+
 ```bash
 docker build -t sysadmin-env .
 ```
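The helper routes named in this README change can be called from any HTTP client. Below is a minimal sketch that only builds the two POST requests, using the payload shapes that appear elsewhere in this commit (`{"task_id": ...}` for `/web/reset` and `{"action": {"command": ..., "reasoning": ...}}` for `/web/step`). The base URL and the helper names `build_reset_request` and `build_step_request` are assumptions made for illustration; they are not part of the repository.

```python
import json
from typing import Optional
from urllib.request import Request

# Assumption: the server is reachable locally on port 8000; adjust as needed.
BASE_URL = "http://localhost:8000"


def _json_post(path: str, body: dict) -> Request:
    """Build (but do not send) a JSON POST request for the given route."""
    data = json.dumps(body).encode("utf-8")
    return Request(
        BASE_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_reset_request(task_id: Optional[str] = None) -> Request:
    """Hypothetical helper: request body mirrors what the shim page sends to /web/reset."""
    return _json_post("/web/reset", {"task_id": task_id})


def build_step_request(command: str, reasoning: Optional[str] = None) -> Request:
    """Hypothetical helper: request body follows the /web/step route contract."""
    return _json_post("/web/step", {"action": {"command": command, "reasoning": reasoning}})
```

Passing either request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be read without a running server.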
sysadmin_env.egg-info/PKG-INFO CHANGED
@@ -136,7 +136,7 @@ reward is shaped over the full trajectory.
 the baseline inference script supports the required submission variables.
 
 ```dotenv
-# preferred submission credential. `OPENAI_API_KEY` and `API_KEY` are also accepted.
+# submission credential used by the baseline agent.
 HF_TOKEN="your_api_key_here"
 MODEL_NAME="gpt-5.4"
 API_BASE_URL="https://api.openai.com/v1"
@@ -154,6 +154,13 @@ EPISODE_TIMEOUT_SECONDS="600"
 
 this repository follows the standard layout expected after `openenv init`.
 
+`openenv push` performs a stricter structure check than `openenv validate`. to satisfy that check, the repository includes thin required shim files at the root and under `server/`.
+
+- `__init__.py`
+- `client.py`
+- `models.py`
+- `server/Dockerfile`
+
 ```bash
 openenv validate
 ```
@@ -170,7 +177,7 @@ for reproducible local setup, `pyproject.toml` and `uv.lock` define the environm
 uv sync --extra dev
 ```
 
-the repository targets python 3.11 for validator parity and includes `.python-version` for `uv` discovery. if your host only exposes a newer incompatible interpreter, install the matching runtime first.
+the repository targets python 3.11 for validator parity and includes `.python-version` for `uv` discovery. if your host only exposes a newer interpreter outside this supported range, install the matching runtime first.
 
 ```bash
 uv python install 3.11
@@ -242,6 +249,10 @@ this repository is prepared for a Hugging Face docker space and the runtime entr
 
 the root `Dockerfile` installs from `pyproject.toml` and `uv.lock` via `uv sync --locked`, then starts the environment with `uv run server`.
 
+for `openenv push`, the CLI stages the repository and promotes `server/Dockerfile` to the deployment root. `server/Dockerfile` mirrors the same metadata driven build path so local docker builds and pushed builds stay aligned.
+
+the FastAPI server also includes a minimal OpenEnv web compatibility shim so pushed Spaces serve [`/web`](sysadmin_env/server.py) successfully. the shim keeps the existing [`/reset`](sysadmin_env/server.py), [`/step`](sysadmin_env/server.py), [`/state`](sysadmin_env/server.py), and [`/ws`](sysadmin_env/server.py) interfaces unchanged while adding helper routes at [`/web`](sysadmin_env/server.py), [`/web/metadata`](sysadmin_env/server.py), [`/web/reset`](sysadmin_env/server.py), [`/web/step`](sysadmin_env/server.py), and [`/web/state`](sysadmin_env/server.py).
+
 ```bash
 docker build -t sysadmin-env .
 ```
sysadmin_env.egg-info/SOURCES.txt CHANGED
@@ -1,5 +1,7 @@
 README.md
+client.py
 inference.py
+models.py
 pyproject.toml
 server/__init__.py
 server/app.py
sysadmin_env.egg-info/top_level.txt CHANGED
@@ -1,3 +1,5 @@
+client
 inference
+models
 server
 sysadmin_env
sysadmin_env/server.py CHANGED
@@ -12,6 +12,7 @@ from fastapi import FastAPI
 from fastapi import HTTPException
 from fastapi import WebSocket
 from fastapi import WebSocketDisconnect
+from fastapi.responses import HTMLResponse
 from fastapi.responses import JSONResponse
 from pydantic import ValidationError
 
@@ -140,6 +141,7 @@ class EpisodeManager:
 
 def create_app() -> FastAPI:
     manager = EpisodeManager(base_dir=Path.cwd() / "assets")
+    web_metadata_payload = _build_web_metadata()
 
     @asynccontextmanager
     async def lifespan(app: FastAPI):
@@ -156,12 +158,7 @@ def create_app() -> FastAPI:
     app.state.episode_manager = manager
     app.state.http_session = HttpSessionState()
 
-    @app.get("/health")
-    async def health() -> JSONResponse:
-        return JSONResponse({"status": "ok"})
-
-    @app.post("/reset", response_model=StepResult)
-    async def reset(payload: ResetRequest | None = None) -> StepResult:
+    async def reset_episode(payload: ResetRequest | None = None) -> StepResult:
         manager: EpisodeManager = app.state.episode_manager
         session: HttpSessionState = app.state.http_session
         if session.episode is not None:
@@ -191,8 +188,7 @@ def create_app() -> FastAPI:
         session.last_state = state
         return StepResult(observation=observation, state=state)
 
-    @app.post("/step", response_model=StepResult)
-    async def step(payload: StepRequest) -> StepResult:
+    async def step_episode(payload: StepRequest) -> StepResult:
         manager: EpisodeManager = app.state.episode_manager
         session: HttpSessionState = app.state.http_session
         if session.episode is None or session.episode_id is None:
@@ -208,6 +204,18 @@ def create_app() -> FastAPI:
             session.episode = None
         return StepResult(observation=observation, state=state)
 
+    @app.get("/health")
+    async def health() -> JSONResponse:
+        return JSONResponse({"status": "ok"})
+
+    @app.post("/reset", response_model=StepResult)
+    async def reset(payload: ResetRequest | None = None) -> StepResult:
+        return await reset_episode(payload)
+
+    @app.post("/step", response_model=StepResult)
+    async def step(payload: StepRequest) -> StepResult:
+        return await step_episode(payload)
+
     @app.get("/state", response_model=EnvironmentState)
     async def state() -> EnvironmentState:
         session: HttpSessionState = app.state.http_session
@@ -215,6 +223,30 @@ def create_app() -> FastAPI:
             raise HTTPException(status_code=404, detail="episode not initialized")
         return session.last_state
 
+    @app.get("/web", response_class=HTMLResponse)
+    @app.get("/web/", response_class=HTMLResponse)
+    async def web_interface() -> str:
+        return _render_web_interface_html()
+
+    @app.get("/web/metadata")
+    async def web_metadata() -> JSONResponse:
+        return JSONResponse(web_metadata_payload)
+
+    @app.post("/web/reset")
+    async def web_reset(payload: ResetRequest | None = None) -> JSONResponse:
+        result = await reset_episode(payload)
+        return JSONResponse(_build_web_step_result(result))
+
+    @app.post("/web/step")
+    async def web_step(payload: dict[str, Any]) -> JSONResponse:
+        result = await step_episode(_parse_web_step_request(payload))
+        return JSONResponse(_build_web_step_result(result))
+
+    @app.get("/web/state")
+    async def web_state() -> JSONResponse:
+        session: HttpSessionState = app.state.http_session
+        return JSONResponse(_build_web_state(session))
+
     @app.get("/tasks")
     async def tasks() -> JSONResponse:
         manager: EpisodeManager = app.state.episode_manager
@@ -323,6 +355,174 @@ def _merge_stderr(stderr: str, extra: str) -> str:
     return f"{stderr.rstrip()}\n{extra}"
 
 
+def _build_web_metadata() -> dict[str, Any]:
+    return {
+        "name": "sysadmin-env",
+        "description": "Shell-based sysadmin environment with OpenEnv-compatible web shim routes.",
+        "readme_content": _load_readme_content(),
+        "documentation_url": "/docs",
+    }
+
+
+def _load_readme_content() -> str | None:
+    readme_path = Path(__file__).resolve().parents[1] / "README.md"
+    try:
+        return readme_path.read_text(encoding="utf-8")
+    except OSError:
+        return None
+
+
+def _build_web_step_result(result: StepResult) -> dict[str, Any]:
+    observation = result.observation.model_dump()
+    return {
+        "observation": observation,
+        "reward": result.observation.reward,
+        "done": result.observation.done,
+        "state": result.state.model_dump(),
+    }
+
+
+def _build_web_state(session: HttpSessionState) -> dict[str, Any]:
+    if session.last_state is None:
+        return {
+            "episode_id": None,
+            "task_id": None,
+            "step_count": 0,
+            "max_steps": 0,
+            "done": False,
+            "reward": 0.0,
+            "initialized": False,
+        }
+
+    payload = session.last_state.model_dump()
+    payload["initialized"] = True
+    return payload
+
+
+def _parse_web_step_request(payload: dict[str, Any]) -> StepRequest:
+    action_payload = payload.get("action", payload)
+    if not isinstance(action_payload, dict):
+        raise HTTPException(status_code=422, detail="action payload must be an object")
+
+    try:
+        action = Action.model_validate(action_payload)
+    except ValidationError as exc:
+        raise HTTPException(status_code=422, detail=exc.errors()) from exc
+
+    return StepRequest(action=action)
+
+
+def _render_web_interface_html() -> str:
+    return """<!doctype html>
+<html lang=\"en\">
+<head>
+<meta charset=\"utf-8\">
+<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">
+<title>sysadmin-env web shim</title>
+<style>
+body { font-family: system-ui, sans-serif; margin: 2rem auto; max-width: 960px; padding: 0 1rem; }
+h1, h2 { margin-bottom: 0.5rem; }
+.panel { border: 1px solid #d0d7de; border-radius: 8px; padding: 1rem; margin-bottom: 1rem; }
+.row { display: flex; gap: 0.75rem; flex-wrap: wrap; margin-bottom: 0.75rem; }
+input, select, button, textarea { font: inherit; padding: 0.5rem; }
+input, select, textarea { min-width: 240px; }
+textarea { width: 100%; min-height: 6rem; }
+pre { background: #0d1117; color: #e6edf3; padding: 1rem; overflow-x: auto; border-radius: 6px; }
+code { background: #f6f8fa; padding: 0.1rem 0.3rem; border-radius: 4px; }
+</style>
+</head>
+<body>
+<h1>sysadmin-env web compatibility shim</h1>
+<p>This page exposes the OpenEnv-compatible helper routes for the existing FastAPI environment without changing the primary HTTP or websocket API.</p>
+
+<div class=\"panel\">
+<h2>Reset</h2>
+<div class=\"row\">
+<select id=\"task-id\"></select>
+<button id=\"reset-button\" type=\"button\">POST /web/reset</button>
+<button id=\"state-button\" type=\"button\">GET /web/state</button>
+<button id=\"metadata-button\" type=\"button\">GET /web/metadata</button>
+</div>
+</div>
+
+<div class=\"panel\">
+<h2>Step</h2>
+<div class=\"row\">
+<input id=\"command\" type=\"text\" placeholder=\"echo hello\">
+<input id=\"reasoning\" type=\"text\" placeholder=\"optional reasoning\">
+<button id=\"step-button\" type=\"button\">POST /web/step</button>
+</div>
+<p>Route contract: <code>{\"action\": {\"command\": \"...\", \"reasoning\": \"...\"}}</code></p>
+</div>
+
+<div class=\"panel\">
+<h2>Response</h2>
+<pre id=\"output\">loading tasks...</pre>
+</div>
+
+<script>
+const output = document.getElementById('output');
+const taskSelect = document.getElementById('task-id');
+
+async function showResponse(response) {
+const text = await response.text();
+try {
+output.textContent = JSON.stringify(JSON.parse(text), null, 2);
+} catch {
+output.textContent = text;
+}
+}
+
+async function loadTasks() {
+const response = await fetch('/tasks');
+const payload = await response.json();
+taskSelect.innerHTML = payload.tasks.map((task) => `<option value="${task.task_id}">${task.task_id}</option>`).join('');
+output.textContent = JSON.stringify(payload, null, 2);
+}
+
+document.getElementById('reset-button').addEventListener('click', async () => {
+const response = await fetch('/web/reset', {
+method: 'POST',
+headers: { 'Content-Type': 'application/json' },
+body: JSON.stringify({ task_id: taskSelect.value || null }),
+});
+await showResponse(response);
+});
+
+document.getElementById('step-button').addEventListener('click', async () => {
+const payload = {
+action: {
+command: document.getElementById('command').value,
+reasoning: document.getElementById('reasoning').value || null,
+},
+};
+const response = await fetch('/web/step', {
+method: 'POST',
+headers: { 'Content-Type': 'application/json' },
+body: JSON.stringify(payload),
+});
+await showResponse(response);
+});
+
+document.getElementById('state-button').addEventListener('click', async () => {
+const response = await fetch('/web/state');
+await showResponse(response);
+});
+
+document.getElementById('metadata-button').addEventListener('click', async () => {
+const response = await fetch('/web/metadata');
+await showResponse(response);
+});
+
+loadTasks().catch((error) => {
+output.textContent = `Failed to load tasks: ${error.message}`;
+});
+</script>
+</body>
+</html>
+"""
+
+
 def _build_environment_state(episode: EpisodeState, episode_id: str, observation: Observation) -> EnvironmentState:
     return EnvironmentState(
         episode_id=episode_id,
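One detail of `_parse_web_step_request` is easy to miss: `payload.get("action", payload)` lets `/web/step` accept both the wrapped form `{"action": {...}}` and a bare action object. The sketch below isolates that normalization using only the standard library. `ActionSketch` is a hypothetical stand-in for the project's pydantic `Action` model (only the `command` and `reasoning` fields are visible in this commit), and it raises `ValueError` where the real route raises `HTTPException(422)`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ActionSketch:
    """Hypothetical stand-in for the pydantic `Action` model used by the server."""

    command: str
    reasoning: Optional[str] = None


def normalize_step_payload(payload: dict[str, Any]) -> ActionSketch:
    """Accept either {"action": {...}} or a bare action object, as the shim does."""
    # Fall back to the payload itself when there is no "action" wrapper.
    action_payload = payload.get("action", payload)
    if not isinstance(action_payload, dict):
        raise ValueError("action payload must be an object")
    command = action_payload.get("command")
    if not isinstance(command, str):
        # Simplified validation; the real code delegates this to pydantic.
        raise ValueError("command must be a string")
    return ActionSketch(command=command, reasoning=action_payload.get("reasoning"))
```

Both `normalize_step_payload({"action": {"command": "ls"}})` and `normalize_step_payload({"command": "ls"})` produce the same action, which is the compatibility behaviour the shim relies on.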
tests/test_server.py CHANGED
@@ -94,6 +94,60 @@ def test_http_step_requires_reset(monkeypatch):
     assert response.status_code == 409
 
 
+def test_web_routes_expose_compatibility_shim_and_delegate_to_current_environment(monkeypatch):
+    client = _make_client(monkeypatch)
+
+    web_response = client.get("/web")
+    assert web_response.status_code == 200
+    assert "sysadmin-env web compatibility shim" in web_response.text
+
+    metadata_response = client.get("/web/metadata")
+    assert metadata_response.status_code == 200
+    metadata_payload = metadata_response.json()
+    assert metadata_payload["name"] == "sysadmin-env"
+    assert "OpenEnv-compatible web shim routes" in metadata_payload["description"]
+
+    pre_reset_state_response = client.get("/web/state")
+    assert pre_reset_state_response.status_code == 200
+    assert pre_reset_state_response.json() == {
+        "episode_id": None,
+        "task_id": None,
+        "step_count": 0,
+        "max_steps": 0,
+        "done": False,
+        "reward": 0.0,
+        "initialized": False,
+    }
+
+    reset_response = client.post("/web/reset", json={"task_id": "nginx_crash"})
+    assert reset_response.status_code == 200
+    reset_payload = reset_response.json()
+    assert reset_payload["state"]["task_id"] == "nginx_crash"
+    assert reset_payload["observation"]["step_number"] == 0
+    assert reset_payload["reward"] == 0.0
+    assert reset_payload["done"] is False
+
+    step_response = client.post("/web/step", json={"action": {"command": "echo hello"}})
+    assert step_response.status_code == 200
+    step_payload = step_response.json()
+    assert step_payload["observation"]["stdout"] == "ran echo hello"
+    assert step_payload["state"]["step_count"] == 1
+    assert step_payload["reward"] == step_payload["observation"]["reward"]
+    assert step_payload["done"] is False
+
+    post_step_state_response = client.get("/web/state")
+    assert post_step_state_response.status_code == 200
+    assert post_step_state_response.json() == {
+        "episode_id": reset_payload["state"]["episode_id"],
+        "task_id": "nginx_crash",
+        "step_count": 1,
+        "max_steps": 40,
+        "done": False,
+        "reward": step_payload["reward"],
+        "initialized": True,
+    }
+
+
 def test_websocket_handles_valid_invalid_and_timeout_actions(monkeypatch):
     client = _make_client(monkeypatch)
 
153