---
title: Self-Healing DevOps Sandbox
emoji: 🔧
colorFrom: red
colorTo: green
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
  - openenv
---

# Self-Healing DevOps Sandbox

An OpenEnv RL environment where an AI agent is dropped into a **broken Node.js backend** inside a Docker container. The agent must use **bash commands only** to diagnose bugs, edit files, and fix the app -- just like a real DevOps engineer would.

Built for the **Meta PyTorch OpenEnv Hackathon**.

---

## What Is This?

A 3-task challenge of increasing difficulty. The agent starts in a Docker container with a broken Express.js app in `/app` and must make all endpoints healthy.

| # | Difficulty | Bug               | What's Wrong                           |
|---|------------|-------------------|----------------------------------------|
| 1 | Easy       | `config.json`     | Port set to `9999` instead of `3000`   |
| 2 | Medium     | `routes/users.js` | Missing `)` causes a SyntaxError crash |
| 3 | Hard       | `routes/data.js`  | Missing `await` causes HTTP 500        |

**Goal:** Fix all bugs so these endpoints return HTTP 200:

- `GET /health` returns `{"status": "ok"}`
- `GET /api/users` returns `{"users": [...]}`
- `GET /api/data` returns `{"records": [...]}`

---

## Scoring (Partial Rewards)

The grader runs **after every command** and awards cumulative points:

| Milestone                        | Points | Total    |
|----------------------------------|--------|----------|
| App starts on port 3000          | +0.35  | 0.35     |
| `/health` returns 200            | +0.10  | 0.45     |
| `/api/users` returns valid JSON  | +0.15  | 0.60     |
| `/api/data` returns valid JSON   | +0.25  | 0.85     |
| All endpoints correct            | +0.15  | **1.00** |

---

## Getting Started

### Prerequisites

- **Python 3.10+**
- **Docker Desktop** (running)
- **uv** package manager (`pip install uv`)

### 1. Install Dependencies

```bash
cd devops_sandbox
uv sync
```

### 2. Build the Sandbox Docker Image

```bash
docker build -t devops-sandbox-node:latest -f simulated_app/Dockerfile simulated_app/
```

### 3. Start the Environment Server

```bash
uv run server
```

The server starts at `http://localhost:8000`.

### 4. Run the Baseline Agent

In a **separate terminal**:

```bash
# Set your OpenAI API key
export OPENAI_API_KEY="sk-..."       # Linux/Mac
$env:OPENAI_API_KEY = "sk-..."       # PowerShell

# Run the baseline
uv run python baseline.py
```

---

## Test Your Own Agent

### Option A: Use the Python Client

```python
from devops_sandbox import BashAction, DevopsSandboxEnv

with DevopsSandboxEnv(base_url="http://localhost:8000").sync() as env:
    # Reset creates a fresh Docker container
    result = env.reset()
    print(result.observation.stdout)        # Task description
    print(result.observation.grader_score)  # 0.0

    # Send bash commands
    result = env.step(BashAction(command="cat /app/config.json"))
    print(result.observation.stdout)        # File contents
    print(result.observation.grader_score)  # Score after grading

    # Fix a bug
    result = env.step(BashAction(command="sed -i 's/9999/3000/' /app/config.json"))
    print(result.observation.grader_score)  # Partial score

    # Check if done
    if result.done:
        print("Episode complete!")
```

### Option B: Use the REST API Directly

```bash
# Reset the environment
curl -X POST http://localhost:8000/reset

# Send a command
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action": {"command": "ls -la /app"}}'
```

### Option C: Use the WebSocket Endpoint

Connect to `ws://localhost:8000/ws` for persistent sessions.
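However you drive the environment, the `grader_score` you read back follows the cumulative schedule from the Scoring section. Below is a hedged sketch of that schedule only -- the milestone names are invented for illustration, and the real grader (in `server/devops_sandbox_environment.py`) curls the live container rather than summing flags:

```python
# Hypothetical milestone names; only the point values come from the
# scoring table in this README.
MILESTONES = [
    ("app_on_3000", 0.35),  # App starts on port 3000
    ("health_200", 0.10),   # /health returns 200
    ("users_json", 0.15),   # /api/users returns valid JSON
    ("data_json", 0.25),    # /api/data returns valid JSON
    ("all_correct", 0.15),  # All endpoints correct
]

def grader_score(passed: set[str]) -> float:
    """Cumulative score for the milestones currently passing."""
    return round(sum(pts for name, pts in MILESTONES if name in passed), 2)
```

For example, with only bug 1 fixed the app boots and `/health` answers, so `grader_score({"app_on_3000", "health_200"})` gives `0.45`, matching the table's running total.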
---

## Project Structure

```
devops_sandbox/
|-- openenv.yaml       # OpenEnv manifest
|-- pyproject.toml     # Python dependencies
|-- README.md          # This file
|-- baseline.py        # LLM-powered baseline agent
|-- models.py          # BashAction & TerminalObservation schemas
|-- client.py          # Python client for the environment
|
|-- server/
|   |-- app.py                         # FastAPI server (entry point)
|   +-- devops_sandbox_environment.py  # Environment logic + grader
|
+-- simulated_app/     # The broken Node.js app (Docker context)
    |-- Dockerfile     # node:20-slim sandbox container
    |-- package.json   # Express.js project
    |-- server.js      # Main entry point
    |-- config.json    # Bug 1: wrong port
    +-- routes/
        |-- users.js   # Bug 2: syntax error
        +-- data.js    # Bug 3: missing await
```

---

## How It Works

```
+-----------+    BashAction     +------------+    docker exec    +--------------+
|   Agent   | ----------------> |  OpenEnv   | ----------------> |    Docker    |
| (LLM/RL)  |                   |   Server   |                   |  Container   |
|           | <---------------- |   (8000)   | <---------------- | (broken app) |
+-----------+    Observation    +-----+------+   stdout/stderr   +--------------+
                 + grader_score       |
                                +-----+------+
                                |   Grader   |
                                | (curl test |
                                | endpoints) |
                                +------------+
```

1. **Agent** sends a `BashAction` (e.g., `cat /app/config.json`)
2. **Server** runs it inside the Docker container via `docker exec`
3. **Grader** restarts the Node app and curls all endpoints
4. **Observation** returns: stdout, stderr, score (0.0-1.0), feedback

---

## Configuration

| Env Variable         | Default                 | Description                     |
|----------------------|-------------------------|---------------------------------|
| `OPENAI_API_KEY`     | *(required)*            | OpenAI API key for the baseline |
| `OPENAI_MODEL`       | `gpt-4o-mini`           | LLM model to use                |
| `OPENAI_BASE_URL`    | *(OpenAI default)*      | Custom endpoint (Ollama, vLLM)  |
| `MAX_TURNS`          | `30`                    | Max steps per episode           |
| `DEVOPS_SANDBOX_URL` | `http://localhost:8000` | Environment server URL          |

### Use with Local LLMs (Ollama, vLLM)

```bash
export OPENAI_BASE_URL="http://localhost:11434/v1"
export OPENAI_MODEL="llama3"
export OPENAI_API_KEY="dummy"
uv run python baseline.py
```

---

## Validation

```bash
uv run openenv validate
# Expected: [OK] devops_sandbox: Ready for multi-mode deployment
```

---

## License

BSD-style license. See LICENSE for details.
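As a closing reference, the defaults from the Configuration table can be picked up with plain `os.environ` lookups. This is only a sketch of how a custom agent might read them -- the actual `baseline.py` may resolve its settings differently:

```python
import os

# Defaults mirror the Configuration table in this README.
# OPENAI_API_KEY has no fallback here because it is required.
model = os.environ.get("OPENAI_MODEL", "gpt-4o-mini")
base_url = os.environ.get("OPENAI_BASE_URL")  # None -> OpenAI's default endpoint
max_turns = int(os.environ.get("MAX_TURNS", "30"))
sandbox_url = os.environ.get("DEVOPS_SANDBOX_URL", "http://localhost:8000")
```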