---

title: Self-Healing DevOps Sandbox
emoji: 🔧
colorFrom: red
colorTo: green
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
  - openenv
---


# Self-Healing DevOps Sandbox

An OpenEnv RL environment where an AI agent is dropped into a **broken Node.js backend** inside a Docker container. The agent must use **bash commands only** to diagnose bugs, edit files, and fix the app, just as a real DevOps engineer would.

Built for the **Meta PyTorch OpenEnv Hackathon**.

---

## What Is This?

A 3-task challenge of increasing difficulty. The agent starts in a Docker container with a broken Express.js app in `/app` and must make all endpoints healthy.

| # | Difficulty | File              | What's Wrong                          |
|---|------------|-------------------|---------------------------------------|
| 1 | Easy       | `config.json`     | Port set to `9999` instead of `3000`  |
| 2 | Medium     | `routes/users.js` | Missing `)` causes SyntaxError crash  |
| 3 | Hard       | `routes/data.js`  | Missing `await` causes HTTP 500       |

**Goal:** Fix all bugs so these endpoints return HTTP 200:
- `GET /health` returns `{"status": "ok"}`
- `GET /api/users` returns `{"users": [...]}`
- `GET /api/data` returns `{"records": [...]}`

---

## Scoring (Partial Rewards)

The grader runs **after every command** and awards cumulative points:

| Milestone                        | Points | Total    |
|----------------------------------|--------|----------|
| App starts on port 3000          | +0.35  | 0.35     |
| `/health` returns 200            | +0.10  | 0.45     |
| `/api/users` returns valid JSON  | +0.15  | 0.60     |
| `/api/data` returns valid JSON   | +0.25  | 0.85     |
| All endpoints correct            | +0.15  | **1.00** |
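
As a rough illustration of how the cumulative total in the table adds up, the sketch below sums the points for whichever milestones currently pass. The milestone key names and the `grader_score` helper are hypothetical, and the sketch assumes milestones are simply additive; the actual grader in `server/devops_sandbox_environment.py` may gate later milestones on earlier ones.

```python
# Points copied from the scoring table; the key names are hypothetical labels.
MILESTONES = [
    ("app_starts", 0.35),   # App starts on port 3000
    ("health_ok", 0.10),    # /health returns 200
    ("users_ok", 0.15),     # /api/users returns valid JSON
    ("data_ok", 0.25),      # /api/data returns valid JSON
    ("all_correct", 0.15),  # All endpoints correct
]


def grader_score(passed: set[str]) -> float:
    """Sum the points of every milestone in `passed` (assumed additive)."""
    total = sum(points for name, points in MILESTONES if name in passed)
    # Round to two decimals to hide floating-point noise in the sum.
    return round(total, 2)
```

Under these assumptions, an agent that has fixed only the port (app starts and `/health` responds) would sit at 0.45, matching the second row of the table.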

---

## Getting Started

### Prerequisites

- **Python 3.10+**
- **Docker Desktop** (running)
- **uv** package manager (`pip install uv`)

### 1. Install Dependencies

```bash
cd devops_sandbox
uv sync
```

### 2. Build the Sandbox Docker Image

```bash
docker build -t devops-sandbox-node:latest -f simulated_app/Dockerfile simulated_app/
```

### 3. Start the Environment Server

```bash
uv run server
```

The server starts at `http://localhost:8000`.

### 4. Run the Baseline Agent

In a **separate terminal**:

```bash
# Set your OpenAI API key (Linux/macOS)
export OPENAI_API_KEY="sk-..."
# PowerShell equivalent: $env:OPENAI_API_KEY = "sk-..."

# Run the baseline
uv run python baseline.py
```

---

## Test Your Own Agent

### Option A: Use the Python Client

```python
from devops_sandbox import BashAction, DevopsSandboxEnv

with DevopsSandboxEnv(base_url="http://localhost:8000").sync() as env:
    # Reset creates a fresh Docker container
    result = env.reset()
    print(result.observation.stdout)        # Task description
    print(result.observation.grader_score)  # 0.0

    # Send bash commands
    result = env.step(BashAction(command="cat /app/config.json"))
    print(result.observation.stdout)        # File contents
    print(result.observation.grader_score)  # Score after grading

    # Fix a bug
    result = env.step(BashAction(command="sed -i 's/9999/3000/' /app/config.json"))
    print(result.observation.grader_score)  # Partial score

    # Check if done
    if result.done:
        print("Episode complete!")
```
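
For a scripted (non-LLM) agent, the reset/step pattern above can be wrapped in a small loop. The following is a minimal sketch: `run_script` is a hypothetical helper, not part of the `devops_sandbox` package, and it assumes only what the example above shows (`reset()`, `step()`, `observation.grader_score`, and `done`).

```python
from typing import Any, Callable, Iterable


def run_script(env: Any, make_action: Callable[[str], Any],
               commands: Iterable[str]) -> float:
    """Send commands in order and return the last grader score seen.

    `env` is any object with reset()/step() shaped like the client above;
    `make_action` turns a command string into an action object.
    """
    result = env.reset()
    score = result.observation.grader_score
    for command in commands:
        result = env.step(make_action(command))
        score = result.observation.grader_score
        if result.done:
            break  # Episode finished; skip the remaining commands.
    return score
```

With the real client this might be called inside the `with` block as `run_script(env, lambda c: BashAction(command=c), ["cat /app/config.json", "sed -i 's/9999/3000/' /app/config.json"])`.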

### Option B: Use the REST API Directly

```bash
# Reset the environment
curl -X POST http://localhost:8000/reset

# Send a command
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action": {"command": "ls -la /app"}}'
```
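
The same two calls can be made from Python with only the standard library. This is a sketch that mirrors the `curl` commands above; the helper names are hypothetical, and `send` assumes the server replies with a JSON body, which the `curl` examples do not confirm.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_reset_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /reset with no body, like `curl -X POST .../reset`."""
    return urllib.request.Request(f"{base_url}/reset", method="POST")


def build_step_request(command: str,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /step with the same JSON payload as the curl example."""
    body = json.dumps({"action": {"command": command}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send a request and decode the reply (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send(build_reset_request())` followed by `send(build_step_request("ls -la /app"))` corresponds to the two `curl` calls, provided the server is running.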

### Option C: Use the WebSocket Endpoint

Connect to `ws://localhost:8000/ws` for persistent sessions.

---

## Project Structure

```
devops_sandbox/
|-- openenv.yaml                 # OpenEnv manifest
|-- pyproject.toml               # Python dependencies
|-- README.md                    # This file
|-- baseline.py                  # LLM-powered baseline agent
|-- models.py                    # BashAction & TerminalObservation schemas
|-- client.py                    # Python client for the environment
|
|-- server/
|   |-- app.py                   # FastAPI server (entry point)
|   +-- devops_sandbox_environment.py  # Environment logic + grader
|
+-- simulated_app/               # The broken Node.js app (Docker context)
    |-- Dockerfile               # node:20-slim sandbox container
    |-- package.json             # Express.js project
    |-- server.js                # Main entry point
    |-- config.json              # Bug 1: wrong port
    +-- routes/
        |-- users.js             # Bug 2: syntax error
        +-- data.js              # Bug 3: missing await
```

---

## How It Works

```
+-----------+   BashAction    +------------+   docker exec   +--------------+
|  Agent    | --------------> |  OpenEnv   | --------------> |  Docker      |
| (LLM/RL)  |                 |  Server    |                 |  Container   |
|           | <-------------- |  (8000)    | <-------------- |  (broken app)|
+-----------+  Observation    +-----+------+   stdout/stderr +--------------+
               + grader_score       |
                              +-----+------+
                              |   Grader   |
                              | (curl test |
                              |  endpoints)|
                              +------------+
```

1. **Agent** sends a `BashAction` (e.g., `cat /app/config.json`)
2. **Server** runs it inside the Docker container via `docker exec`
3. **Grader** restarts the Node app and curls all endpoints
4. **Observation** returns: stdout, stderr, score (0.0-1.0), feedback
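
The endpoint check in step 3 can be pictured with the sketch below. It is not the grader's actual code (the real grader uses `curl` inside the container); `check_endpoints`, `http_fetch`, and `EXPECTED` are hypothetical names, and the validity rules are taken from the goal list earlier in this README.

```python
import json
import urllib.request
from typing import Callable, Tuple

# Expected response shape per endpoint, taken from the goal list.
EXPECTED = {
    "/health": lambda body: body.get("status") == "ok",
    "/api/users": lambda body: isinstance(body.get("users"), list),
    "/api/data": lambda body: isinstance(body.get("records"), list),
}


def http_fetch(path: str, base_url: str = "http://localhost:3000") -> Tuple[int, str]:
    """Fetch one endpoint; urlopen raises on connection errors and HTTP 4xx/5xx."""
    with urllib.request.urlopen(base_url + path, timeout=5) as response:
        return response.status, response.read().decode("utf-8")


def check_endpoints(fetch: Callable[[str], Tuple[int, str]] = http_fetch) -> dict[str, bool]:
    """Return, per endpoint, whether it answered 200 with the expected JSON."""
    results = {}
    for path, is_valid in EXPECTED.items():
        try:
            status, text = fetch(path)
            body = json.loads(text)
            results[path] = status == 200 and isinstance(body, dict) and is_valid(body)
        except (OSError, ValueError):
            # Connection refused, HTTP error, or a body that is not JSON.
            results[path] = False
    return results
```

Passing `fetch` as a parameter keeps the check testable without a running app; a fake fetcher can stand in for the container.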

---

## Configuration

| Env Variable        | Default                  | Description                        |
|--------------------|--------------------------|------------------------------------|
| `OPENAI_API_KEY`    | *(required)*             | OpenAI API key for baseline        |
| `OPENAI_MODEL`      | `gpt-4o-mini`            | LLM model to use                   |
| `OPENAI_BASE_URL`   | *(OpenAI default)*       | Custom endpoint (Ollama, vLLM)     |
| `MAX_TURNS`         | `30`                     | Max steps per episode              |
| `DEVOPS_SANDBOX_URL`| `http://localhost:8000`  | Environment server URL             |
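
One way an agent script might read these variables with the defaults from the table is sketched below. `load_config` and the returned key names are hypothetical; `baseline.py` may read its settings differently.

```python
import os
from typing import Mapping


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read settings from environment variables, applying the table's defaults."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return {
        "api_key": api_key,
        "model": env.get("OPENAI_MODEL", "gpt-4o-mini"),
        # None means "use the OpenAI default endpoint".
        "base_url": env.get("OPENAI_BASE_URL"),
        "max_turns": int(env.get("MAX_TURNS", "30")),
        "sandbox_url": env.get("DEVOPS_SANDBOX_URL", "http://localhost:8000"),
    }
```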

### Use with Local LLMs (Ollama, vLLM)

```bash
export OPENAI_BASE_URL="http://localhost:11434/v1"
export OPENAI_MODEL="llama3"
export OPENAI_API_KEY="dummy"
uv run python baseline.py
```

---

## Validation

```bash
uv run openenv validate
# Expected: [OK] devops_sandbox: Ready for multi-mode deployment
```

---

## License

BSD-style license. See LICENSE for details.