sentinel_env / USER_GUIDE.md
KaushikSarveswaran's picture
Initial submission: OpenEnv-Sentinel SRE triage environment
33dd3ee
# OpenEnv-Sentinel β€” User Guide
Step-by-step guide for running, validating, and deploying the SRE Incident Triage environment.
---
## Table of Contents
1. [Prerequisites](#1-prerequisites)
2. [Local Setup](#2-local-setup)
3. [Running the Server Locally](#3-running-the-server-locally)
4. [Manual Validation β€” Local Server](#4-manual-validation--local-server)
5. [Docker Build & Validation](#5-docker-build--validation)
6. [Running Inference (LLM Agent)](#6-running-inference-llm-agent)
7. [OpenEnv Validate](#7-openenv-validate)
8. [Deploy to Hugging Face Spaces](#8-deploy-to-hugging-face-spaces)
9. [Troubleshooting](#9-troubleshooting)
---
## 1. Prerequisites
| Tool | Version | Purpose |
|---|---|---|
| Python | β‰₯ 3.10 | Runtime |
| pip / pipenv | Latest | Dependency management |
| Docker | Latest | Container build & test |
| Git | Latest | Version control, HF push |
| huggingface-cli | Latest | HF Spaces deployment |
| openenv-core CLI | β‰₯ 0.2.3 | `openenv validate` / `openenv push` |
Install the OpenEnv CLI and Hugging Face CLI:
```bash
pip install openenv-core huggingface-hub[cli]
```
---
## 2. Local Setup
### Option A β€” pip (quick)
```bash
cd openenv-sentinel
pip install -e ".[dev,inference]"
```
### Option B β€” pipenv (isolated)
```bash
cd openenv-sentinel
pipenv install --python 3.12
pipenv install -e ".[dev]"
pipenv install openai httpx websockets
pipenv shell
```
All subsequent commands assume you are inside the virtual environment.
---
## 3. Running the Server Locally
Start the FastAPI server on port 8000:
```bash
uvicorn server.app:app --host 0.0.0.0 --port 8000
```
You should see:
```
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```
Verify with:
```bash
curl http://localhost:8000/health
# β†’ {"status":"ok"}
curl http://localhost:8000/schema
# β†’ JSON with action, observation, state schemas
```
---
## 4. Manual Validation β€” Local Server
With the server running (from step 3), validate the environment in a **second terminal**.
### 4.1 Automated test script
The quickest way to validate all 3 tasks end-to-end:
```bash
pip install websockets httpx # if not already installed
python test_local.py
```
Expected output:
```
Health: {'status': 'ok'}
Schema: action fields=['tool_name', 'parameters']
==================================================
TASK 1
==================================================
Reset OK: CRITICAL: payment-api returning HTTP 500 errors...
Step 1 (status payment-api): reward=0.11
Step 2 (logs payment-api): reward=0.11
Resolution: score=0.75, done=True
State: final_score=1.0, root_cause_correct=True, recommendation_correct=True
... (Tasks 2 & 3 similar) ...
βœ… ALL TESTS PASSED
```
### 4.2 Manual cURL validation (HTTP endpoints)
> **Note:** HTTP endpoints are stateless β€” each request creates a fresh environment
> instance. Use these for single-shot checks only. For multi-step episodes, use
> the WebSocket endpoint (section 4.3).
**Health check:**
```bash
curl http://localhost:8000/health
```
**Schema check:**
```bash
curl http://localhost:8000/schema | python -m json.tool
```
**Reset (single-shot):**
```bash
curl -X POST http://localhost:8000/reset \
-H "Content-Type: application/json" \
-d '{"task_id": 1}'
```
### 4.3 Manual WebSocket validation (stateful sessions)
Multi-step episodes require WebSocket because the server maintains session state
across messages. Install `websocat` or use Python:
**Using Python interactively:**
```python
import asyncio, json, websockets
async def manual_test():
async with websockets.connect("ws://localhost:8000/ws") as ws:
# 1. Reset to Task 1
await ws.send(json.dumps({"type": "reset", "data": {"task_id": 1}}))
resp = json.loads(await ws.recv())
print("Reset:", json.dumps(resp["data"]["observation"]["incident_summary"]))
# 2. Call a diagnostic tool
await ws.send(json.dumps({
"type": "step",
"data": {
"tool_name": "get_service_status",
"parameters": {"service": "payment-api"}
}
}))
resp = json.loads(await ws.recv())
print("Step 1:", resp["data"]["observation"]["tool_output"][:200])
# 3. Submit resolution
await ws.send(json.dumps({
"type": "step",
"data": {
"tool_name": "submit_resolution",
"parameters": {
"root_cause": "Missing DB_CONNECTION_STRING after v2.3.1 deploy",
"affected_service": "payment-api",
"recommendation": "Rollback to v2.3.0 or set the env var"
}
}
}))
resp = json.loads(await ws.recv())
print("Done:", resp["data"]["done"], "Score:", resp["data"]["reward"])
# 4. Get final state
await ws.send(json.dumps({"type": "state"}))
resp = json.loads(await ws.recv())
print("Final score:", resp["data"]["final_score"])
asyncio.run(manual_test())
```
**Using websocat (CLI tool):**
```bash
brew install websocat # macOS
websocat ws://localhost:8000/ws
```
Then type JSON messages line by line:
```json
{"type": "reset", "data": {"task_id": 1}}
{"type": "step", "data": {"tool_name": "get_service_status", "parameters": {"service": "payment-api"}}}
{"type": "step", "data": {"tool_name": "submit_resolution", "parameters": {"root_cause": "Missing DB_CONNECTION_STRING", "affected_service": "payment-api", "recommendation": "Rollback to v2.3.0"}}}
{"type": "state"}
```
### 4.4 What to check
| Check | Expected |
|---|---|
| `/health` returns 200 | `{"status": "ok"}` |
| `/schema` returns action/observation/state schemas | Three top-level keys with JSON Schema properties |
| Reset with `task_id` 1, 2, 3 | Returns `incident_summary`, `available_tools` (7 tools), `done: false` |
| Diagnostic tool steps | Returns `tool_output` (non-empty), per-step `reward` |
| `submit_resolution` | Sets `done: true`, returns graded `reward` |
| State after resolution | `final_score` between 0.0–1.0, `root_cause_correct` bool |
| All 3 tasks produce scores > 0.0 with good resolutions | Task 1 β‰ˆ 1.0, Task 2 β‰ˆ 1.0, Task 3 β‰ˆ 1.0 (with ideal answers) |
---
## 5. Docker Build & Validation
### 5.1 Build the image
```bash
docker build -t sentinel-env:latest -f server/Dockerfile .
```
### 5.2 Run the container
```bash
docker run -p 8000:8000 sentinel-env:latest
```
The server starts on port 8000 inside the container, mapped to your host.
### 5.3 Validate against the container
Once the container is running, all the same validation steps from section 4 work:
```bash
# Health check
curl http://localhost:8000/health
# Run the automated test suite
python test_local.py
# Or run inference against the containerised server
ENV_URL=http://localhost:8000 python inference.py
```
### 5.4 Docker β€” useful commands
```bash
# Build with no cache (clean rebuild)
docker build --no-cache -t sentinel-env:latest -f server/Dockerfile .
# Run in background
docker run -d --name sentinel -p 8000:8000 sentinel-env:latest
# View logs
docker logs -f sentinel
# Stop and remove
docker stop sentinel && docker rm sentinel
# Check image size (should be < 500MB)
docker images sentinel-env
```
---
## 6. Running Inference (LLM Agent)
The inference script drives an LLM through all 3 tasks via WebSocket.
### 6.1 Set environment variables
The inference script supports **HF Inference** (default), **OpenAI**, and **Azure OpenAI** endpoints.
> **Important:** `ENV_URL` is the Sentinel environment server. `API_BASE_URL` is
> the LLM API endpoint (matching the official OpenEnv inference pattern).
**Option A β€” HF Inference API (default, for hackathon submission):**
```bash
export ENV_URL=http://localhost:8000 # env server
export API_BASE_URL=https://router.huggingface.co/v1 # default, can omit
export MODEL_NAME=openai/gpt-oss-120b:novita # default, can omit
export HF_TOKEN=hf_... # or API_KEY
```
**Option B β€” OpenAI:**
```bash
export ENV_URL=http://localhost:8000
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o
export API_KEY=sk-...
```
**Option C β€” Azure OpenAI (for local/enterprise testing):**
```bash
export ENV_URL=http://localhost:8000
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=your-azure-key
export MODEL_NAME=your-deployment-name # Azure deployment name
export AZURE_OPENAI_API_VERSION=2024-12-01-preview # optional, this is the default
```
> When `AZURE_OPENAI_ENDPOINT` is set, the script uses `AzureOpenAI` client.
> Otherwise it uses `OpenAI(base_url=API_BASE_URL, api_key=...)` β€” which
> covers both HF router and direct OpenAI.
### 6.2 Install inference dependencies
```bash
pip install openai websockets
```
### 6.3 Run
```bash
python inference.py
```
Expected output:
```
==================================================
Running Task 1...
==================================================
Task 1: 0.85
==================================================
Running Task 2...
==================================================
Task 2: 0.65
==================================================
Running Task 3...
==================================================
Task 3: 0.40
==================================================
Task 1: 0.85
Task 2: 0.65
Task 3: 0.40
Average: 0.63
==================================================
```
### 6.4 Inference against a remote HF Space
```bash
# HF model via HF router (hackathon default)
export ENV_URL=https://your-username-sentinel-env.hf.space
export HF_TOKEN=hf_...
python inference.py
# OpenAI model
export ENV_URL=https://your-username-sentinel-env.hf.space
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o
export API_KEY=sk-...
python inference.py
# Azure OpenAI
export ENV_URL=https://your-username-sentinel-env.hf.space
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=your-azure-key
export MODEL_NAME=your-deployment-name
python inference.py
```
---
## 7. OpenEnv Validate
Run the official OpenEnv validation to confirm spec compliance:
```bash
openenv validate
```
This checks:
- `openenv.yaml` manifest is valid
- The app entry point (`server.app:app`) is importable
- A `main()` function exists in the script entry point
- `uv.lock` is present and up to date
If `uv.lock` is missing or stale:
```bash
pip install uv
uv lock
openenv validate
```
---
## 8. Deploy to Hugging Face Spaces
### 8.1 Login to Hugging Face
```bash
huggingface-cli login
# Paste your HF token when prompted (needs write access)
```
### 8.2 Option A β€” `openenv push` (recommended)
```bash
openenv push
```
This reads `openenv.yaml` and pushes the environment as a Docker Space tagged
with `openenv`.
### 8.3 Option B β€” Manual HF Spaces deployment
**Step 1: Create the Space**
Go to https://huggingface.co/new-space and create a new Space:
- **Space name:** `sentinel-env` (or any name)
- **SDK:** Docker
- **Hardware:** CPU basic (2 vCPU, 16GB RAM β€” free tier)
- **Visibility:** Public
**Step 2: Clone the Space repo**
```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/sentinel-env hf-space
cd hf-space
```
**Step 3: Copy project files**
```bash
# Copy all source files
cp -r /path/to/openenv-sentinel/{models.py,__init__.py,client.py,inference.py} .
cp -r /path/to/openenv-sentinel/{server,scenarios,tools,grading} .
cp /path/to/openenv-sentinel/openenv.yaml .
cp /path/to/openenv-sentinel/pyproject.toml .
cp /path/to/openenv-sentinel/README.md .
# The Dockerfile must be at the repo root for HF Spaces
cp /path/to/openenv-sentinel/server/Dockerfile .
```
> **Important:** HF Spaces expects `Dockerfile` at the repository root. The
> COPY paths inside the Dockerfile already reference files relative to the
> build context (repo root), so no changes are needed.
**Step 4: Push to HF**
```bash
git add .
git commit -m "Deploy OpenEnv-Sentinel"
git push
```
**Step 5: Verify deployment**
The Space builds automatically. Once running:
```bash
curl https://YOUR_USERNAME-sentinel-env.hf.space/health
# β†’ {"status": "ok"}
```
### 8.4 Verify the deployed Space
```bash
# Health
curl https://YOUR_USERNAME-sentinel-env.hf.space/health
# Schema
curl https://YOUR_USERNAME-sentinel-env.hf.space/schema
# Run test_local.py against the Space (edit BASE_HTTP/BASE_WS in the file)
# Or run inference:
ENV_URL=https://YOUR_USERNAME-sentinel-env.hf.space \
HF_TOKEN=hf_... \
python inference.py
```
### 8.5 HF Spaces tips
- **Cold starts:** Free-tier Spaces sleep after inactivity. First request takes ~30s.
- **Logs:** View build & runtime logs in the Space's "Logs" tab on HF.
- **Environment variables:** Set secrets (like API keys) in Space Settings β†’ Repository secrets.
- **Tags:** Ensure the README frontmatter includes `tags: [openenv]` for hackathon discovery.
- **Port:** The `app_port: 8000` in README frontmatter must match the `EXPOSE` in the Dockerfile.
---
## 9. Troubleshooting
| Problem | Solution |
|---|---|
| `ModuleNotFoundError: No module named 'openenv'` | Run `pip install -e .` or `pip install openenv-core>=0.2.3` |
| `openenv validate` fails with "no main() found" | Ensure `server/app.py` has a `def main()` function and `[project.scripts]` in pyproject.toml |
| `openenv validate` fails with "uv.lock not found" | Run `pip install uv && uv lock` |
| WebSocket connection refused | Server must be running (`uvicorn server.app:app --port 8000`) |
| HTTP `/step` returns fresh state (not continuing episode) | HTTP endpoints are stateless. Use WebSocket `/ws` for multi-step episodes |
| Docker build fails on `COPY` | Run `docker build` from the project root (not from `server/`) |
| Docker healthcheck failing | Ensure `curl` is installed in the image (the Dockerfile does this) |
| `inference.py` error: "ENV_URL required" | `export ENV_URL=http://localhost:8000` |
| Azure OpenAI 401 / auth error | Verify `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and that `MODEL_NAME` matches your deployment name |
| HF Space shows "Building" forever | Check the Logs tab for build errors. Common: missing files in COPY |
| HF Space returns 502 | The app hasn't started yet (cold start) or crashed. Check runtime logs |
| Task score is 0.0 | The resolution keywords didn't match. Check grading criteria in HACKATHON_PLAN.md Β§5 |
| `websockets` not installed | `pip install websockets` |
---
## Quick Reference
```bash
# ── Local development ──
pip install -e ".[dev,inference]"
uvicorn server.app:app --port 8000 # start server
python test_local.py # validate all 3 tasks
openenv validate # check spec compliance
# ── Docker ──
docker build -t sentinel-env -f server/Dockerfile .
docker run -p 8000:8000 sentinel-env
# ── Inference (HF router β€” hackathon default) ──
export ENV_URL=http://localhost:8000
export HF_TOKEN=hf_...
python inference.py
# ── Inference (OpenAI) ──
export ENV_URL=http://localhost:8000
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o
export API_KEY=sk-...
python inference.py
# ── Inference (Azure OpenAI) ──
export ENV_URL=http://localhost:8000
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=your-azure-key
export MODEL_NAME=your-deployment-name
python inference.py
# ── Deploy ──
huggingface-cli login
openenv push
```