sentinel_env / USER_GUIDE.md

KaushikSarveswaran

Initial submission: OpenEnv-Sentinel SRE triage environment

33dd3ee about 1 month ago

	# OpenEnv-Sentinel — User Guide

	Step-by-step guide for running, validating, and deploying the SRE Incident Triage environment.

	---

	## Table of Contents

	1. [Prerequisites](#1-prerequisites)
	2. [Local Setup](#2-local-setup)
	3. [Running the Server Locally](#3-running-the-server-locally)
	4. [Manual Validation — Local Server](#4-manual-validation--local-server)
	5. [Docker Build & Validation](#5-docker-build--validation)
	6. [Running Inference (LLM Agent)](#6-running-inference-llm-agent)
	7. [OpenEnv Validate](#7-openenv-validate)
	8. [Deploy to Hugging Face Spaces](#8-deploy-to-hugging-face-spaces)
	9. [Troubleshooting](#9-troubleshooting)

	---

	## 1. Prerequisites

	| Tool | Version | Purpose |
	|---|---|---|
	| Python | ≥ 3.10 | Runtime |
	| pip / pipenv | Latest | Dependency management |
	| Docker | Latest | Container build & test |
	| Git | Latest | Version control, HF push |
	| huggingface-cli | Latest | HF Spaces deployment |
	| openenv-core CLI | ≥ 0.2.3 | `openenv validate` / `openenv push` |

	Install the OpenEnv CLI and Hugging Face CLI:

	```bash
	pip install openenv-core "huggingface-hub[cli]"
	```

	---

	## 2. Local Setup

	### Option A — pip (quick)

	```bash
	cd openenv-sentinel
	pip install -e ".[dev,inference]"
	```

	### Option B — pipenv (isolated)

	```bash
	cd openenv-sentinel
	pipenv install --python 3.12
	pipenv install -e ".[dev]"
	pipenv install openai httpx websockets
	pipenv shell
	```

	All subsequent commands assume you are inside the virtual environment.

	---

	## 3. Running the Server Locally

	Start the FastAPI server on port 8000:

	```bash
	uvicorn server.app:app --host 0.0.0.0 --port 8000
	```

	You should see:

	```
	INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
	```

	Verify with:

	```bash
	curl http://localhost:8000/health
	# → {"status":"ok"}

	curl http://localhost:8000/schema
	# → JSON with action, observation, state schemas
	```
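
The same readiness check can be scripted so that later steps wait for the server instead of failing on a connection error. The sketch below is a minimal example using only the standard library; the helper names `fetch_json` and `wait_for_health` are hypothetical and not part of the project, and the only assumption about the server is the `{"status": "ok"}` body shown above.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode its JSON body (standard library only)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_health(
    base_url: str,
    attempts: int = 10,
    delay: float = 1.0,
    fetch: Callable[[str], dict] = fetch_json,
) -> bool:
    """Poll /health until it reports status "ok" or the attempts run out."""
    url = f"{base_url.rstrip('/')}/health"
    for attempt in range(attempts):
        try:
            if fetch(url).get("status") == "ok":
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body: not ready yet.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example (requires the server from this section to be running):
# wait_for_health("http://localhost:8000")
```

The `fetch` parameter exists only so the polling logic can be exercised without a live server.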

	---

	## 4. Manual Validation — Local Server

	With the server running (from step 3), validate the environment in a second terminal.

	### 4.1 Automated test script

	The quickest way to validate all 3 tasks end-to-end:

	```bash
	pip install websockets httpx # if not already installed
	python test_local.py
	```

	Expected output:

	```
	Health: {'status': 'ok'}
	Schema: action fields=['tool_name', 'parameters']

	==================================================
	TASK 1
	==================================================
	Reset OK: CRITICAL: payment-api returning HTTP 500 errors...
	Step 1 (status payment-api): reward=0.11
	Step 2 (logs payment-api): reward=0.11
	Resolution: score=0.75, done=True
	State: final_score=1.0, root_cause_correct=True, recommendation_correct=True

	... (Tasks 2 & 3 similar) ...

	✅ ALL TESTS PASSED
	```

	### 4.2 Manual cURL validation (HTTP endpoints)

	> Note: HTTP endpoints are stateless — each request creates a fresh environment
	> instance. Use these for single-shot checks only. For multi-step episodes, use
	> the WebSocket endpoint (section 4.3).

	Health check:

	```bash
	curl http://localhost:8000/health
	```

	Schema check:

	```bash
	curl http://localhost:8000/schema | python -m json.tool
	```

	Reset (single-shot):

	```bash
	curl -X POST http://localhost:8000/reset \
	  -H "Content-Type: application/json" \
	  -d '{"task_id": 1}'
	```
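
For scripted checks, the single-shot reset can also be issued from Python. This is a rough equivalent of the cURL call using only the standard library; `build_reset_request` and `reset_once` are hypothetical helper names, and the shape of the response body is not assumed beyond it being JSON.

```python
import json
import urllib.request


def build_reset_request(base_url: str, task_id: int) -> urllib.request.Request:
    """Build the POST /reset request with a JSON body of {"task_id": N}."""
    body = json.dumps({"task_id": task_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/reset",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reset_once(base_url: str, task_id: int, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_reset_request(base_url, task_id)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires the server to be running):
# print(reset_once("http://localhost:8000", 1))
```

Because HTTP requests are stateless here, each `reset_once` call starts from a fresh environment instance.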

	### 4.3 Manual WebSocket validation (stateful sessions)

	Multi-step episodes require the WebSocket endpoint because the server keeps session
	state across messages. Drive it from Python or from the `websocat` CLI.

	Using Python interactively:

	```python
	import asyncio, json, websockets

	async def manual_test():
	    async with websockets.connect("ws://localhost:8000/ws") as ws:
	        # 1. Reset to Task 1
	        await ws.send(json.dumps({"type": "reset", "data": {"task_id": 1}}))
	        resp = json.loads(await ws.recv())
	        print("Reset:", json.dumps(resp["data"]["observation"]["incident_summary"]))

	        # 2. Call a diagnostic tool
	        await ws.send(json.dumps({
	            "type": "step",
	            "data": {
	                "tool_name": "get_service_status",
	                "parameters": {"service": "payment-api"}
	            }
	        }))
	        resp = json.loads(await ws.recv())
	        print("Step 1:", resp["data"]["observation"]["tool_output"][:200])

	        # 3. Submit resolution
	        await ws.send(json.dumps({
	            "type": "step",
	            "data": {
	                "tool_name": "submit_resolution",
	                "parameters": {
	                    "root_cause": "Missing DB_CONNECTION_STRING after v2.3.1 deploy",
	                    "affected_service": "payment-api",
	                    "recommendation": "Rollback to v2.3.0 or set the env var"
	                }
	            }
	        }))
	        resp = json.loads(await ws.recv())
	        print("Done:", resp["data"]["done"], "Score:", resp["data"]["reward"])

	        # 4. Get final state
	        await ws.send(json.dumps({"type": "state"}))
	        resp = json.loads(await ws.recv())
	        print("Final score:", resp["data"]["final_score"])

	asyncio.run(manual_test())
	```

	Using websocat (CLI tool):

	```bash
	brew install websocat # macOS
	websocat ws://localhost:8000/ws
	```

	Then type JSON messages line by line:

	```json
	{"type": "reset", "data": {"task_id": 1}}
	{"type": "step", "data": {"tool_name": "get_service_status", "parameters": {"service": "payment-api"}}}
	{"type": "step", "data": {"tool_name": "submit_resolution", "parameters": {"root_cause": "Missing DB_CONNECTION_STRING", "affected_service": "payment-api", "recommendation": "Rollback to v2.3.0"}}}
	{"type": "state"}
	```
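
The three message shapes above (`reset`, `step`, `state`) are easy to mistype by hand. A small set of builders, sketched below, produces the same JSON envelopes; the function names are hypothetical and not part of the project's client.

```python
import json


def reset_msg(task_id: int) -> str:
    """Envelope that starts a new episode for the given task."""
    return json.dumps({"type": "reset", "data": {"task_id": task_id}})


def step_msg(tool_name: str, **parameters) -> str:
    """Envelope that calls one tool; keyword arguments become its parameters."""
    return json.dumps(
        {"type": "step", "data": {"tool_name": tool_name, "parameters": parameters}}
    )


def state_msg() -> str:
    """Envelope that requests the current episode state."""
    return json.dumps({"type": "state"})


# Print the lines to paste into websocat, one message per line.
print(reset_msg(1))
print(step_msg("get_service_status", service="payment-api"))
print(state_msg())
```

Each printed line can be pasted into a `websocat` session or passed to `ws.send(...)` in a Python client.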

	### 4.4 What to check

	| Check | Expected |
	|---|---|
	| `/health` returns 200 | `{"status": "ok"}` |
	| `/schema` returns action/observation/state schemas | Three top-level keys with JSON Schema properties |
	| Reset with `task_id` 1, 2, 3 | Returns `incident_summary`, `available_tools` (7 tools), `done: false` |
	| Diagnostic tool steps | Returns `tool_output` (non-empty), per-step `reward` |
	| `submit_resolution` | Sets `done: true`, returns graded `reward` |
	| State after resolution | `final_score` between 0.0–1.0, `root_cause_correct` bool |
	| All 3 tasks produce scores > 0.0 with good resolutions | Task 1 ≈ 1.0, Task 2 ≈ 1.0, Task 3 ≈ 1.0 (with ideal answers) |
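
Some of the checks in the table can be expressed as assertions over the decoded WebSocket responses. The sketch below assumes the response layout used in section 4.3 (`resp["data"]["observation"]`, `resp["data"]["done"]`, `resp["data"]["final_score"]`); placing `available_tools` inside `observation` is an assumption, and both function names are hypothetical.

```python
def check_reset(resp: dict) -> list[str]:
    """Return a list of problems found in a reset response (empty means OK)."""
    problems = []
    data = resp.get("data", {})
    obs = data.get("observation", {})
    if not obs.get("incident_summary"):
        problems.append("incident_summary missing or empty")
    # Assumption: available_tools lives inside the observation.
    tools = obs.get("available_tools", [])
    if len(tools) != 7:
        problems.append(f"expected 7 available_tools, got {len(tools)}")
    if data.get("done") is not False:
        problems.append("done should be false right after reset")
    return problems


def check_final_state(resp: dict) -> list[str]:
    """Return a list of problems found in the state after a resolution."""
    problems = []
    data = resp.get("data", {})
    score = data.get("final_score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 1.0:
        problems.append(f"final_score should be within 0.0 to 1.0, got {score!r}")
    if not isinstance(data.get("root_cause_correct"), bool):
        problems.append("root_cause_correct should be a bool")
    return problems
```

Returning a list of problems rather than raising on the first failure makes it easier to see every mismatch in one run.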

	---

	## 5. Docker Build & Validation

	### 5.1 Build the image

	```bash
	docker build -t sentinel-env:latest -f server/Dockerfile .
	```

	### 5.2 Run the container

	```bash
	docker run -p 8000:8000 sentinel-env:latest
	```

	The server starts on port 8000 inside the container, mapped to your host.

	### 5.3 Validate against the container

	Once the container is running, all the same validation steps from section 4 work:

	```bash
	# Health check
	curl http://localhost:8000/health

	# Run the automated test suite
	python test_local.py

	# Or run inference against the containerised server
	ENV_URL=http://localhost:8000 python inference.py
	```

	### 5.4 Docker — useful commands

	```bash
	# Build with no cache (clean rebuild)
	docker build --no-cache -t sentinel-env:latest -f server/Dockerfile .

	# Run in background
	docker run -d --name sentinel -p 8000:8000 sentinel-env:latest

	# View logs
	docker logs -f sentinel

	# Stop and remove
	docker stop sentinel && docker rm sentinel

	# Check image size (should be < 500MB)
	docker images sentinel-env
	```

	---

	## 6. Running Inference (LLM Agent)

	The inference script drives an LLM through all 3 tasks via WebSocket.

	### 6.1 Set environment variables

	The inference script supports HF Inference (default), OpenAI, and Azure OpenAI endpoints.

	> Important: `ENV_URL` is the Sentinel environment server. `API_BASE_URL` is
	> the LLM API endpoint (matching the official OpenEnv inference pattern).

	Option A — HF Inference API (default, for hackathon submission):

	```bash
	export ENV_URL=http://localhost:8000 # env server
	export API_BASE_URL=https://router.huggingface.co/v1 # default, can omit
	export MODEL_NAME=openai/gpt-oss-120b:novita # default, can omit
	export HF_TOKEN=hf_... # or API_KEY
	```

	Option B — OpenAI:

	```bash
	export ENV_URL=http://localhost:8000
	export API_BASE_URL=https://api.openai.com/v1
	export MODEL_NAME=gpt-4o
	export API_KEY=sk-...
	```

	Option C — Azure OpenAI (for local/enterprise testing):

	```bash
	export ENV_URL=http://localhost:8000
	export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
	export AZURE_OPENAI_API_KEY=your-azure-key
	export MODEL_NAME=your-deployment-name # Azure deployment name
	export AZURE_OPENAI_API_VERSION=2024-12-01-preview # optional, this is the default
	```

	> When `AZURE_OPENAI_ENDPOINT` is set, the script uses the `AzureOpenAI` client.
	> Otherwise it uses `OpenAI(base_url=API_BASE_URL, api_key=...)`, which
	> covers both the HF router and direct OpenAI.
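
The selection rule described in the note can be sketched as a pure function over the environment variables. This is not the code in `inference.py`; it only mirrors the documented behaviour and defaults. The function name `select_llm_config` is hypothetical, and preferring `HF_TOKEN` over `API_KEY` when both are set is an assumption.

```python
import os
from typing import Mapping


def select_llm_config(env: Mapping[str, str] = os.environ) -> dict:
    """Describe which client and settings the documented rules would pick."""
    if env.get("AZURE_OPENAI_ENDPOINT"):
        return {
            "client": "AzureOpenAI",
            "azure_endpoint": env["AZURE_OPENAI_ENDPOINT"],
            "api_key": env.get("AZURE_OPENAI_API_KEY", ""),
            "api_version": env.get("AZURE_OPENAI_API_VERSION", "2024-12-01-preview"),
            "model": env.get("MODEL_NAME", ""),  # Azure deployment name
        }
    return {
        "client": "OpenAI",
        "base_url": env.get("API_BASE_URL", "https://router.huggingface.co/v1"),
        # Assumption: HF_TOKEN wins when both HF_TOKEN and API_KEY are set.
        "api_key": env.get("HF_TOKEN") or env.get("API_KEY", ""),
        "model": env.get("MODEL_NAME", "openai/gpt-oss-120b:novita"),
    }


# With only HF_TOKEN set, every other value falls back to the HF router defaults.
print(select_llm_config({"HF_TOKEN": "hf_example"}))
```

Printing the result before a run is a quick way to confirm which endpoint and model the exported variables resolve to.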

	### 6.2 Install inference dependencies

	```bash
	pip install openai websockets
	```

	### 6.3 Run

	```bash
	python inference.py
	```

	Expected output:

	```
	==================================================
	Running Task 1...
	==================================================
	Task 1: 0.85

	==================================================
	Running Task 2...
	==================================================
	Task 2: 0.65

	==================================================
	Running Task 3...
	==================================================
	Task 3: 0.40

	==================================================
	Task 1: 0.85
	Task 2: 0.65
	Task 3: 0.40
	Average: 0.63
	==================================================
	```
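
The `Average` line is the arithmetic mean of the per-task scores. As a small worked example using the illustrative numbers above (the function name `summarize` is hypothetical):

```python
def summarize(scores: dict[int, float]) -> str:
    """Format per-task scores and their mean, two decimals each."""
    lines = [f"Task {task_id}: {score:.2f}" for task_id, score in sorted(scores.items())]
    average = sum(scores.values()) / len(scores)
    lines.append(f"Average: {average:.2f}")
    return "\n".join(lines)


# (0.85 + 0.65 + 0.40) / 3 = 0.6333..., shown as 0.63
print(summarize({1: 0.85, 2: 0.65, 3: 0.40}))
```

Actual scores vary with the model and the run, so treat the numbers as placeholders.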

	### 6.4 Inference against a remote HF Space

	```bash
	# HF model via HF router (hackathon default)
	export ENV_URL=https://your-username-sentinel-env.hf.space
	export HF_TOKEN=hf_...
	python inference.py

	# OpenAI model
	export ENV_URL=https://your-username-sentinel-env.hf.space
	export API_BASE_URL=https://api.openai.com/v1
	export MODEL_NAME=gpt-4o
	export API_KEY=sk-...
	python inference.py

	# Azure OpenAI
	export ENV_URL=https://your-username-sentinel-env.hf.space
	export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
	export AZURE_OPENAI_API_KEY=your-azure-key
	export MODEL_NAME=your-deployment-name
	python inference.py
	```
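
Because the agent talks to the environment over WebSocket, an `https://` Space URL has to map to a `wss://` endpoint. How `inference.py` performs this mapping is not shown here; the sketch below is one plausible way to derive it, with the hypothetical helper `ws_url_from_env_url` and the `/ws` path taken from section 4.3.

```python
from urllib.parse import urlsplit, urlunsplit


def ws_url_from_env_url(env_url: str, path: str = "/ws") -> str:
    """Map an http(s) ENV_URL to the matching ws(s) endpoint."""
    parts = urlsplit(env_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"ENV_URL must start with http:// or https://, got {env_url!r}")
    scheme = "wss" if parts.scheme == "https" else "ws"
    base_path = parts.path.rstrip("/")  # tolerate a trailing slash
    return urlunsplit((scheme, parts.netloc, base_path + path, "", ""))


print(ws_url_from_env_url("http://localhost:8000"))
print(ws_url_from_env_url("https://your-username-sentinel-env.hf.space/"))
```

Using `wss://` for remote Spaces matters because they are served over HTTPS.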

	---

	## 7. OpenEnv Validate

	Run the official OpenEnv validation to confirm spec compliance:

	```bash
	openenv validate
	```

	This checks:
	- `openenv.yaml` manifest is valid
	- The app entry point (`server.app:app`) is importable
	- A `main()` function exists in the script entry point
	- `uv.lock` is present and up to date

	If `uv.lock` is missing or stale:

	```bash
	pip install uv
	uv lock
	openenv validate
	```
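
Before running `openenv validate`, a quick local pre-flight can catch the most common omissions. The sketch below only approximates the checks listed above by looking at the file system; it is not what the CLI does internally, and the function name `preflight` is hypothetical.

```python
import ast
from pathlib import Path


def preflight(project_root: str) -> list[str]:
    """Return a list of problems that would likely fail validation."""
    root = Path(project_root)
    problems = []

    # The manifest and lock file must exist at the project root.
    for name in ("openenv.yaml", "uv.lock"):
        if not (root / name).is_file():
            problems.append(f"{name} not found")

    # server/app.py should define a top-level main() function.
    app_path = root / "server" / "app.py"
    if not app_path.is_file():
        problems.append("server/app.py not found")
    else:
        tree = ast.parse(app_path.read_text(encoding="utf-8"))
        has_main = any(
            isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
            and node.name == "main"
            for node in tree.body
        )
        if not has_main:
            problems.append("server/app.py has no top-level main()")
    return problems


# Example: run from the project root.
# print(preflight(".") or "looks ready for openenv validate")
```

This does not check whether `uv.lock` is up to date; `uv lock` followed by `openenv validate` remains the authoritative test.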

	---

	## 8. Deploy to Hugging Face Spaces

	### 8.1 Login to Hugging Face

	```bash
	huggingface-cli login
	# Paste your HF token when prompted (needs write access)
	```

	### 8.2 Option A — `openenv push` (recommended)

	```bash
	openenv push
	```

	This reads `openenv.yaml` and pushes the environment as a Docker Space tagged
	with `openenv`.

	### 8.3 Option B — Manual HF Spaces deployment

	Step 1: Create the Space

	Go to https://huggingface.co/new-space and create a new Space:

	- Space name: `sentinel-env` (or any name)
	- SDK: Docker
	- Hardware: CPU basic (2 vCPU, 16GB RAM — free tier)
	- Visibility: Public

	Step 2: Clone the Space repo

	```bash
	git clone https://huggingface.co/spaces/YOUR_USERNAME/sentinel-env hf-space
	cd hf-space
	```

	Step 3: Copy project files

	```bash
	# Copy all source files
	cp -r /path/to/openenv-sentinel/{models.py,__init__.py,client.py,inference.py} .
	cp -r /path/to/openenv-sentinel/{server,scenarios,tools,grading} .
	cp /path/to/openenv-sentinel/openenv.yaml .
	cp /path/to/openenv-sentinel/pyproject.toml .
	cp /path/to/openenv-sentinel/README.md .

	# The Dockerfile must be at the repo root for HF Spaces
	cp /path/to/openenv-sentinel/server/Dockerfile .
	```

	> Important: HF Spaces expects `Dockerfile` at the repository root. The
	> COPY paths inside the Dockerfile already reference files relative to the
	> build context (repo root), so no changes are needed.

	Step 4: Push to HF

	```bash
	git add .
	git commit -m "Deploy OpenEnv-Sentinel"
	git push
	```

	Step 5: Verify deployment

	The Space builds automatically. Once running:

	```bash
	curl https://YOUR_USERNAME-sentinel-env.hf.space/health
	# → {"status": "ok"}
	```

	### 8.4 Verify the deployed Space

	```bash
	# Health
	curl https://YOUR_USERNAME-sentinel-env.hf.space/health

	# Schema
	curl https://YOUR_USERNAME-sentinel-env.hf.space/schema

	# Run test_local.py against the Space (edit BASE_HTTP/BASE_WS in the file)
	# Or run inference:
	ENV_URL=https://YOUR_USERNAME-sentinel-env.hf.space \
	HF_TOKEN=hf_... \
	python inference.py
	```

	### 8.5 HF Spaces tips

	- Cold starts: Free-tier Spaces sleep after inactivity. First request takes ~30s.
	- Logs: View build & runtime logs in the Space's "Logs" tab on HF.
	- Environment variables: Set secrets (like API keys) in Space Settings → Repository secrets.
	- Tags: Ensure the README frontmatter includes `tags: [openenv]` for hackathon discovery.
	- Port: The `app_port: 8000` in README frontmatter must match the `EXPOSE` in the Dockerfile.
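
The port tip above can be checked mechanically before pushing. The sketch below reads `app_port` from the README frontmatter and compares it with the `EXPOSE` lines in the Dockerfile. It uses plain regular expressions rather than a YAML parser, so it assumes a simple `app_port: 8000` line; all function names are hypothetical.

```python
import re
from typing import Optional


def readme_app_port(readme_text: str) -> Optional[int]:
    """Extract app_port from the leading '---' frontmatter block, if any."""
    block = re.match(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", readme_text, re.DOTALL)
    if not block:
        return None
    port = re.search(r"^app_port:\s*(\d+)\s*$", block.group(1), re.MULTILINE)
    return int(port.group(1)) if port else None


def dockerfile_exposed_ports(dockerfile_text: str) -> set[int]:
    """Collect numeric ports from EXPOSE instructions (e.g. 8000 or 8000/tcp)."""
    ports = set()
    for line in dockerfile_text.splitlines():
        match = re.match(r"\s*EXPOSE\s+(.+)", line, re.IGNORECASE)
        if not match:
            continue
        for token in match.group(1).split():
            number = token.split("/")[0]
            if number.isdigit():  # skip variables such as $PORT
                ports.add(int(number))
    return ports


def ports_match(readme_text: str, dockerfile_text: str) -> bool:
    """True when the README app_port is one of the Dockerfile's exposed ports."""
    port = readme_app_port(readme_text)
    return port is not None and port in dockerfile_exposed_ports(dockerfile_text)
```

To use it, read `README.md` and `Dockerfile` from the Space repo root and pass their contents to `ports_match`.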

	---

	## 9. Troubleshooting

	| Problem | Solution |
	|---|---|
	| `ModuleNotFoundError: No module named 'openenv'` | Run `pip install -e .` or `pip install "openenv-core>=0.2.3"` (quote the specifier so the shell does not treat `>` as a redirect) |
	| `openenv validate` fails with "no main() found" | Ensure `server/app.py` has a `def main()` function and `[project.scripts]` in pyproject.toml |
	| `openenv validate` fails with "uv.lock not found" | Run `pip install uv && uv lock` |
	| WebSocket connection refused | Server must be running (`uvicorn server.app:app --port 8000`) |
	| HTTP `/step` returns fresh state (not continuing episode) | HTTP endpoints are stateless. Use WebSocket `/ws` for multi-step episodes |
	| Docker build fails on `COPY` | Run `docker build` from the project root (not from `server/`) |
	| Docker healthcheck failing | Ensure `curl` is installed in the image (the Dockerfile does this) |
	| `inference.py` error: "ENV_URL required" | `export ENV_URL=http://localhost:8000` |
	| Azure OpenAI 401 / auth error | Verify `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and that `MODEL_NAME` matches your deployment name |
	| HF Space shows "Building" forever | Check the Logs tab for build errors. A common cause is files referenced by `COPY` that are missing from the repo |
	| HF Space returns 502 | The app hasn't started yet (cold start) or crashed. Check runtime logs |
	| Task score is 0.0 | The resolution keywords didn't match. Check grading criteria in HACKATHON_PLAN.md §5 |
	| `websockets` not installed | `pip install websockets` |

	---

	## Quick Reference

	```bash
	# ── Local development ──
	pip install -e ".[dev,inference]"
	uvicorn server.app:app --port 8000 # start server
	python test_local.py # validate all 3 tasks
	openenv validate # check spec compliance

	# ── Docker ──
	docker build -t sentinel-env -f server/Dockerfile .
	docker run -p 8000:8000 sentinel-env

	# ── Inference (HF router — hackathon default) ──
	export ENV_URL=http://localhost:8000
	export HF_TOKEN=hf_...
	python inference.py

	# ── Inference (OpenAI) ──
	export ENV_URL=http://localhost:8000
	export API_BASE_URL=https://api.openai.com/v1
	export MODEL_NAME=gpt-4o
	export API_KEY=sk-...
	python inference.py

	# ── Inference (Azure OpenAI) ──
	export ENV_URL=http://localhost:8000
	export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
	export AZURE_OPENAI_API_KEY=your-azure-key
	export MODEL_NAME=your-deployment-name
	python inference.py

	# ── Deploy ──
	huggingface-cli login
	openenv push
	```