Spaces:
Sleeping
Sleeping
| # OpenEnv-Sentinel β User Guide | |
| Step-by-step guide for running, validating, and deploying the SRE Incident Triage environment. | |
| --- | |
| ## Table of Contents | |
| 1. [Prerequisites](#1-prerequisites) | |
| 2. [Local Setup](#2-local-setup) | |
| 3. [Running the Server Locally](#3-running-the-server-locally) | |
| 4. [Manual Validation β Local Server](#4-manual-validation--local-server) | |
| 5. [Docker Build & Validation](#5-docker-build--validation) | |
| 6. [Running Inference (LLM Agent)](#6-running-inference-llm-agent) | |
| 7. [OpenEnv Validate](#7-openenv-validate) | |
| 8. [Deploy to Hugging Face Spaces](#8-deploy-to-hugging-face-spaces) | |
| 9. [Troubleshooting](#9-troubleshooting) | |
| --- | |
| ## 1. Prerequisites | |
| | Tool | Version | Purpose | | |
| |---|---|---| | |
| | Python | β₯ 3.10 | Runtime | | |
| | pip / pipenv | Latest | Dependency management | | |
| | Docker | Latest | Container build & test | | |
| | Git | Latest | Version control, HF push | | |
| | huggingface-cli | Latest | HF Spaces deployment | | |
| | openenv-core CLI | β₯ 0.2.3 | `openenv validate` / `openenv push` | | |
| Install the OpenEnv CLI and Hugging Face CLI: | |
| ```bash | |
| pip install openenv-core huggingface-hub[cli] | |
| ``` | |
| --- | |
| ## 2. Local Setup | |
| ### Option A β pip (quick) | |
| ```bash | |
| cd openenv-sentinel | |
| pip install -e ".[dev,inference]" | |
| ``` | |
| ### Option B β pipenv (isolated) | |
| ```bash | |
| cd openenv-sentinel | |
| pipenv install --python 3.12 | |
| pipenv install -e ".[dev]" | |
| pipenv install openai httpx websockets | |
| pipenv shell | |
| ``` | |
| All subsequent commands assume you are inside the virtual environment. | |
| --- | |
| ## 3. Running the Server Locally | |
| Start the FastAPI server on port 8000: | |
| ```bash | |
| uvicorn server.app:app --host 0.0.0.0 --port 8000 | |
| ``` | |
| You should see: | |
| ``` | |
| INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit) | |
| ``` | |
| Verify with: | |
| ```bash | |
| curl http://localhost:8000/health | |
| # β {"status":"ok"} | |
| curl http://localhost:8000/schema | |
| # β JSON with action, observation, state schemas | |
| ``` | |
| --- | |
| ## 4. Manual Validation β Local Server | |
| With the server running (from step 3), validate the environment in a **second terminal**. | |
| ### 4.1 Automated test script | |
| The quickest way to validate all 3 tasks end-to-end: | |
| ```bash | |
| pip install websockets httpx # if not already installed | |
| python test_local.py | |
| ``` | |
| Expected output: | |
| ``` | |
| Health: {'status': 'ok'} | |
| Schema: action fields=['tool_name', 'parameters'] | |
| ================================================== | |
| TASK 1 | |
| ================================================== | |
| Reset OK: CRITICAL: payment-api returning HTTP 500 errors... | |
| Step 1 (status payment-api): reward=0.11 | |
| Step 2 (logs payment-api): reward=0.11 | |
| Resolution: score=0.75, done=True | |
| State: final_score=1.0, root_cause_correct=True, recommendation_correct=True | |
| ... (Tasks 2 & 3 similar) ... | |
| β ALL TESTS PASSED | |
| ``` | |
| ### 4.2 Manual cURL validation (HTTP endpoints) | |
| > **Note:** HTTP endpoints are stateless β each request creates a fresh environment | |
| > instance. Use these for single-shot checks only. For multi-step episodes, use | |
| > the WebSocket endpoint (section 4.3). | |
| **Health check:** | |
| ```bash | |
| curl http://localhost:8000/health | |
| ``` | |
| **Schema check:** | |
| ```bash | |
| curl http://localhost:8000/schema | python -m json.tool | |
| ``` | |
| **Reset (single-shot):** | |
| ```bash | |
| curl -X POST http://localhost:8000/reset \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"task_id": 1}' | |
| ``` | |
| ### 4.3 Manual WebSocket validation (stateful sessions) | |
| Multi-step episodes require WebSocket because the server maintains session state | |
| across messages. Install `websocat` or use Python: | |
| **Using Python interactively:** | |
| ```python | |
| import asyncio, json, websockets | |
| async def manual_test(): | |
| async with websockets.connect("ws://localhost:8000/ws") as ws: | |
| # 1. Reset to Task 1 | |
| await ws.send(json.dumps({"type": "reset", "data": {"task_id": 1}})) | |
| resp = json.loads(await ws.recv()) | |
| print("Reset:", json.dumps(resp["data"]["observation"]["incident_summary"])) | |
| # 2. Call a diagnostic tool | |
| await ws.send(json.dumps({ | |
| "type": "step", | |
| "data": { | |
| "tool_name": "get_service_status", | |
| "parameters": {"service": "payment-api"} | |
| } | |
| })) | |
| resp = json.loads(await ws.recv()) | |
| print("Step 1:", resp["data"]["observation"]["tool_output"][:200]) | |
| # 3. Submit resolution | |
| await ws.send(json.dumps({ | |
| "type": "step", | |
| "data": { | |
| "tool_name": "submit_resolution", | |
| "parameters": { | |
| "root_cause": "Missing DB_CONNECTION_STRING after v2.3.1 deploy", | |
| "affected_service": "payment-api", | |
| "recommendation": "Rollback to v2.3.0 or set the env var" | |
| } | |
| } | |
| })) | |
| resp = json.loads(await ws.recv()) | |
| print("Done:", resp["data"]["done"], "Score:", resp["data"]["reward"]) | |
| # 4. Get final state | |
| await ws.send(json.dumps({"type": "state"})) | |
| resp = json.loads(await ws.recv()) | |
| print("Final score:", resp["data"]["final_score"]) | |
| asyncio.run(manual_test()) | |
| ``` | |
| **Using websocat (CLI tool):** | |
| ```bash | |
| brew install websocat # macOS | |
| websocat ws://localhost:8000/ws | |
| ``` | |
| Then type JSON messages line by line: | |
| ```json | |
| {"type": "reset", "data": {"task_id": 1}} | |
| {"type": "step", "data": {"tool_name": "get_service_status", "parameters": {"service": "payment-api"}}} | |
| {"type": "step", "data": {"tool_name": "submit_resolution", "parameters": {"root_cause": "Missing DB_CONNECTION_STRING", "affected_service": "payment-api", "recommendation": "Rollback to v2.3.0"}}} | |
| {"type": "state"} | |
| ``` | |
| ### 4.4 What to check | |
| | Check | Expected | | |
| |---|---| | |
| | `/health` returns 200 | `{"status": "ok"}` | | |
| | `/schema` returns action/observation/state schemas | Three top-level keys with JSON Schema properties | | |
| | Reset with `task_id` 1, 2, 3 | Returns `incident_summary`, `available_tools` (7 tools), `done: false` | | |
| | Diagnostic tool steps | Returns `tool_output` (non-empty), per-step `reward` | | |
| | `submit_resolution` | Sets `done: true`, returns graded `reward` | | |
| | State after resolution | `final_score` between 0.0β1.0, `root_cause_correct` bool | | |
| | All 3 tasks produce scores > 0.0 with good resolutions | Task 1 β 1.0, Task 2 β 1.0, Task 3 β 1.0 (with ideal answers) | | |
| --- | |
| ## 5. Docker Build & Validation | |
| ### 5.1 Build the image | |
| ```bash | |
| docker build -t sentinel-env:latest -f server/Dockerfile . | |
| ``` | |
| ### 5.2 Run the container | |
| ```bash | |
| docker run -p 8000:8000 sentinel-env:latest | |
| ``` | |
| The server starts on port 8000 inside the container, mapped to your host. | |
| ### 5.3 Validate against the container | |
| Once the container is running, all the same validation steps from section 4 work: | |
| ```bash | |
| # Health check | |
| curl http://localhost:8000/health | |
| # Run the automated test suite | |
| python test_local.py | |
| # Or run inference against the containerised server | |
| ENV_URL=http://localhost:8000 python inference.py | |
| ``` | |
| ### 5.4 Docker β useful commands | |
| ```bash | |
| # Build with no cache (clean rebuild) | |
| docker build --no-cache -t sentinel-env:latest -f server/Dockerfile . | |
| # Run in background | |
| docker run -d --name sentinel -p 8000:8000 sentinel-env:latest | |
| # View logs | |
| docker logs -f sentinel | |
| # Stop and remove | |
| docker stop sentinel && docker rm sentinel | |
| # Check image size (should be < 500MB) | |
| docker images sentinel-env | |
| ``` | |
| --- | |
| ## 6. Running Inference (LLM Agent) | |
| The inference script drives an LLM through all 3 tasks via WebSocket. | |
| ### 6.1 Set environment variables | |
| The inference script supports **HF Inference** (default), **OpenAI**, and **Azure OpenAI** endpoints. | |
| > **Important:** `ENV_URL` is the Sentinel environment server. `API_BASE_URL` is | |
| > the LLM API endpoint (matching the official OpenEnv inference pattern). | |
| **Option A β HF Inference API (default, for hackathon submission):** | |
| ```bash | |
| export ENV_URL=http://localhost:8000 # env server | |
| export API_BASE_URL=https://router.huggingface.co/v1 # default, can omit | |
| export MODEL_NAME=openai/gpt-oss-120b:novita # default, can omit | |
| export HF_TOKEN=hf_... # or API_KEY | |
| ``` | |
| **Option B β OpenAI:** | |
| ```bash | |
| export ENV_URL=http://localhost:8000 | |
| export API_BASE_URL=https://api.openai.com/v1 | |
| export MODEL_NAME=gpt-4o | |
| export API_KEY=sk-... | |
| ``` | |
| **Option C β Azure OpenAI (for local/enterprise testing):** | |
| ```bash | |
| export ENV_URL=http://localhost:8000 | |
| export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com | |
| export AZURE_OPENAI_API_KEY=your-azure-key | |
| export MODEL_NAME=your-deployment-name # Azure deployment name | |
| export AZURE_OPENAI_API_VERSION=2024-12-01-preview # optional, this is the default | |
| ``` | |
| > When `AZURE_OPENAI_ENDPOINT` is set, the script uses `AzureOpenAI` client. | |
| > Otherwise it uses `OpenAI(base_url=API_BASE_URL, api_key=...)` β which | |
| > covers both HF router and direct OpenAI. | |
| ### 6.2 Install inference dependencies | |
| ```bash | |
| pip install openai websockets | |
| ``` | |
| ### 6.3 Run | |
| ```bash | |
| python inference.py | |
| ``` | |
| Expected output: | |
| ``` | |
| ================================================== | |
| Running Task 1... | |
| ================================================== | |
| Task 1: 0.85 | |
| ================================================== | |
| Running Task 2... | |
| ================================================== | |
| Task 2: 0.65 | |
| ================================================== | |
| Running Task 3... | |
| ================================================== | |
| Task 3: 0.40 | |
| ================================================== | |
| Task 1: 0.85 | |
| Task 2: 0.65 | |
| Task 3: 0.40 | |
| Average: 0.63 | |
| ================================================== | |
| ``` | |
| ### 6.4 Inference against a remote HF Space | |
| ```bash | |
| # HF model via HF router (hackathon default) | |
| export ENV_URL=https://your-username-sentinel-env.hf.space | |
| export HF_TOKEN=hf_... | |
| python inference.py | |
| # OpenAI model | |
| export ENV_URL=https://your-username-sentinel-env.hf.space | |
| export API_BASE_URL=https://api.openai.com/v1 | |
| export MODEL_NAME=gpt-4o | |
| export API_KEY=sk-... | |
| python inference.py | |
| # Azure OpenAI | |
| export ENV_URL=https://your-username-sentinel-env.hf.space | |
| export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com | |
| export AZURE_OPENAI_API_KEY=your-azure-key | |
| export MODEL_NAME=your-deployment-name | |
| python inference.py | |
| ``` | |
| --- | |
| ## 7. OpenEnv Validate | |
| Run the official OpenEnv validation to confirm spec compliance: | |
| ```bash | |
| openenv validate | |
| ``` | |
| This checks: | |
| - `openenv.yaml` manifest is valid | |
| - The app entry point (`server.app:app`) is importable | |
| - A `main()` function exists in the script entry point | |
| - `uv.lock` is present and up to date | |
| If `uv.lock` is missing or stale: | |
| ```bash | |
| pip install uv | |
| uv lock | |
| openenv validate | |
| ``` | |
| --- | |
| ## 8. Deploy to Hugging Face Spaces | |
| ### 8.1 Login to Hugging Face | |
| ```bash | |
| huggingface-cli login | |
| # Paste your HF token when prompted (needs write access) | |
| ``` | |
| ### 8.2 Option A β `openenv push` (recommended) | |
| ```bash | |
| openenv push | |
| ``` | |
| This reads `openenv.yaml` and pushes the environment as a Docker Space tagged | |
| with `openenv`. | |
| ### 8.3 Option B β Manual HF Spaces deployment | |
| **Step 1: Create the Space** | |
| Go to https://huggingface.co/new-space and create a new Space: | |
| - **Space name:** `sentinel-env` (or any name) | |
| - **SDK:** Docker | |
| - **Hardware:** CPU basic (2 vCPU, 16GB RAM β free tier) | |
| - **Visibility:** Public | |
| **Step 2: Clone the Space repo** | |
| ```bash | |
| git clone https://huggingface.co/spaces/YOUR_USERNAME/sentinel-env hf-space | |
| cd hf-space | |
| ``` | |
| **Step 3: Copy project files** | |
| ```bash | |
| # Copy all source files | |
| cp -r /path/to/openenv-sentinel/{models.py,__init__.py,client.py,inference.py} . | |
| cp -r /path/to/openenv-sentinel/{server,scenarios,tools,grading} . | |
| cp /path/to/openenv-sentinel/openenv.yaml . | |
| cp /path/to/openenv-sentinel/pyproject.toml . | |
| cp /path/to/openenv-sentinel/README.md . | |
| # The Dockerfile must be at the repo root for HF Spaces | |
| cp /path/to/openenv-sentinel/server/Dockerfile . | |
| ``` | |
| > **Important:** HF Spaces expects `Dockerfile` at the repository root. The | |
| > COPY paths inside the Dockerfile already reference files relative to the | |
| > build context (repo root), so no changes are needed. | |
| **Step 4: Push to HF** | |
| ```bash | |
| git add . | |
| git commit -m "Deploy OpenEnv-Sentinel" | |
| git push | |
| ``` | |
| **Step 5: Verify deployment** | |
| The Space builds automatically. Once running: | |
| ```bash | |
| curl https://YOUR_USERNAME-sentinel-env.hf.space/health | |
| # β {"status": "ok"} | |
| ``` | |
| ### 8.4 Verify the deployed Space | |
| ```bash | |
| # Health | |
| curl https://YOUR_USERNAME-sentinel-env.hf.space/health | |
| # Schema | |
| curl https://YOUR_USERNAME-sentinel-env.hf.space/schema | |
| # Run test_local.py against the Space (edit BASE_HTTP/BASE_WS in the file) | |
| # Or run inference: | |
| ENV_URL=https://YOUR_USERNAME-sentinel-env.hf.space \ | |
| HF_TOKEN=hf_... \ | |
| python inference.py | |
| ``` | |
| ### 8.5 HF Spaces tips | |
| - **Cold starts:** Free-tier Spaces sleep after inactivity. First request takes ~30s. | |
| - **Logs:** View build & runtime logs in the Space's "Logs" tab on HF. | |
| - **Environment variables:** Set secrets (like API keys) in Space Settings β Repository secrets. | |
| - **Tags:** Ensure the README frontmatter includes `tags: [openenv]` for hackathon discovery. | |
| - **Port:** The `app_port: 8000` in README frontmatter must match the `EXPOSE` in the Dockerfile. | |
| --- | |
| ## 9. Troubleshooting | |
| | Problem | Solution | | |
| |---|---| | |
| | `ModuleNotFoundError: No module named 'openenv'` | Run `pip install -e .` or `pip install openenv-core>=0.2.3` | | |
| | `openenv validate` fails with "no main() found" | Ensure `server/app.py` has a `def main()` function and `[project.scripts]` in pyproject.toml | | |
| | `openenv validate` fails with "uv.lock not found" | Run `pip install uv && uv lock` | | |
| | WebSocket connection refused | Server must be running (`uvicorn server.app:app --port 8000`) | | |
| | HTTP `/step` returns fresh state (not continuing episode) | HTTP endpoints are stateless. Use WebSocket `/ws` for multi-step episodes | | |
| | Docker build fails on `COPY` | Run `docker build` from the project root (not from `server/`) | | |
| | Docker healthcheck failing | Ensure `curl` is installed in the image (the Dockerfile does this) | | |
| | `inference.py` error: "ENV_URL required" | `export ENV_URL=http://localhost:8000` | | |
| | Azure OpenAI 401 / auth error | Verify `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and that `MODEL_NAME` matches your deployment name | | |
| | HF Space shows "Building" forever | Check the Logs tab for build errors. Common: missing files in COPY | | |
| | HF Space returns 502 | The app hasn't started yet (cold start) or crashed. Check runtime logs | | |
| | Task score is 0.0 | The resolution keywords didn't match. Check grading criteria in HACKATHON_PLAN.md Β§5 | | |
| | `websockets` not installed | `pip install websockets` | | |
| --- | |
| ## Quick Reference | |
| ```bash | |
| # ββ Local development ββ | |
| pip install -e ".[dev,inference]" | |
| uvicorn server.app:app --port 8000 # start server | |
| python test_local.py # validate all 3 tasks | |
| openenv validate # check spec compliance | |
| # ββ Docker ββ | |
| docker build -t sentinel-env -f server/Dockerfile . | |
| docker run -p 8000:8000 sentinel-env | |
| # ββ Inference (HF router β hackathon default) ββ | |
| export ENV_URL=http://localhost:8000 | |
| export HF_TOKEN=hf_... | |
| python inference.py | |
| # ββ Inference (OpenAI) ββ | |
| export ENV_URL=http://localhost:8000 | |
| export API_BASE_URL=https://api.openai.com/v1 | |
| export MODEL_NAME=gpt-4o | |
| export API_KEY=sk-... | |
| python inference.py | |
| # ββ Inference (Azure OpenAI) ββ | |
| export ENV_URL=http://localhost:8000 | |
| export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com | |
| export AZURE_OPENAI_API_KEY=your-azure-key | |
| export MODEL_NAME=your-deployment-name | |
| python inference.py | |
| # ββ Deploy ββ | |
| huggingface-cli login | |
| openenv push | |
| ``` | |