OpenEnv-Sentinel – User Guide
Step-by-step guide for running, validating, and deploying the SRE Incident Triage environment.
Table of Contents
- Prerequisites
- Local Setup
- Running the Server Locally
- Manual Validation – Local Server
- Docker Build & Validation
- Running Inference (LLM Agent)
- OpenEnv Validate
- Deploy to Hugging Face Spaces
- Troubleshooting
1. Prerequisites
| Tool | Version | Purpose |
|---|---|---|
| Python | ≥ 3.10 | Runtime |
| pip / pipenv | Latest | Dependency management |
| Docker | Latest | Container build & test |
| Git | Latest | Version control, HF push |
| huggingface-cli | Latest | HF Spaces deployment |
| openenv-core CLI | ≥ 0.2.3 | `openenv validate` / `openenv push` |
Install the OpenEnv CLI and Hugging Face CLI:
pip install openenv-core "huggingface-hub[cli]"
2. Local Setup
Option A – pip (quick)
cd openenv-sentinel
pip install -e ".[dev,inference]"
Option B – pipenv (isolated)
cd openenv-sentinel
pipenv install --python 3.12
pipenv install -e ".[dev]"
pipenv install openai httpx websockets
pipenv shell
All subsequent commands assume you are inside the virtual environment.
3. Running the Server Locally
Start the FastAPI server on port 8000:
uvicorn server.app:app --host 0.0.0.0 --port 8000
You should see:
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Verify with:
curl http://localhost:8000/health
# → {"status":"ok"}
curl http://localhost:8000/schema
# → JSON with action, observation, state schemas
4. Manual Validation – Local Server
With the server running (from step 3), validate the environment in a second terminal.
4.1 Automated test script
The quickest way to validate all 3 tasks end-to-end:
pip install websockets httpx # if not already installed
python test_local.py
Expected output:
Health: {'status': 'ok'}
Schema: action fields=['tool_name', 'parameters']
==================================================
TASK 1
==================================================
Reset OK: CRITICAL: payment-api returning HTTP 500 errors...
Step 1 (status payment-api): reward=0.11
Step 2 (logs payment-api): reward=0.11
Resolution: score=0.75, done=True
State: final_score=1.0, root_cause_correct=True, recommendation_correct=True
... (Tasks 2 & 3 similar) ...
✅ ALL TESTS PASSED
4.2 Manual cURL validation (HTTP endpoints)
Note: HTTP endpoints are stateless – each request creates a fresh environment instance. Use these for single-shot checks only. For multi-step episodes, use the WebSocket endpoint (section 4.3).
Health check:
curl http://localhost:8000/health
Schema check:
curl http://localhost:8000/schema | python -m json.tool
Reset (single-shot):
curl -X POST http://localhost:8000/reset \
-H "Content-Type: application/json" \
-d '{"task_id": 1}'
4.3 Manual WebSocket validation (stateful sessions)
Multi-step episodes require WebSocket because the server maintains session state
across messages. Install websocat or use Python:
Using Python interactively:
import asyncio, json, websockets

async def manual_test():
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        # 1. Reset to Task 1
        await ws.send(json.dumps({"type": "reset", "data": {"task_id": 1}}))
        resp = json.loads(await ws.recv())
        print("Reset:", json.dumps(resp["data"]["observation"]["incident_summary"]))

        # 2. Call a diagnostic tool
        await ws.send(json.dumps({
            "type": "step",
            "data": {
                "tool_name": "get_service_status",
                "parameters": {"service": "payment-api"}
            }
        }))
        resp = json.loads(await ws.recv())
        print("Step 1:", resp["data"]["observation"]["tool_output"][:200])

        # 3. Submit resolution
        await ws.send(json.dumps({
            "type": "step",
            "data": {
                "tool_name": "submit_resolution",
                "parameters": {
                    "root_cause": "Missing DB_CONNECTION_STRING after v2.3.1 deploy",
                    "affected_service": "payment-api",
                    "recommendation": "Rollback to v2.3.0 or set the env var"
                }
            }
        }))
        resp = json.loads(await ws.recv())
        print("Done:", resp["data"]["done"], "Score:", resp["data"]["reward"])

        # 4. Get final state
        await ws.send(json.dumps({"type": "state"}))
        resp = json.loads(await ws.recv())
        print("Final score:", resp["data"]["final_score"])

asyncio.run(manual_test())
Using websocat (CLI tool):
brew install websocat # macOS
websocat ws://localhost:8000/ws
Then type JSON messages line by line:
{"type": "reset", "data": {"task_id": 1}}
{"type": "step", "data": {"tool_name": "get_service_status", "parameters": {"service": "payment-api"}}}
{"type": "step", "data": {"tool_name": "submit_resolution", "parameters": {"root_cause": "Missing DB_CONNECTION_STRING", "affected_service": "payment-api", "recommendation": "Rollback to v2.3.0"}}}
{"type": "state"}
4.4 What to check
| Check | Expected |
|---|---|
| `/health` returns 200 | `{"status": "ok"}` |
| `/schema` returns action/observation/state schemas | Three top-level keys with JSON Schema properties |
| Reset with `task_id` 1, 2, 3 | Returns `incident_summary`, `available_tools` (7 tools), `done: false` |
| Diagnostic tool steps | Returns `tool_output` (non-empty), per-step reward |
| `submit_resolution` | Sets `done: true`, returns graded reward |
| State after resolution | `final_score` between 0.0–1.0, `root_cause_correct` bool |
| All 3 tasks produce scores > 0.0 with good resolutions | Task 1 → 1.0, Task 2 → 1.0, Task 3 → 1.0 (with ideal answers) |
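These checks can also be scripted. Below is a minimal sketch of validating a single reset response, assuming the field names listed above (`done`, `incident_summary`, `available_tools`); `check_reset_observation` is a hypothetical helper, and the repo's `test_local.py` remains the authoritative test:

```python
def check_reset_observation(obs: dict) -> list[str]:
    """Return a list of problems with a /reset observation, per the
    expectations in the table above. Field names follow this guide;
    adjust if the actual schema differs."""
    problems = []
    if obs.get("done") is not False:
        problems.append("done should be false after reset")
    if not obs.get("incident_summary"):
        problems.append("incident_summary missing or empty")
    if len(obs.get("available_tools", [])) != 7:
        problems.append("expected 7 available_tools")
    return problems

# A well-formed reset observation produces no problems
ok = {"done": False, "incident_summary": "CRITICAL: ...", "available_tools": ["t"] * 7}
print(check_reset_observation(ok))  # → []
```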
5. Docker Build & Validation
5.1 Build the image
docker build -t sentinel-env:latest -f server/Dockerfile .
5.2 Run the container
docker run -p 8000:8000 sentinel-env:latest
The server starts on port 8000 inside the container, mapped to your host.
5.3 Validate against the container
Once the container is running, all the same validation steps from section 4 work:
# Health check
curl http://localhost:8000/health
# Run the automated test suite
python test_local.py
# Or run inference against the containerised server
ENV_URL=http://localhost:8000 python inference.py
5.4 Docker – useful commands
# Build with no cache (clean rebuild)
docker build --no-cache -t sentinel-env:latest -f server/Dockerfile .
# Run in background
docker run -d --name sentinel -p 8000:8000 sentinel-env:latest
# View logs
docker logs -f sentinel
# Stop and remove
docker stop sentinel && docker rm sentinel
# Check image size (should be < 500MB)
docker images sentinel-env
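If you prefer Compose, the same build and port mapping can be captured in a `docker-compose.yml`. This file is not part of the repo – a hypothetical convenience wrapper around the commands above:

```yaml
# docker-compose.yml – hypothetical, mirrors `docker build` / `docker run` above
services:
  sentinel:
    build:
      context: .                      # build from the project root
      dockerfile: server/Dockerfile   # same Dockerfile as section 5.1
    ports:
      - "8000:8000"                   # host:container, as in `docker run -p`
```

Run it with `docker compose up --build`.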
6. Running Inference (LLM Agent)
The inference script drives an LLM through all 3 tasks via WebSocket.
6.1 Set environment variables
The inference script supports HF Inference (default), OpenAI, and Azure OpenAI endpoints.
Important: `ENV_URL` is the Sentinel environment server. `API_BASE_URL` is the LLM API endpoint (matching the official OpenEnv inference pattern).
Option A – HF Inference API (default, for hackathon submission):
export ENV_URL=http://localhost:8000 # env server
export API_BASE_URL=https://router.huggingface.co/v1 # default, can omit
export MODEL_NAME=openai/gpt-oss-120b:novita # default, can omit
export HF_TOKEN=hf_... # or API_KEY
Option B – OpenAI:
export ENV_URL=http://localhost:8000
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o
export API_KEY=sk-...
Option C – Azure OpenAI (for local/enterprise testing):
export ENV_URL=http://localhost:8000
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=your-azure-key
export MODEL_NAME=your-deployment-name # Azure deployment name
export AZURE_OPENAI_API_VERSION=2024-12-01-preview # optional, this is the default
When `AZURE_OPENAI_ENDPOINT` is set, the script uses the `AzureOpenAI` client. Otherwise it uses `OpenAI(base_url=API_BASE_URL, api_key=...)`, which covers both the HF router and direct OpenAI.
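That selection rule can be sketched in a few lines. This is an illustration only, not the actual `inference.py` code; `make_client_config` is a hypothetical helper:

```python
def make_client_config(env: dict) -> tuple[str, dict]:
    """Pick an LLM client per the rule above: Azure when
    AZURE_OPENAI_ENDPOINT is set, OpenAI-compatible otherwise.
    Hypothetical helper - the real inference.py may differ."""
    if env.get("AZURE_OPENAI_ENDPOINT"):
        return "azure", {
            "azure_endpoint": env["AZURE_OPENAI_ENDPOINT"],
            "api_key": env.get("AZURE_OPENAI_API_KEY", ""),
            "api_version": env.get("AZURE_OPENAI_API_VERSION", "2024-12-01-preview"),
        }
    # Covers both the HF router default and direct OpenAI
    return "openai", {
        "base_url": env.get("API_BASE_URL", "https://router.huggingface.co/v1"),
        "api_key": env.get("API_KEY") or env.get("HF_TOKEN", ""),
    }

kind, _ = make_client_config({"AZURE_OPENAI_ENDPOINT": "https://x.openai.azure.com"})
print(kind)  # → azure
```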
6.2 Install inference dependencies
pip install openai websockets
6.3 Run
python inference.py
Expected output:
==================================================
Running Task 1...
==================================================
Task 1: 0.85
==================================================
Running Task 2...
==================================================
Task 2: 0.65
==================================================
Running Task 3...
==================================================
Task 3: 0.40
==================================================
Task 1: 0.85
Task 2: 0.65
Task 3: 0.40
Average: 0.63
==================================================
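The final `Average` line is the plain mean of the three task scores. For the sample run above:

```python
# Mean of the per-task scores from the sample run
scores = {"Task 1": 0.85, "Task 2": 0.65, "Task 3": 0.40}
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.2f}")  # → Average: 0.63
```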
6.4 Inference against a remote HF Space
# HF model via HF router (hackathon default)
export ENV_URL=https://your-username-sentinel-env.hf.space
export HF_TOKEN=hf_...
python inference.py
# OpenAI model
export ENV_URL=https://your-username-sentinel-env.hf.space
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o
export API_KEY=sk-...
python inference.py
# Azure OpenAI
export ENV_URL=https://your-username-sentinel-env.hf.space
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=your-azure-key
export MODEL_NAME=your-deployment-name
python inference.py
7. OpenEnv Validate
Run the official OpenEnv validation to confirm spec compliance:
openenv validate
This checks:
- The `openenv.yaml` manifest is valid
- The app entry point (`server.app:app`) is importable
- A `main()` function exists in the script entry point
- `uv.lock` is present and up to date
If `uv.lock` is missing or stale:
pip install uv
uv lock
openenv validate
8. Deploy to Hugging Face Spaces
8.1 Login to Hugging Face
huggingface-cli login
# Paste your HF token when prompted (needs write access)
8.2 Option A – openenv push (recommended)
openenv push
This reads `openenv.yaml` and pushes the environment as a Docker Space tagged with `openenv`.
8.3 Option B – Manual HF Spaces deployment
Step 1: Create the Space
Go to https://huggingface.co/new-space and create a new Space:
- Space name: `sentinel-env` (or any name)
- SDK: Docker
- Hardware: CPU basic (2 vCPU, 16 GB RAM – free tier)
- Visibility: Public
Step 2: Clone the Space repo
git clone https://huggingface.co/spaces/YOUR_USERNAME/sentinel-env hf-space
cd hf-space
Step 3: Copy project files
# Copy all source files
cp -r /path/to/openenv-sentinel/{models.py,__init__.py,client.py,inference.py} .
cp -r /path/to/openenv-sentinel/{server,scenarios,tools,grading} .
cp /path/to/openenv-sentinel/openenv.yaml .
cp /path/to/openenv-sentinel/pyproject.toml .
cp /path/to/openenv-sentinel/README.md .
# The Dockerfile must be at the repo root for HF Spaces
cp /path/to/openenv-sentinel/server/Dockerfile .
Important: HF Spaces expects the `Dockerfile` at the repository root. The COPY paths inside the Dockerfile already reference files relative to the build context (repo root), so no changes are needed.
Step 4: Push to HF
git add .
git commit -m "Deploy OpenEnv-Sentinel"
git push
Step 5: Verify deployment
The Space builds automatically. Once running:
curl https://YOUR_USERNAME-sentinel-env.hf.space/health
# → {"status": "ok"}
8.4 Verify the deployed Space
# Health
curl https://YOUR_USERNAME-sentinel-env.hf.space/health
# Schema
curl https://YOUR_USERNAME-sentinel-env.hf.space/schema
# Run test_local.py against the Space (edit BASE_HTTP/BASE_WS in the file)
# Or run inference:
ENV_URL=https://YOUR_USERNAME-sentinel-env.hf.space \
HF_TOKEN=hf_... \
python inference.py
8.5 HF Spaces tips
- Cold starts: Free-tier Spaces sleep after inactivity. First request takes ~30s.
- Logs: View build & runtime logs in the Space's "Logs" tab on HF.
- Environment variables: Set secrets (like API keys) in Space Settings β Repository secrets.
- Tags: Ensure the README frontmatter includes `tags: [openenv]` for hackathon discovery.
- Port: The `app_port: 8000` in README frontmatter must match the `EXPOSE` in the Dockerfile.
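Putting the last two tips together, the README frontmatter might look like this (a sketch – values other than `sdk`, `app_port`, and `tags` are placeholders):

```yaml
---
title: OpenEnv-Sentinel   # placeholder display name
sdk: docker               # HF Spaces Docker SDK
app_port: 8000            # must match EXPOSE in the Dockerfile
tags:
  - openenv               # required for hackathon discovery
---
```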
9. Troubleshooting
| Problem | Solution |
|---|---|
| `ModuleNotFoundError: No module named 'openenv'` | Run `pip install -e .` or `pip install "openenv-core>=0.2.3"` |
| `openenv validate` fails with "no main() found" | Ensure `server/app.py` has a `def main()` function and `[project.scripts]` in `pyproject.toml` |
| `openenv validate` fails with "uv.lock not found" | Run `pip install uv && uv lock` |
| WebSocket connection refused | Server must be running (`uvicorn server.app:app --port 8000`) |
| HTTP `/step` returns fresh state (not continuing episode) | HTTP endpoints are stateless. Use WebSocket `/ws` for multi-step episodes |
| Docker build fails on `COPY` | Run `docker build` from the project root (not from `server/`) |
| Docker healthcheck failing | Ensure `curl` is installed in the image (the Dockerfile does this) |
| `inference.py` error: "ENV_URL required" | `export ENV_URL=http://localhost:8000` |
| Azure OpenAI 401 / auth error | Verify `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and that `MODEL_NAME` matches your deployment name |
| HF Space shows "Building" forever | Check the Logs tab for build errors. Common: missing files in `COPY` |
| HF Space returns 502 | The app hasn't started yet (cold start) or crashed. Check runtime logs |
| Task score is 0.0 | The resolution keywords didn't match. Check grading criteria in HACKATHON_PLAN.md §5 |
| `websockets` not installed | `pip install websockets` |
Quick Reference
# ── Local development ──
pip install -e ".[dev,inference]"
uvicorn server.app:app --port 8000 # start server
python test_local.py # validate all 3 tasks
openenv validate # check spec compliance
# ── Docker ──
docker build -t sentinel-env -f server/Dockerfile .
docker run -p 8000:8000 sentinel-env
# ── Inference (HF router – hackathon default) ──
export ENV_URL=http://localhost:8000
export HF_TOKEN=hf_...
python inference.py
# ── Inference (OpenAI) ──
export ENV_URL=http://localhost:8000
export API_BASE_URL=https://api.openai.com/v1
export MODEL_NAME=gpt-4o
export API_KEY=sk-...
python inference.py
# ── Inference (Azure OpenAI) ──
export ENV_URL=http://localhost:8000
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=your-azure-key
export MODEL_NAME=your-deployment-name
python inference.py
# ── Deploy ──
huggingface-cli login
openenv push