---
title: Meta-SRE
emoji: 🔧
colorFrom: blue
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: OpenEnv benchmark for SRE incident debugging
---
# Meta-SRE OpenEnv Benchmark
A live simulation environment for training and evaluating LLM agents as Senior Site Reliability Engineers at Meta.
## Connect with openenv_client
```python
import openenv_client

env = openenv_client.connect("huggingface.co/spaces/Anvit25/Meta-SRE")
obs = env.reset(task_id=1)
done = False
while not done:
    # your_agent is a placeholder for your own agent object
    action = your_agent.decide(obs)  # {"tool": ..., "params": {...}}
    obs, reward, done, info = env.step(action)
# Grade the finished episode
score = env.grade()
print(f"Score: {score['normalized_score']:.3f}")
```
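
The loop above leaves `your_agent` undefined. As a minimal sketch for smoke-testing the loop, the placeholder below cycles through a fixed list of tool names and ignores the observation. The class name `RoundRobinAgent` and the tool names are hypothetical, and sending empty `params` is an assumption that real tools may not accept.

```python
from itertools import cycle
from typing import Any, Dict, Iterable


class RoundRobinAgent:
    """Placeholder agent that cycles through a fixed list of tool names.

    It ignores the observation, so it is only useful for checking that
    the reset/step/grade loop runs end to end.
    """

    def __init__(self, tools: Iterable[str]) -> None:
        tools = list(tools)
        if not tools:
            raise ValueError("at least one tool name is required")
        self._tools = cycle(tools)

    def decide(self, obs: Any) -> Dict[str, Any]:
        # Action shape taken from this README: {"tool": ..., "params": {...}}
        return {"tool": next(self._tools), "params": {}}


# Hypothetical tool names, for illustration only.
your_agent = RoundRobinAgent(["tool_a", "tool_b"])
```

A real agent would replace `decide` with logic that reads the observation and picks a tool reported by the environment.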
## Direct API
```python
import requests

BASE = "https://anvit25-meta-sre.hf.space"

# Start an episode for task 1
obs = requests.post(f"{BASE}/reset", json={"task_id": 1}).json()
done = False
while not done:
    # your_agent is a placeholder for your own agent object
    action = your_agent.decide(obs)
    result = requests.post(f"{BASE}/step", json=action).json()
    obs = result["observation"]
    done = result["done"]
# Fetch the score for the finished episode
score = requests.get(f"{BASE}/grade").json()["normalized_score"]
print(f"Score: {score:.3f}")
```
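
The loop above runs until the server reports `done`, so an agent that never finishes would loop forever. A possible variant, sketched here, caps the number of steps and keeps the HTTP transport separate so the loop can be exercised without a network. The names `run_episode`, `post`, `get`, and `max_steps` are assumptions made for this example; only the `/reset`, `/step`, and `/grade` paths and the `observation`, `done`, and `normalized_score` keys come from this README.

```python
from typing import Any, Callable, Dict

Json = Dict[str, Any]


def run_episode(
    post: Callable[[str, Json], Json],
    get: Callable[[str], Json],
    decide: Callable[[Any], Json],
    task_id: int,
    max_steps: int = 50,
) -> float:
    """Run one episode and return its normalized score.

    `post(path, payload)` and `get(path)` are caller-supplied transport
    functions that return decoded JSON.
    """
    obs = post("/reset", {"task_id": task_id})
    for _ in range(max_steps):
        result = post("/step", decide(obs))
        obs = result["observation"]
        if result["done"]:
            break
    # Grade whatever state the episode reached, finished or not.
    return float(get("/grade")["normalized_score"])
```

With `requests`, `post` could be `lambda path, payload: requests.post(BASE + path, json=payload, timeout=30).json()` and `get` the matching `requests.get` call. How the server grades an episode that was cut off early is not documented here, so treat capped scores with caution.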
## Tasks
| ID | Difficulty | Description |
|----|-----------|-------------|
| 1 | Easy | AttributeError: hallucinated dict method in ad_ranking |
| 2 | Medium | Silent timestamp corruption (CAPI → ROAS degradation) |
| 3 | Medium-Hard | DB connection pool exhaustion under load |
| 4 | Hard | Circular FK migration cascading across services |
| 5 | Hard | PII data exposure via DEBUG_MODE=True |
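
When an agent is run on several tasks, the per-task normalized scores can be rolled up by difficulty. The sketch below shows one possible aggregate, not an official benchmark metric. The difficulty labels are copied from the table above, while `TASK_DIFFICULTY` and `summarize` are hypothetical names.

```python
from statistics import mean
from typing import Dict, List

# Difficulty labels copied from the task table.
TASK_DIFFICULTY: Dict[int, str] = {
    1: "Easy",
    2: "Medium",
    3: "Medium-Hard",
    4: "Hard",
    5: "Hard",
}


def summarize(scores: Dict[int, float]) -> Dict[str, float]:
    """Average normalized scores per difficulty, plus an overall mean."""
    if not scores:
        raise ValueError("no scores to summarize")
    unknown = set(scores) - set(TASK_DIFFICULTY)
    if unknown:
        raise ValueError(f"unknown task ids: {sorted(unknown)}")
    by_difficulty: Dict[str, List[float]] = {}
    for task_id, score in scores.items():
        by_difficulty.setdefault(TASK_DIFFICULTY[task_id], []).append(score)
    report = {label: mean(vals) for label, vals in by_difficulty.items()}
    report["overall"] = mean(scores.values())
    return report
```

For example, `summarize({1: 1.0, 4: 0.5, 5: 0.0})` averages the two Hard tasks together and reports them next to the Easy score and the overall mean.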
## Endpoints
- `POST /reset`: start episode (`{"task_id": 1-5}`)
- `POST /step`: take action (`{"tool": "...", "params": {...}}`)
- `GET /state`: current observation
- `GET /grade`: episode score
- `GET /tools`: available tools list
- `GET /health`: health check
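
The request bodies for `POST /reset` and `POST /step` can be built and checked locally before anything is sent. This is a small sketch under the payload shapes listed above. The helper names `reset_payload` and `step_payload` are hypothetical, and the server may apply further validation that is not documented here.

```python
from typing import Any, Dict, Optional

VALID_TASK_IDS = range(1, 6)  # task IDs 1-5, per this README


def reset_payload(task_id: int) -> Dict[str, int]:
    """Build the JSON body for POST /reset."""
    if not isinstance(task_id, int) or task_id not in VALID_TASK_IDS:
        raise ValueError(f"task_id must be an integer from 1 to 5, got {task_id!r}")
    return {"task_id": task_id}


def step_payload(tool: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build the JSON body for POST /step."""
    if not isinstance(tool, str) or not tool:
        raise ValueError("tool must be a non-empty string")
    # Copy params so later changes by the caller do not alter the payload.
    return {"tool": tool, "params": dict(params or {})}
```

The tool name passed to `step_payload` should be one returned by `GET /tools`; the response format of that endpoint is not described in this README, so it is not parsed here.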