# HTTP API

The env runs on port 7860 by default. Everything is JSON in and out. If you want
the exact field types, just look at `models.py`: it's the source of truth, and
it's short.

## Health

`GET /health` returns `{"status": "ok"}`. Use it for readiness probes.
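A readiness probe from the client side can be sketched like this. This is a minimal example, not part of the env itself; the base URL and the `wait_for_ready` / `is_healthy` helper names are assumptions for illustration.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:7860"  # assumed host; 7860 is the default port

def is_healthy(body: dict) -> bool:
    """True when the health payload reports an OK status."""
    return body.get("status") == "ok"

def wait_for_ready(timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll GET /health until it reports ok or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(BASE_URL + "/health") as resp:
                if is_healthy(json.load(resp)):
                    return True
        except OSError:
            pass  # server not up yet; keep polling
        time.sleep(interval)
    return False
```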

## Listing tasks

`GET /tasks` gives you back the list of tasks the env knows about:

```json
{
  "tasks": [
    { "id": "task_easy", "description": "Single traffic spike...",
      "episode_length": 30, "difficulty": "easy" }
  ]
}
```

There are three of these out of the box (easy / medium / hard). Adding more
is covered in `TASK_AUTHORING.md`.

## Starting an episode

`POST /reset` starts (or restarts) an episode. The body is just a task id,
plus an optional config block if you want to override the server model:

```json
{
  "task_id": "task_easy",
  "config": { "server_capacity": 100.0, "crash_load_ratio": 1.3 }
}
```

You can omit `config` entirely and you'll get the defaults from `EnvConfig`.
The response gives you the initial state, the task id you actually got, the
episode length, and the config that ended up being used:

```json
{
  "state": { "cpu_usage": 0.54, "memory_usage": 0.36, "request_rate": 40.0,
             "queue_length": 0, "avg_latency": 58.0, "step": 0, "crashed": false },
  "task_id": "task_easy",
  "max_steps": 30,
  "config": { "server_capacity": 100.0, "...": "..." }
}
```

If you pass a task id that doesn't exist you'll get a 400 back with the list
of valid ids in the error message.

## Taking a step

`POST /step` with one of four actions:

- `allow_all`: let everything through
- `throttle_70`: allow 70% of requests (drop 30%)
- `throttle_40`: allow 40% (drop 60%)
- `drop_aggressive`: allow 20% (drop 80%)

```json
{ "action": "throttle_70" }
```

You get back the next state, the reward for this step, whether the episode
is done, and an `info` dict with the raw incoming/allowed counts and a few
other things that are useful for debugging your agent:

```json
{
  "state": { "...": "..." },
  "reward": 0.41,
  "done": false,
  "info": {
    "incoming_requests": 40.0, "allowed_requests": 28.0,
    "accept_rate": 0.7, "crashed": false,
    "episode_step": 1, "max_steps": 30, "server_capacity": 100.0
  }
}
```

When the episode finishes, `done` flips to `true` and `info` additionally
carries `final_score` (between 0 and 1) and `episode_done: true`. Stepping
after that point returns a 400; call `/reset` and start over.

One thing worth knowing: after each step, `state.request_rate` is overwritten
with the *upcoming* incoming rate, not the one you just handled. That's
deliberate: it's a small concession to the agent so it can react before a
spike rather than after.
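Putting the pieces together, a simple load-shedding loop can pick an action from the look-ahead `request_rate`. This is a sketch under assumptions: the base URL, the `choose_action` thresholds, and the helper names are all illustrative, not part of the API.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumed host

def choose_action(request_rate: float, capacity: float) -> str:
    """Map the upcoming load ratio to an action. Thresholds are illustrative."""
    ratio = request_rate / capacity
    if ratio <= 0.7:
        return "allow_all"
    if ratio <= 1.0:
        return "throttle_70"
    if ratio <= 1.5:
        return "throttle_40"
    return "drop_aggressive"

def post_json(path: str, body: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def run_episode(task_id: str = "task_easy") -> float:
    """Reset, step until done, and return the final score."""
    out = post_json("/reset", {"task_id": task_id})
    capacity = out["config"]["server_capacity"]
    state = out["state"]
    info, done = {}, False
    while not done:
        action = choose_action(state["request_rate"], capacity)
        out = post_json("/step", {"action": action})
        state, done, info = out["state"], out["done"], out["info"]
    return info.get("final_score", 0.0)
```

Because `request_rate` already reflects the next tick's load, the chooser reacts to the spike that is about to arrive rather than the one that just passed.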

## Other endpoints

`GET /state` peeks at the current state without advancing the episode.
Handy for debugging or for a separate dashboard process.

`GET /openenv.yaml` serves the OpenEnv spec as plain text.