File size: 4,858 Bytes
ddbc1ba | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 | # Task Schema
**Source file:** `core/task.py`
---
## Overview
A `Task` object defines everything about one episode: the goal, the state the agent operates in, what can happen during the episode, how success is measured, and how long the agent has. `LifeStackEnv.reset()` takes a `Task` and builds the entire episode state from it.
---
## `Task` dataclass
| Field | Type | Description |
|-------|------|-------------|
| `id` | `str` | Unique task identifier |
| `domain` | `str` | Task domain: `"flight_crisis"`, `"code_merge_crisis"`, etc. |
| `goal` | `str` | Human-readable objective shown to the agent |
| `constraints` | `dict` | Budget and deadline keys, e.g. `{"budget_max": 400, "deadline_step": 18}` |
| `hidden_state` | `dict` | Full truth; agent cannot see directly (e.g. `{"card_available": True}`) |
| `mutable_world` | `dict` | Partial truth; some keys visible, some only revealed by inspect |
| `visible_world` | `dict` | Always-visible subset of `mutable_world` |
| `success_conditions` | `list[dict]` | Terminal predicates, e.g. `[{"key": "flight_rebooked", "value": True}]` |
| `failure_conditions` | `list[dict]` | Episode-ending failure predicates |
| `event_schedule` | `list[ExoEvent]` | Events that fire during the episode |
| `viable_routes` | `list[Route]` | Paths the agent can execute |
| `milestones` | `list[Milestone]` | Intermediate progress gates |
| `horizon` | `int` | Max steps (e.g. 30 for `FlightCrisisTask`, 10 for `CodeMergeCrisisTask`) |
| `difficulty` | `int` | 1β5 curriculum index |
| `domain_metadata` | `dict` | Story text and domain-specific generator hints |
---
## `Route` dataclass
A route is a structured path the agent can execute by targeting it with `action_type="execute"` (or any matching `required_action_types` entry). When a route's preconditions are met and it's executed, its `consequences` are applied to `world_state`, which can trigger success conditions.
| Field | Description |
|-------|-------------|
| `id` | Route identifier, shown to agent in prompt |
| `name` | Human-readable name |
| `required_action_types` | The model must use one of these action types to execute the route |
| `preconditions` | World/hidden state checks that must be true before the route is available |
| `consequences` | World state mutations on route completion |
| `closes_routes` | Route IDs that become unavailable after this route is taken |
| `milestones_unlocked` | Milestone IDs this route can trigger |
| `final_reward` | Bonus added on route completion |
Example from `FlightCrisisTask`:
```python
Route(
id="rebook_premium",
name="Rebook Premium Option",
required_action_types=["communicate", "execute"],
preconditions={"card_available": True},
consequences={"flight_rebooked": True},
closes_routes=["wait_lounge"],
final_reward=2.5
)
```
---
## `Milestone` dataclass
Intermediate checkpoints that reward partial progress. `LifeStackVerifier.check_new_milestones()` scans milestones every step.
| Field | Description |
|-------|-------------|
| `id` | Milestone identifier |
| `condition_key` | World/hidden key to check |
| `condition_value` | Required value |
| `reward` | Added to episode reward when milestone is hit |
---
## `ExoEvent` dataclass
World events that fire during the episode, potentially changing state and closing routes.
| Field | Description |
|-------|-------------|
| `step` | Fire at this step; -1 = probabilistic |
| `probability` | 1.0 = always fire; <1.0 = fire with this probability when step=-1 |
| `world_mutation` | Dict applied to `world_state` when event fires |
| `hidden_state_mutation` | Dict applied to `hidden_state` when event fires |
| `closes_routes` | Route IDs made unavailable after this event |
---
## Built-in task factories
`FlightCrisisTask()` β "Survive Airport Cancellation", horizon=30, difficulty=4. Two competing routes (`rebook_premium` requires `card_available=True`; `wait_lounge` requires `lounge_access=True`). Two timed events: a `price_surge` at step 5 sets `card_available=False`, and `lounge_full` at step 8 closes `wait_lounge`.
`CodeMergeCrisisTask()` β "Resolve Production Outage", horizon=10, difficulty=4. Two competing routes: revert the commit (`revert_commit`) or push a hotfix (`hotfix`). No scheduled events.
`TaskGenerator` in `core/task.py` holds only these two. The richer `TaskGenerator` in `agent/conflict_generator.py` covers all 8 task domains with template-based generation and is the one used by `scripts/train_trl.py`.
---
## Related files
- `core/lifestack_env.py` β `WorldEngine` fires `ExoEvent` objects during `step()`
- `core/verifier.py` β `LifeStackVerifier` checks success/failure/milestone conditions
- `agent/conflict_generator.py` β full `TaskGenerator` for all 8 domains
- `docs/lifestack_env.md` β how tasks integrate with the environment
|