meta-r2 / docs /task.md
github-actions[bot]
Deploy Space snapshot
ddbc1ba
# Task Schema
**Source file:** `core/task.py`
---
## Overview
A `Task` object defines everything about one episode: the goal, the state the agent operates in, what can happen during the episode, how success is measured, and how long the agent has. `LifeStackEnv.reset()` takes a `Task` and builds the entire episode state from it.
---
## `Task` dataclass
| Field | Type | Description |
|-------|------|-------------|
| `id` | `str` | Unique task identifier |
| `domain` | `str` | Task domain: `"flight_crisis"`, `"code_merge_crisis"`, etc. |
| `goal` | `str` | Human-readable objective shown to the agent |
| `constraints` | `dict` | Budget and deadline keys, e.g. `{"budget_max": 400, "deadline_step": 18}` |
| `hidden_state` | `dict` | Full truth; agent cannot see directly (e.g. `{"card_available": True}`) |
| `mutable_world` | `dict` | Partial truth; some keys visible, some only revealed by inspect |
| `visible_world` | `dict` | Always-visible subset of `mutable_world` |
| `success_conditions` | `list[dict]` | Terminal predicates, e.g. `[{"key": "flight_rebooked", "value": True}]` |
| `failure_conditions` | `list[dict]` | Episode-ending failure predicates |
| `event_schedule` | `list[ExoEvent]` | Events that fire during the episode |
| `viable_routes` | `list[Route]` | Paths the agent can execute |
| `milestones` | `list[Milestone]` | Intermediate progress gates |
| `horizon` | `int` | Max steps (e.g. 30 for `FlightCrisisTask`, 10 for `CodeMergeCrisisTask`) |
| `difficulty` | `int` | 1–5 curriculum index |
| `domain_metadata` | `dict` | Story text and domain-specific generator hints |
---
## `Route` dataclass
A route is a structured path the agent can execute by targeting it with `action_type="execute"` (or any matching `required_action_types` entry). When a route's preconditions are met and it's executed, its `consequences` are applied to `world_state`, which can trigger success conditions.
| Field | Description |
|-------|-------------|
| `id` | Route identifier, shown to agent in prompt |
| `name` | Human-readable name |
| `required_action_types` | The model must use one of these action types to execute the route |
| `preconditions` | World/hidden state checks that must be true before the route is available |
| `consequences` | World state mutations on route completion |
| `closes_routes` | Route IDs that become unavailable after this route is taken |
| `milestones_unlocked` | Milestone IDs this route can trigger |
| `final_reward` | Bonus added on route completion |
Example from `FlightCrisisTask`:
```python
Route(
id="rebook_premium",
name="Rebook Premium Option",
required_action_types=["communicate", "execute"],
preconditions={"card_available": True},
consequences={"flight_rebooked": True},
closes_routes=["wait_lounge"],
final_reward=2.5
)
```
---
## `Milestone` dataclass
Intermediate checkpoints that reward partial progress. `LifeStackVerifier.check_new_milestones()` scans milestones every step.
| Field | Description |
|-------|-------------|
| `id` | Milestone identifier |
| `condition_key` | World/hidden key to check |
| `condition_value` | Required value |
| `reward` | Added to episode reward when milestone is hit |
---
## `ExoEvent` dataclass
World events that fire during the episode, potentially changing state and closing routes.
| Field | Description |
|-------|-------------|
| `step` | Fire at this step; -1 = probabilistic |
| `probability` | 1.0 = always fire; <1.0 = fire with this probability when step=-1 |
| `world_mutation` | Dict applied to `world_state` when event fires |
| `hidden_state_mutation` | Dict applied to `hidden_state` when event fires |
| `closes_routes` | Route IDs made unavailable after this event |
---
## Built-in task factories
`FlightCrisisTask()` β€” "Survive Airport Cancellation", horizon=30, difficulty=4. Two competing routes (`rebook_premium` requires `card_available=True`; `wait_lounge` requires `lounge_access=True`). Two timed events: a `price_surge` at step 5 sets `card_available=False`, and `lounge_full` at step 8 closes `wait_lounge`.
`CodeMergeCrisisTask()` β€” "Resolve Production Outage", horizon=10, difficulty=4. Two competing routes: revert the commit (`revert_commit`) or push a hotfix (`hotfix`). No scheduled events.
`TaskGenerator` in `core/task.py` holds only these two. The richer `TaskGenerator` in `agent/conflict_generator.py` covers all 8 task domains with template-based generation and is the one used by `scripts/train_trl.py`.
---
## Related files
- `core/lifestack_env.py` β€” `WorldEngine` fires `ExoEvent` objects during `step()`
- `core/verifier.py` β€” `LifeStackVerifier` checks success/failure/milestone conditions
- `agent/conflict_generator.py` β€” full `TaskGenerator` for all 8 domains
- `docs/lifestack_env.md` β€” how tasks integrate with the environment