# Baseline Scope

The public benchmark has two formal comparison groups.

## A. Learned World Models

Purpose: compare image-input world-model (WM) architectures under the same data, optimizer budget, rollout target, and planning interface.

| Directory | Report Name | Why It Is Included |
|---|---|---|
| `flowmo` | FlowMo | Proposed flow-momentum WM. Separates short-term object motion state from long-term ambient drift context. |
| `leworldmodel` | LeWorldModel | Simple JEPA-style latent prediction baseline. Tests whether current-image latent dynamics are enough. |
| `planet` | PlaNet RSSM | Recurrent state-space baseline. Tests whether generic recurrent memory can absorb momentum and drift. |
| `tdmpc2` | TD-MPC2 Dynamics | Compact latent-dynamics baseline. Tests action-conditioned latent rollout with a task-oriented architecture. |

Comparison outputs:

```text
rollout prediction error
heading prediction error
context ablation for FlowMo
planning metrics when the learned WM is used inside the shared planner
```

## B. Traditional Non-WM Controllers

Purpose: compare downstream behavior against hand-designed controllers that do not train a neural world model.

| Directory | Report Name | Why It Is Included |
|---|---|---|
| `pid_los_controller` | PID/LOS Controller | Simple classical waypoint-tracking baseline. |
| `no_flow_los_controller` | No-Flow LOS Controller | Geometric line-of-sight controller that ignores ambient current. |
| `current_estimator_los_controller` | Current-Estimator LOS Controller | Strong classical baseline that estimates current from recent drift. |
| `oracle_flow_los_controller` | Oracle-Flow LOS Controller | Geometric line-of-sight controller with true local flow feed-forward. |

Comparison outputs:

```text
success rate
final distance to goal
trajectory length over successful episodes
control effort (sum_t ||a_t||_2^2) over successful episodes
time to goal over successful episodes
```
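As a rough illustration of how the Group B outputs could be computed from logged episodes, here is a minimal sketch. The `Episode` record, its field names, the fixed control period `dt`, and aggregation by mean are all assumptions made for this example and are not taken from the benchmark code. Only the control-effort formula `sum_t ||a_t||_2^2` and the restriction of three metrics to successful episodes come from the list above.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class Episode:
    """Hypothetical per-episode log; field names are illustrative only."""
    positions: list[tuple[float, float]]  # visited (x, y) positions, length T + 1
    actions: list[tuple[float, ...]]      # control inputs a_t, length T
    goal: tuple[float, float]
    success: bool
    dt: float = 1.0                       # assumed fixed control period


def final_distance(ep: Episode) -> float:
    # Euclidean distance from the last visited position to the goal.
    return math.dist(ep.positions[-1], ep.goal)


def trajectory_length(ep: Episode) -> float:
    # Sum of straight-line segment lengths between consecutive positions.
    return sum(math.dist(p, q) for p, q in zip(ep.positions, ep.positions[1:]))


def control_effort(ep: Episode) -> float:
    # sum_t ||a_t||_2^2: squared Euclidean norm of each action, summed over time.
    return sum(sum(u * u for u in a) for a in ep.actions)


def time_to_goal(ep: Episode) -> float:
    # Number of control steps times the (assumed) control period.
    return len(ep.actions) * ep.dt


def summarize(episodes: list[Episode]) -> dict[str, float]:
    """Aggregate metrics; averaging is an assumption of this sketch."""
    successes = [ep for ep in episodes if ep.success]

    def mean_over_successes(metric) -> float:
        # NaN when no episode succeeded, so the gap stays visible in reports.
        return mean(metric(ep) for ep in successes) if successes else math.nan

    return {
        "success_rate": len(successes) / len(episodes),
        # Final distance is taken over all episodes, failed ones included.
        "final_distance": mean(final_distance(ep) for ep in episodes),
        "trajectory_length": mean_over_successes(trajectory_length),
        "control_effort": mean_over_successes(control_effort),
        "time_to_goal": mean_over_successes(time_to_goal),
    }
```

Conditioning trajectory length, control effort, and time to goal on success keeps a controller that fails early from looking cheap or fast. The success rate should therefore be read alongside those three numbers, since each controller may be averaged over a different subset of episodes.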