Baseline Scope

The public benchmark has two formal comparison groups.

A. Learned World Models

Purpose: compare image-input world-model architectures under the same data, optimizer budget, rollout target, and planning interface.

| Directory | Report Name | Why It Is Included |
| --- | --- | --- |
| flowmo | FlowMo | Proposed flow-momentum WM. Separates short object motion state from long ambient drift context. |
| leworldmodel | LeWorldModel | Simple JEPA-style latent prediction baseline. Tests whether current-image latent dynamics are enough. |
| planet | PlaNet RSSM | Recurrent state-space baseline. Tests whether generic recurrent memory can absorb momentum and drift. |
| tdmpc2 | TD-MPC2 Dynamics | Compact latent-dynamics baseline. Tests action-conditioned latent rollout with a task-oriented architecture. |

Comparison outputs:

- rollout prediction error
- heading prediction error
- context ablation for FlowMo
- planning metrics when the learned WM is used inside the shared planner
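
As a rough illustration of the first two outputs, the sketch below computes a per-step rollout error and a wraparound-aware heading error. The function names, the choice of mean squared error, and the use of radians are assumptions made for this example; the repository's evaluation code may define these metrics differently.

```python
import math
from typing import List, Sequence


def rollout_mse_per_step(
    pred: Sequence[Sequence[float]], target: Sequence[Sequence[float]]
) -> List[float]:
    """Mean squared error at each rollout step, averaged over state dimensions.

    Assumption: predictions and targets are lists of equal-length state vectors.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must cover the same number of steps")
    errors = []
    for p, t in zip(pred, target):
        if len(p) != len(t) or not p:
            raise ValueError("state vectors must be non-empty and the same size")
        errors.append(sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(p))
    return errors


def wrapped_angle_diff(a: float, b: float) -> float:
    """Signed difference a - b in radians, wrapped into [-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def mean_heading_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute heading error in radians, robust to the +/-pi seam."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and the same length")
    total = sum(abs(wrapped_angle_diff(p, t)) for p, t in zip(pred, target))
    return total / len(pred)
```

Reporting the error per step rather than as a single average makes it easier to see how quickly each model degrades over the rollout horizon.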

B. Traditional Non-WM Controllers

Purpose: compare downstream behavior against hand-designed controllers that do not train a neural world model.

| Directory | Report Name | Why It Is Included |
| --- | --- | --- |
| pid_los_controller | PID/LOS controller | Simple classical waypoint tracking baseline. |
| no_flow_los_controller | No-Flow LOS Controller | Geometric line-of-sight controller that ignores ambient current. |
| current_estimator_los_controller | Current-Estimator LOS Controller | Strong classical baseline that estimates current from recent drift. |
| oracle_flow_los_controller | Oracle-Flow LOS Controller | Geometric line-of-sight controller with true local flow feed-forward. |
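
To make the difference between the no-flow and flow-aware line-of-sight (LOS) variants concrete, here is a minimal geometric sketch. It assumes a 2D vehicle moving at constant speed through the water and a uniform local flow vector; `los_heading` and `flow_compensated_heading` are hypothetical names for this example and are not taken from the repository's controllers.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def los_heading(pos: Vec2, waypoint: Vec2) -> float:
    """Heading (radians) pointing straight at the waypoint; ignores any current."""
    return math.atan2(waypoint[1] - pos[1], waypoint[0] - pos[0])


def flow_compensated_heading(
    pos: Vec2, waypoint: Vec2, flow: Vec2, speed: float
) -> float:
    """Heading through the water so the ground track points at the waypoint.

    Assumptions: `speed` is the (positive) vehicle speed relative to the water,
    and `flow` is the local current, either known or estimated.
    """
    if speed <= 0.0:
        raise ValueError("speed must be positive")
    los = los_heading(pos, waypoint)
    # Flow component perpendicular to the LOS line (positive pushes left of track).
    cross_flow = -flow[0] * math.sin(los) + flow[1] * math.cos(los)
    # Crab angle delta satisfies speed * sin(delta) = -cross_flow.
    # Clamp so the result stays defined when the current exceeds the vehicle speed.
    ratio = max(-1.0, min(1.0, -cross_flow / speed))
    return los + math.asin(ratio)
```

Passing the true flow corresponds in spirit to the oracle variant, passing an estimate from recent drift corresponds to the current-estimator variant, and passing `(0.0, 0.0)` reduces to plain LOS steering.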

Comparison outputs:

- success rate
- final distance
- trajectory length over successful episodes
- control effort (`sum_t ||a_t||_2^2`) over successful episodes
- time to goal over successful episodes
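
The outputs listed above could be aggregated along the following lines. This is a sketch under stated assumptions: the `Episode` record, its field names, the fixed step duration `dt`, and averaging final distance over all episodes are hypothetical choices for illustration, not the repository's logging format. Only the control-effort formula `sum_t ||a_t||_2^2` comes from the list itself.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Episode:
    """Hypothetical per-episode log used only for this example."""
    positions: List[Tuple[float, float]]  # visited positions, including the start
    actions: List[Tuple[float, ...]]      # one action vector per control step
    goal: Tuple[float, float]
    success: bool
    dt: float = 1.0                       # assumed step duration


def control_effort(actions: Sequence[Sequence[float]]) -> float:
    """sum_t ||a_t||_2^2: squared L2 norm of each action, summed over time."""
    return sum(sum(x * x for x in a) for a in actions)


def path_length(positions: Sequence[Tuple[float, float]]) -> float:
    """Total distance travelled along consecutive positions."""
    return sum(math.dist(p, q) for p, q in zip(positions, positions[1:]))


def _mean(values: Sequence[float]) -> float:
    # NaN signals "no successful episodes" instead of raising.
    return sum(values) / len(values) if values else float("nan")


def summarize(episodes: Sequence[Episode]) -> Dict[str, float]:
    """Aggregate the five comparison outputs over a batch of episodes."""
    if not episodes:
        raise ValueError("need at least one episode")
    wins = [e for e in episodes if e.success]
    return {
        "success_rate": len(wins) / len(episodes),
        # Assumption: final distance is averaged over all episodes.
        "final_distance": _mean([math.dist(e.positions[-1], e.goal) for e in episodes]),
        # The remaining metrics are restricted to successful episodes.
        "trajectory_length": _mean([path_length(e.positions) for e in wins]),
        "control_effort": _mean([control_effort(e.actions) for e in wins]),
        "time_to_goal": _mean([len(e.actions) * e.dt for e in wins]),
    }
```

Restricting length, effort, and time to successful episodes keeps failed runs, which may stop early or wander, from distorting those averages; success rate and final distance capture the failures instead.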