
Experiments

This directory contains the paper-facing experiment code, checkpoints, results, figures, GIFs, tables, and reports.

The experiments fall into two formal categories:

  • A. Learned world models (WMs): trainable image-input models evaluated on rollout prediction and WM-based planning.
  • B. Traditional non-WM controllers: hand-designed control baselines evaluated on the same downstream tasks.

Main method:

  • FlowMo: Flow-Momentum World Model, the proposed drift-aware world model for surface vehicles.

Category A learned WM comparisons:

  • LeWorldModel: JEPA-style latent predictor under the shared clean-image protocol.
  • PlaNet RSSM: recurrent state-space world-model baseline under the shared clean-image protocol.
  • TD-MPC2 Dynamics: task-oriented latent dynamics baseline under the shared clean-image protocol.

Purpose of Category A: compare world-model architectures under identical image data, optimizer budget, parameter budget, rollout target, and evaluation protocol.

Category B traditional controllers:

  • PID/LOS controller
  • Physics MPC No-Flow
  • Current-Estimator MPC
  • Oracle-Flow MPC

Purpose of Category B: compare downstream task behavior against non-neural controllers that do not train a world model.

Baseline details are documented in BASELINES.md; the full experiment protocol is documented in docs/EXPERIMENT_PROTOCOL.md.

Design principles:

  • Shared simulator, datasets, planning utilities, metrics, and visualization live in shared/.
  • Each method has its own directory with src/, checkpoint/, and result/.
  • Paper artifacts are collected in top-level figures/, gifs/, tables/, and reports/.
  • Method names should be explicit and readable. Avoid cryptic suffixes in paper-facing file names.

Standard method interface:

src/model.py      # build_model(), load_model()
src/train.py      # train(config)
src/predict.py    # rollout(model, batch)
src/config.py     # default_config()
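
As a rough illustration of how these entry points fit together, the sketch below implements them for a toy one-dimensional model. Only the function names come from the interface above. The ToyModel class, the config keys, and the batch layout are placeholders invented for this example, and load_model() is omitted because the checkpoint format is not described here.

```python
from typing import Any, Dict, List


class ToyModel:
    """Hypothetical stand-in for a world model: next = state + gain * action."""

    def __init__(self, gain: float = 0.0) -> None:
        self.gain = gain


def default_config() -> Dict[str, Any]:
    """src/config.py: default hyperparameters (keys are illustrative)."""
    return {"lr": 0.1, "epochs": 200, "transitions": []}


def build_model(config: Dict[str, Any]) -> ToyModel:
    """src/model.py: construct an untrained model from a config."""
    return ToyModel(gain=0.0)


def train(config: Dict[str, Any]) -> ToyModel:
    """src/train.py: fit the gain by gradient descent on one-step error."""
    model = build_model(config)
    data = config["transitions"]  # list of (state, action, next_state)
    for _ in range(config["epochs"]):
        grad = 0.0
        for state, action, next_state in data:
            error = state + model.gain * action - next_state
            grad += 2.0 * error * action / len(data)
        model.gain -= config["lr"] * grad
    return model


def rollout(model: ToyModel, batch: Dict[str, Any]) -> List[float]:
    """src/predict.py: predict open-loop states for a sequence of actions."""
    state, predictions = batch["state"], []
    for action in batch["actions"]:
        state = state + model.gain * action
        predictions.append(state)
    return predictions


# Toy data generated with a true gain of 0.5.
config = default_config()
config["transitions"] = [(0.0, a, 0.5 * a) for a in (-1.0, -0.5, 0.5, 1.0)]
model = train(config)
predictions = rollout(model, {"state": 0.0, "actions": [1.0, 1.0]})
print(predictions)
```

Keeping the same four entry points per method is what lets shared evaluation code call any learned model without method-specific branches.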

Closed-loop planning for learned world models is implemented once in evaluate_image_planning.py so every learned method is evaluated through the same CEM interface.
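
For readers unfamiliar with CEM, the following is a minimal sketch, assuming CEM here refers to the cross-entropy method for planning. It is not the code in evaluate_image_planning.py. The scalar state, the toy_step dynamics, the toy_cost function, and all hyperparameters are assumptions chosen to keep the example self-contained.

```python
import random
import statistics
from typing import Callable, List

Step = Callable[[float, float], float]  # (state, action) -> predicted next state
Cost = Callable[[float], float]         # state -> scalar cost


def rollout_cost(step: Step, cost: Cost, state: float, actions: List[float]) -> float:
    """Sum the cost of the states predicted along an action sequence."""
    total = 0.0
    for action in actions:
        state = step(state, action)
        total += cost(state)
    return total


def cem_plan(step: Step, cost: Cost, state: float, horizon: int = 5,
             population: int = 64, elites: int = 8, iterations: int = 10,
             seed: int = 0) -> List[float]:
    """Return the mean action sequence after iterative elite refitting."""
    rng = random.Random(seed)
    mean, std = [0.0] * horizon, [1.0] * horizon
    for _ in range(iterations):
        # Sample candidate sequences from a per-step Gaussian, clipped to [-1, 1].
        candidates = [
            [max(-1.0, min(1.0, rng.gauss(m, s))) for m, s in zip(mean, std)]
            for _ in range(population)
        ]
        # Keep the lowest-cost sequences and refit the sampling distribution.
        candidates.sort(key=lambda seq: rollout_cost(step, cost, state, seq))
        best = candidates[:elites]
        mean = [statistics.fmean([seq[t] for seq in best]) for t in range(horizon)]
        std = [statistics.pstdev([seq[t] for seq in best]) + 1e-3 for t in range(horizon)]
    return mean


def toy_step(state: float, action: float) -> float:
    """Hypothetical dynamics standing in for a learned model's prediction."""
    return state + 0.5 * action


def toy_cost(state: float) -> float:
    """Squared distance to a goal at 1.0."""
    return (state - 1.0) ** 2


plan = cem_plan(toy_step, toy_cost, state=0.0)
print(rollout_cost(toy_step, toy_cost, 0.0, plan))
```

Because the planner only needs a step function and a cost, any learned model that can predict the next state can be plugged in. In closed-loop use, a planner of this kind typically executes only the first planned action and replans at the next step.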

Traditional controllers use:

src/controller.py or src/mpc.py
src/evaluate.py
src/config.py

Formal clean-image configuration:

image_size=160
visual_scale=2.5
train=data/paper/train.npz
test=data/paper/test_unseen_flow.npz and data/paper/test_unseen_boat_params.npz
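
One way to carry these values in code is a frozen dataclass, sketched below. The field values are copied from the listing above. The class name CleanImageConfig and the choice of a dataclass are assumptions for illustration and may not match how src/config.py stores them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CleanImageConfig:
    """Hypothetical container for the formal clean-image configuration."""

    image_size: int = 160
    visual_scale: float = 2.5
    train: str = "data/paper/train.npz"
    # Two held-out test sets: unseen flow fields and unseen boat parameters.
    test: Tuple[str, ...] = (
        "data/paper/test_unseen_flow.npz",
        "data/paper/test_unseen_boat_params.npz",
    )


config = CleanImageConfig()
print(config.image_size, config.visual_scale, len(config.test))
```

Freezing the dataclass guards against a method silently changing the shared settings during a run.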

Full paper-facing image pipeline:

python -m experiments.run_paper_image_pipeline

The default command runs the paper configuration end to end: train all learned world models, evaluate long rollout prediction, run FlowMo latent probes, evaluate closed-loop planning against traditional controllers, generate GIFs, and write the final report. Images are rendered online from simulator states, so no separate image-cache preparation step is required.

Manual image training and evaluation, step by step:

python -m experiments.train_image_world_models
python -m experiments.evaluate_image_world_models
python -m experiments.evaluate_flowmo_latent_probes
python -m experiments.evaluate_image_planning --task reach_uniform --boat twin
python -m experiments.summarize_paper_image_results
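
The manual steps can also be chained from a small Python driver, sketched below. It assumes the commands are meant to run in the order listed and from the repository root. The run_manual_steps helper and its dry_run flag are hypothetical and not part of the repository.

```python
import subprocess
import sys
from typing import List

# Module names and flags copied from the manual steps, in the order listed.
MANUAL_STEPS: List[List[str]] = [
    ["-m", "experiments.train_image_world_models"],
    ["-m", "experiments.evaluate_image_world_models"],
    ["-m", "experiments.evaluate_flowmo_latent_probes"],
    ["-m", "experiments.evaluate_image_planning",
     "--task", "reach_uniform", "--boat", "twin"],
    ["-m", "experiments.summarize_paper_image_results"],
]


def run_manual_steps(dry_run: bool = True) -> List[List[str]]:
    """Print each command, or run it and stop at the first failure."""
    commands = [[sys.executable, *step] for step in MANUAL_STEPS]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return commands


commands = run_manual_steps(dry_run=True)
```

With dry_run=True the driver only prints the commands, which is a cheap way to confirm the sequence before starting a long training run.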