	# Experiments

	This directory contains the paper-facing experiment code, checkpoints, results, GIFs, tables, and reports.

	The experiments fall into two formal categories:

	- A. Learned world models: trainable image-input world models (WMs) evaluated on rollout prediction and WM-based planning.
	- B. Traditional non-WM controllers: hand-designed control baselines evaluated on the same downstream tasks.

	Main method:

	- FlowMo: Flow-Momentum World Model, the proposed drift-aware world model for surface vehicles.

	Category A learned WM comparisons:

	- LeWorldModel: JEPA-style latent predictor under the shared clean-image protocol.
	- PlaNet RSSM: recurrent state-space world-model baseline under the shared clean-image protocol.
	- TD-MPC2 Dynamics: task-oriented latent dynamics baseline under the shared clean-image protocol.

	Purpose of Category A: compare world-model architectures under identical image data, optimizer budget, rollout target, and evaluation protocol.

	Category B traditional controllers:

	- PID/LOS Controller
	- No-Flow LOS Controller
	- Current-Estimator LOS Controller
	- Oracle-Flow LOS Controller

	Purpose of Category B: compare downstream task behavior against non-neural controllers that do not train a world model.

	Baseline details are documented in `BASELINES.md`; the full experiment protocol is documented in `docs/EXPERIMENT_PROTOCOL.md`.

	Design principles:

	- Shared simulator, datasets, planning utilities, metrics, and visualization live in `shared/`.
	- Each method has its own directory with `src/`, `checkpoint/`, and `result/`.
	- Paper artifacts are collected under `reports/`.
	- Method names should be explicit and readable. Avoid cryptic suffixes in paper-facing file names.

	Standard method interface:

	```text
	src/model.py # build_model(), load_model()
	src/train.py # train(config)
	src/predict.py # rollout(model, batch)
	src/config.py # default_config()
	```
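
As a rough illustration of how one method module could expose these entry points, the sketch below collapses the four files into one and uses a toy per-dimension gain as the "world model". Everything beyond the function names is an assumption made for the example: the config keys, the optional `config` argument of `build_model`, the JSON checkpoint format, and the `batch` layout (`z0`, `horizon`) are hypothetical and not taken from the repository.

```python
import json
import random
from pathlib import Path


def default_config():
    # Hypothetical toy hyperparameters, not the paper's values.
    return {"latent_dim": 2, "lr": 0.5, "epochs": 20, "samples": 32,
            "true_gain": 0.9, "seed": 0}


def build_model(config=None):
    config = config or default_config()
    # Toy "world model": one multiplicative gain per latent dimension.
    return {"gain": [0.0] * config["latent_dim"]}


def load_model(path):
    # Checkpoints in this sketch are plain JSON files (an assumption).
    return json.loads(Path(path).read_text())


def rollout(model, batch):
    # batch = {"z0": list of latent vectors, "horizon": number of steps}.
    trajectories = []
    for z in batch["z0"]:
        current, steps = list(z), []
        for _ in range(batch["horizon"]):
            current = [g * v for g, v in zip(model["gain"], current)]
            steps.append(current)
        trajectories.append(steps)
    return trajectories


def train(config):
    # Fit the gains on synthetic one-step transitions z_next = true_gain * z.
    rng = random.Random(config["seed"])
    model = build_model(config)
    for _ in range(config["epochs"]):
        for _ in range(config["samples"]):
            z = [rng.uniform(0.5, 1.0) for _ in range(config["latent_dim"])]
            for i, v in enumerate(z):
                target = config["true_gain"] * v
                error = model["gain"][i] * v - target
                model["gain"][i] -= config["lr"] * error * v  # SGD step on 0.5 * error**2
    return model
```

Keeping the same four names in every method directory is what lets shared evaluation code call any method without special cases.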

	Closed-loop planning for learned world models is implemented once in `evaluate_image_planning.py`, so every learned method is evaluated through the same cross-entropy method (CEM) planning interface.
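
For readers unfamiliar with CEM, the following is a minimal generic sketch of the technique, not the code in `evaluate_image_planning.py`. It assumes a scalar action per step and a hypothetical `cost_fn` that scores a whole action sequence; in a world-model setting that cost would come from rolling the learned model forward, whereas here a toy 1-D point mass stands in for it. The function names, population sizes, and the cost itself are all illustrative assumptions.

```python
import random
import statistics


def cem_plan(cost_fn, horizon, iterations=15, population=128, elite_count=16, seed=0):
    """Return the mean action sequence after iteratively refitting a diagonal Gaussian."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iterations):
        # Sample candidate action sequences from the current distribution.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(population)
        ]
        # Keep the lowest-cost sequences and refit mean/std per time step.
        elites = sorted(candidates, key=cost_fn)[:elite_count]
        for t in range(horizon):
            column = [sequence[t] for sequence in elites]
            mean[t] = statistics.fmean(column)
            std[t] = statistics.pstdev(column) + 1e-6  # avoid a fully collapsed std
    return mean


def point_mass_cost(actions, target=3.0):
    # Toy stand-in for a world-model rollout: x_{t+1} = x_t + a_t, starting at x = 0.
    position = sum(actions)
    return (position - target) ** 2 + 0.01 * sum(a * a for a in actions)


plan = cem_plan(point_mass_cost, horizon=5)
```

Because the planner only sees `cost_fn`, swapping one learned model for another changes the cost function and nothing else, which is the property a shared planning interface relies on.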

	Traditional controllers use:

	```text
	src/controller.py or src/mpc.py
	src/evaluate.py
	src/config.py
	```

	Formal clean-image configuration:

	```text
	image_size=160
	visual_scale=2.5
	train=data/paper/train.npz
	test=data/paper/test.npz
	flow_families=noflow, uniform, vortex_center, double_gyre, source_sink, source_sink_pair, gradient, shear, turbulent_patch, random_fourier
	```
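
One way to carry these settings through code is a small frozen dataclass, sketched below. The values are copied from the listing above; the class name `CleanImageConfig`, the field layout, and the validation checks are assumptions for illustration and may not match the repository's `src/config.py`.

```python
from dataclasses import dataclass

FLOW_FAMILIES = (
    "noflow", "uniform", "vortex_center", "double_gyre", "source_sink",
    "source_sink_pair", "gradient", "shear", "turbulent_patch", "random_fourier",
)


@dataclass(frozen=True)
class CleanImageConfig:
    # Hypothetical container; field names mirror the keys in the listing above.
    image_size: int = 160
    visual_scale: float = 2.5
    train: str = "data/paper/train.npz"
    test: str = "data/paper/test.npz"
    flow_families: tuple = FLOW_FAMILIES

    def __post_init__(self):
        # Basic sanity checks so a mistyped override fails early.
        if self.image_size <= 0:
            raise ValueError("image_size must be positive")
        if len(set(self.flow_families)) != len(self.flow_families):
            raise ValueError("flow_families must not contain duplicates")


config = CleanImageConfig()
```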

	Full paper-facing image pipeline:

	```bash
	python -m experiments.run_paper_image_pipeline
	```

	The default command runs the paper configuration end to end: it trains all learned world models, evaluates long-horizon rollout prediction, runs the FlowMo latent probes, evaluates closed-loop planning against the traditional controllers, generates GIFs, and writes the final report. Images are rendered online from simulator states, so no separate image-cache preparation step is required.

	All flow fields are static. Localized flow structures are sampled near task routes so that boat trajectories encounter non-uniform currents in the shared train/test/final protocol.
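
To make "static" and "sampled near task routes" concrete, here is a small sketch of one possible localized structure: a regularised point vortex whose centre is drawn close to the straight line between a start and a goal. This is an illustration under stated assumptions, not the simulator's implementation; the function names, the sampling range along the route, and the velocity formula are all hypothetical.

```python
import math
import random


def sample_center_near_route(start, goal, max_offset, rng):
    # Pick a point on the middle part of the start-goal segment, then shift it sideways.
    t = rng.uniform(0.2, 0.8)
    dx, dy = goal[0] - start[0], goal[1] - start[1]
    length = math.hypot(dx, dy)
    normal = (-dy / length, dx / length)  # unit vector perpendicular to the route
    offset = rng.uniform(-max_offset, max_offset)
    return (start[0] + t * dx + offset * normal[0],
            start[1] + t * dy + offset * normal[1])


def vortex_flow(center, strength, core_radius):
    # Returns a time-independent velocity field: querying the same point always
    # gives the same current. Positive strength swirls counter-clockwise.
    def velocity(x, y):
        dx, dy = x - center[0], y - center[1]
        scale = strength / (dx * dx + dy * dy + core_radius ** 2)
        return (-scale * dy, scale * dx)
    return velocity


rng = random.Random(0)
center = sample_center_near_route((0.0, 0.0), (10.0, 0.0), max_offset=1.0, rng=rng)
flow = vortex_flow(center, strength=1.0, core_radius=0.5)
```

Placing the centre within `max_offset` of the route means a boat following that route passes through the region where the current varies most, rather than through nearly uniform water far from the structure.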

	Manual image pipeline, step by step:

	```bash
	python -m experiments.train_image_world_models
	python -m experiments.evaluate_image_world_models
	python -m experiments.evaluate_flowmo_latent_probes
	python -m experiments.evaluate_image_planning --task reach_target --boat twin
	python -m experiments.summarize_paper_image_results
	```