Flow Map Policies — Pretrained FMQ Checkpoints

Pretrained checkpoints for "Aligning Flow Map Policies with Optimal Q-Guidance".

Paper: arXiv:2605.12416
Code: github.com/christoszi/flow-map-policies

Model Description

These are Flow Map Q-Guidance (FMQ) agents trained with offline-to-online RL. Each checkpoint contains a flow map policy fine-tuned online for 1M steps using critic-guided trust-region optimization.

Checkpoints

12 environments x 5 random seeds = 60 checkpoints total.

Folder               Environment                                       Benchmark
checkpoints/ctrp4/   cube-triple-play-singletask-task4-v0              OGBench
checkpoints/ctrp3/   cube-triple-play-singletask-task3-v0              OGBench
checkpoints/cdp4/    cube-double-play-singletask-task4-v0              OGBench
checkpoints/cdp3/    cube-double-play-singletask-task3-v0              OGBench
checkpoints/sc4/     scene-play-singletask-task4-v0                    OGBench
checkpoints/sc5/     scene-play-singletask-task5-v0                    OGBench
checkpoints/ag4/     antmaze-giant-navigate-singletask-task4-v0        OGBench
checkpoints/ag5/     antmaze-giant-navigate-singletask-task5-v0        OGBench
checkpoints/hm3/     humanoidmaze-medium-navigate-singletask-task3-v0  OGBench
checkpoints/hm4/     humanoidmaze-medium-navigate-singletask-task4-v0  OGBench
checkpoints/can/     can-mh-low_dim                                    RoboMimic
checkpoints/square/  square-mh-low_dim                                 RoboMimic

Usage

pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download('christoszi/flow-map-policies', local_dir='.')"
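The command above fetches the whole repository. If only some environments are needed, a minimal sketch like the following could restrict the download to selected folders through the `allow_patterns` argument of `snapshot_download`. The helper names `patterns_for` and `download` are hypothetical and not part of the released code; the folder layout is taken from the table above.

```python
def patterns_for(folders):
    """Build glob patterns that match every file under each checkpoint folder."""
    return [f"checkpoints/{name}/*" for name in folders]


def download(folders, local_dir="."):
    """Download only the listed checkpoint folders (hypothetical helper)."""
    # Imported lazily so patterns_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="christoszi/flow-map-policies",
        allow_patterns=patterns_for(folders),
        local_dir=local_dir,
    )
```

Calling `download(["ctrp4", "can"])` would then fetch just those two folders into the current directory.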

Then evaluate:

python main.py --config configs/config.yaml \
  --eval_only --fmq_online \
  --restore_path=checkpoints/ctrp4/params_online_sd000.pkl \
  --env_name=cube-triple-play-singletask-task4-v0 --seed=0
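To evaluate every seed of one environment, the command above can be generated in a loop. The sketch below only prints the commands; it assumes that seeds are numbered 0 to 4 and that checkpoint files follow the pattern `params_online_sd{seed:03d}.pkl`. Both are extrapolated from the single example above and should be checked against the downloaded folders. `eval_command` is a hypothetical helper, not part of the repository.

```python
import shlex


def eval_command(folder, env_name, seed):
    """Return the evaluation command for one checkpoint as an argument list."""
    # Assumption: file names follow params_online_sd{seed:03d}.pkl.
    restore_path = f"checkpoints/{folder}/params_online_sd{seed:03d}.pkl"
    return [
        "python", "main.py", "--config", "configs/config.yaml",
        "--eval_only", "--fmq_online",
        f"--restore_path={restore_path}",
        f"--env_name={env_name}",
        f"--seed={seed}",
    ]


# Assumption: the 5 seeds are 0..4.
for seed in range(5):
    cmd = eval_command("ctrp4", "cube-triple-play-singletask-task4-v0", seed)
    print(shlex.join(cmd))
```

Each printed line can be run in a shell as is, or the argument list can be passed to `subprocess.run` from the repository root.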

Citation

@article{ziakas2026fmq,
  title={Aligning Flow Map Policies with Optimal Q-Guidance},
  author={Ziakas, Christos and Russo, Alessandra and Bose, Avishek Joey},
  journal={arXiv preprint arXiv:2605.12416},
  year={2026},
}