metadata
tags:
- discrete-flow-matching
- web-action-planning
- formfactory
- reinforcement-learning
- openbrowser
license: apache-2.0
FlowVFE-39M-FlowGRPO
A 39M-parameter FlowVFE model trained with Flow-GRPO (online reinforcement learning with browser execution) on FormFactory web form-filling tasks. It is built on the FlowVFE-39M-SFT checkpoint and is part of the STAD80 project, Generative Action Planning via Discrete Flow Matching.
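This card does not spell out the Flow-GRPO objective. As a rough illustration only, GRPO-style methods usually score each rollout against the other rollouts sampled for the same task. The sketch below shows that generic group-relative normalization. The function name, the `eps` constant, the use of the population standard deviation, and the reward values are all assumptions for illustration, not details taken from this model's training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Generic GRPO-style normalization: center and scale rewards within one group.

    `rewards` holds one scalar per rollout of the same task. The name, `eps`,
    and the choice of population std are illustrative assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every rollout gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four rollouts of one form-filling task
# (1.0 = form submitted correctly, 0.0 = failed).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Successful rollouts receive a positive advantage and failed ones a negative advantage. When every rollout in a group earns the same reward, all advantages are zero and that group contributes no learning signal.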
Paper
Concentrate or Collapse: When Reinforcement Learning Meets Diffusion Language Models for Web Planning
- Author: Muhammad Enrizky Brillian
- Institution: University of Toronto Scarborough
Training Details
- Dataset: FormFactory (992 train / 124 val / 124 test tasks, 25 form types, 8 domains)
- Infrastructure: Single NVIDIA A10G GPU (24GB VRAM) on Anyscale
- Framework: PyTorch + PEFT (LoRA/QLoRA)
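The split sizes listed above add up to 1,240 tasks, which is an 80/10/10 train/val/test partition. A small sketch of that arithmetic follows; the variable names are illustrative and nothing here loads the actual dataset.

```python
# Split sizes as reported in the Training Details list.
splits = {"train": 992, "val": 124, "test": 124}

total = sum(splits.values())
# Fraction of all tasks that falls in each split.
fractions = {name: count / total for name, count in splits.items()}

for name, frac in fractions.items():
    print(f"{name}: {splits[name]} tasks ({frac:.0%})")
```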
Citation
If you use this model, please cite:
@article{brillian2026flowgrpo,
title={Concentrate or Collapse: When Reinforcement Learning Meets Diffusion Language Models for Web Planning},
author={Brillian, Muhammad Enrizky},
year={2026}
}
This model was trained and evaluated on the FormFactory benchmark:
@misc{li2025formfactory,
title={FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents},
author={Bobo Li and Yuheng Wang and Hao Fei and Juncheng Li and Wei Ji and Mong-Li Lee and Wynne Hsu},
year={2025},
eprint={2506.01520},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.01520}
}