This model is ReFusion 8B (GSAI-ML), fine-tuned with QLoRA supervised fine-tuning (SFT) on FormFactory web form-filling tasks. It uses masked diffusion with iterative unmasking on a Qwen3-8B backbone. On 124 test tasks, it achieves a 60.5% nonzero reward rate and an average reward of 0.267. It is part of the STAD80 project: Generative Action Planning via Discrete Flow Matching.
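The two reported metrics can be illustrated with a minimal sketch. It assumes that each test task yields one scalar reward, that the nonzero reward rate is the fraction of tasks whose reward is not exactly zero, and that the average reward is the plain mean over all tasks. The function name `summarize_rewards` and the toy reward values are hypothetical. They are not taken from the project's evaluation code or results.

```python
from typing import Dict, Sequence


def summarize_rewards(rewards: Sequence[float]) -> Dict[str, float]:
    """Summarize per-task rewards.

    Assumed definitions (not confirmed by the model card):
    - nonzero_reward_rate: share of tasks with reward != 0
    - average_reward: arithmetic mean over all tasks, zeros included
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    n = len(rewards)
    nonzero = sum(1 for r in rewards if r != 0)
    return {
        "num_tasks": n,
        "nonzero_reward_rate": nonzero / n,
        "average_reward": sum(rewards) / n,
    }


# Toy per-task rewards for illustration only, not the model's results.
toy_rewards = [0.0, 0.5, 1.0, 0.0]
print(summarize_rewards(toy_rewards))
```

Under these assumed definitions the two numbers answer different questions: the nonzero rate counts how often the model earns any credit on a form, while the average also reflects how much credit it earns per task.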
Generative Action Planning via Discrete Flow Matching with Online Reinforcement Fine-Tuning
If you use this model, please cite:
@article{brillian2026flowgrpo,
title={Generative Action Planning via Discrete Flow Matching with Online Reinforcement Fine-Tuning},
author={Brillian, Muhammad Enrizky},
year={2026}
}
This model was trained and evaluated on the FormFactory benchmark:
@misc{li2025formfactory,
title={FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents},
author={Bobo Li and Yuheng Wang and Hao Fei and Juncheng Li and Wei Ji and Mong-Li Lee and Wynne Hsu},
year={2025},
eprint={2506.01520},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.01520}
}