Paper: FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents (arXiv:2506.01520)
QLoRA (4-bit NF4) adapter for Qwen/Qwen3-8B, fine-tuned on the FormFactory benchmark for web form-filling.
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model; device_map="auto" places weights on the available devices.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    device_map="auto",
    torch_dtype="auto",
)
# Attach the FormFactory adapter on top of the base weights.
model = PeftModel.from_pretrained(base_model, "billyenrizky/Qwen3-8B-FormFactory-GRPO-LoRA")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
```
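
Since the adapter was trained with QLoRA (4-bit NF4), it may be convenient to load the base model in 4-bit as well to reduce memory use. The following is a minimal sketch, not the author's documented procedure: the helper names `nf4_quant_kwargs` and `load_adapter_4bit` are hypothetical, the `bfloat16` compute dtype is an assumption, and it requires the `bitsandbytes` package and a CUDA GPU.

```python
def nf4_quant_kwargs():
    """Keyword arguments for transformers.BitsAndBytesConfig.

    The NF4 quantization type comes from the model description; the
    compute dtype is an assumption and may need adjusting for your GPU.
    """
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_compute_dtype": "bfloat16",  # assumption, not from the card
    }


def load_adapter_4bit(
    base_id="Qwen/Qwen3-8B",
    adapter_id="billyenrizky/Qwen3-8B-FormFactory-GRPO-LoRA",
):
    """Load the base model in 4-bit NF4 and attach the LoRA adapter."""
    # Imported lazily so the helper above can be used without the GPU stack.
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    quant_config = BitsAndBytesConfig(**nf4_quant_kwargs())
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        device_map="auto",
        quantization_config=quant_config,
    )
    model = PeftModel.from_pretrained(base_model, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

Calling `model, tokenizer = load_adapter_4bit()` downloads the weights on first use. Whether 4-bit inference reproduces the rewards reported in this card has not been checked here.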
| Split | Nonzero Rate | Avg Reward |
|---|---|---|
| Val | 100.0% | 0.670 |
| Test | 100.0% | 0.669 |
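
The card does not define the two columns. As a rough sketch, assuming "Nonzero Rate" is the percentage of evaluation episodes with a nonzero reward and "Avg Reward" is the mean per-episode reward, the two figures could be computed as follows. The function name `summarize_rewards` and the sample rewards are made up for illustration and are not the benchmark's data.

```python
from statistics import fmean


def summarize_rewards(rewards):
    """Return the nonzero rate (in percent) and the mean of a reward list."""
    if not rewards:
        raise ValueError("rewards must not be empty")
    nonzero = sum(1 for r in rewards if r != 0)
    return {
        "nonzero_rate": 100.0 * nonzero / len(rewards),
        "avg_reward": fmean(rewards),
    }


# Illustrative per-episode rewards (hypothetical values).
example_rewards = [0.5, 0.75, 1.0, 0.25]
summary = summarize_rewards(example_rewards)
print(f"Nonzero rate: {summary['nonzero_rate']:.1f}%")
print(f"Avg reward:   {summary['avg_reward']:.3f}")
```

Under this reading, a 100.0% nonzero rate means every evaluated episode earned at least some reward, while the average indicates how much of the maximum reward was typically obtained.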
If you use this model, please cite our paper and the FormFactory benchmark:
```bibtex
@article{brillian2026browser,
  title={Browser-in-the-Loop: Reinforcement Fine-Tuning LLM Agents for Web Form Filling},
  author={Brillian, Muhammad Enrizky},
  year={2026}
}

@article{li2025formfactory,
  title={FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents},
  author={Li, B. and Wang, Y. and Fei, H. and Li, J. and Ji, W. and Lee, M.-L. and Hsu, W.},
  journal={arXiv preprint arXiv:2506.01520},
  year={2025}
}
```