billyenrizky/FlowVFE-39M-FlowGRPO


billyenrizky committed on Mar 25

Commit 9864595 · verified · 1 parent: e6fc3fd

Upload FlowVFE-39M-FlowGRPO checkpoint

Files changed (2)

README.md +37 -0
model.pt +3 -0

README.md ADDED

	@@ -0,0 +1,37 @@

+---
+tags:
+  - discrete-flow-matching
+  - web-action-planning
+  - formfactory
+  - reinforcement-learning
+  - openbrowser
+license: apache-2.0
+---
+# FlowVFE-39M-FlowGRPO
+A 39M-parameter FlowVFE model fine-tuned with Flow-GRPO (online RL with browser execution) on FormFactory web form-filling tasks, starting from the FlowVFE-39M-SFT checkpoint. Part of the STAD80 project: Generative Action Planning via Discrete Flow Matching.
+## Paper
+**Generative Action Planning via Discrete Flow Matching with Online Reinforcement Fine-Tuning**
+- Authors: Muhammad Enrizky Brillian, Qiang Sun
+- Institution: University of Toronto Scarborough
+## Training Details
+- **Dataset**: FormFactory (992 train / 124 val / 124 test tasks, 25 form types, 8 domains)
+- **Infrastructure**: Single NVIDIA A10G GPU (24GB VRAM) on Anyscale
+- **Framework**: PyTorch + PEFT (LoRA/QLoRA)
+## Citation
+If you use this model, please cite:
+```bibtex
+@article{brillian2026flowgrpo,
+  title={Generative Action Planning via Discrete Flow Matching with Online Reinforcement Fine-Tuning},
+  author={Brillian, Muhammad Enrizky and Sun, Qiang},
+  year={2026}
+}
+```

model.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:abcb8a26366842d74af13be0f828e9af488fc7d13e40a7fef6ba4eeb5b7f7f04
+size 122450315
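
The three lines added for `model.pt` are a Git LFS pointer, not the weights themselves: the `oid` is the SHA-256 digest of the real file and `size` is its length in bytes (122450315 bytes, about 122 MB). The following is a minimal sketch of how a downloaded checkpoint could be checked against that pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of this repository or of any library; only the Python standard library is used.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the model.pt diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:abcb8a26366842d74af13be0f828e9af488fc7d13e40a7fef6ba4eeb5b7f7f04
size 122450315
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)  # e.g. "sha256", "<hex digest>"
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's byte size and digest both match the pointer."""
    # Cheap check first: a size mismatch means the file cannot match.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Hash in chunks so a large checkpoint is never fully loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
```

Assuming the checkpoint has been downloaded to a local file named `model.pt` (an assumed path), `matches_pointer(Path("model.pt"), pointer)` would return `True` only if the download is byte-for-byte identical to the file uploaded in this commit.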