Upload FlowVFE-39M-FlowGRPO checkpoint
README.md
ADDED
@@ -0,0 +1,37 @@
---
tags:
- discrete-flow-matching
- web-action-planning
- formfactory
- reinforcement-learning
- openbrowser
license: apache-2.0
---

# FlowVFE-39M-FlowGRPO

39M-parameter FlowVFE model trained with Flow-GRPO (online RL with browser execution) on FormFactory web form-filling tasks. Built on the FlowVFE-39M-SFT checkpoint. Part of the STAD80 project: Generative Action Planning via Discrete Flow Matching.

## Paper

**Generative Action Planning via Discrete Flow Matching with Online Reinforcement Fine-Tuning**
- Authors: Muhammad Enrizky Brillian, Qiang Sun
- Institution: University of Toronto Scarborough

## Training Details

- **Dataset**: FormFactory (992 train / 124 val / 124 test tasks, 25 form types, 8 domains)
- **Infrastructure**: Single NVIDIA A10G GPU (24GB VRAM) on Anyscale
- **Framework**: PyTorch + PEFT (LoRA/QLoRA)

## Citation

If you use this model, please cite:

```bibtex
@article{brillian2026flowgrpo,
title={Generative Action Planning via Discrete Flow Matching with Online Reinforcement Fine-Tuning},
author={Brillian, Muhammad Enrizky and Sun, Qiang},
year={2026}
}
```
model.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:abcb8a26366842d74af13be0f828e9af488fc7d13e40a7fef6ba4eeb5b7f7f04
size 122450315
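The three lines committed for `model.pt` are a Git LFS pointer rather than the weights themselves: `oid` is the SHA-256 digest of the real file and `size` is its length in bytes. A minimal sketch of checking a local download against that pointer is shown below. The helper names `parse_lfs_pointer` and `verify_download` are hypothetical (they are not part of this repository), and the sketch assumes the checkpoint has already been downloaded to disk.

```python
import hashlib
from pathlib import Path

# Pointer text copied verbatim from the model.pt diff in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:abcb8a26366842d74af13be0f828e9af488fc7d13e40a7fef6ba4eeb5b7f7f04
size 122450315
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_download(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` matches the pointer's size and hash."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing the whole file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Hash in chunks so the whole checkpoint is never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Usage would look like `verify_download("model.pt", parse_lfs_pointer(POINTER))`. A `False` result usually points to a truncated download, or to the small pointer file having been fetched in place of the actual weights (for example, when cloning without Git LFS installed).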