Add model card metadata and links
#1
by nielsr HF Staff - opened
README.md
CHANGED
|
@@ -1,3 +1,32 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
## Citation
|
| 2 |
|
| 3 |
If you find this work useful, please cite our paper:
|
|
|
|
| 1 |
+
---
|
| 2 |
+
pipeline_tag: image-text-to-text
|
| 3 |
+
library_name: transformers
|
| 4 |
+
base_model: Qwen/Qwen2.5-VL-3B-Instruct
|
| 5 |
+
tags:
|
| 6 |
+
- progress-reasoning
|
| 7 |
+
- vlm
|
| 8 |
+
- vision-language
|
| 9 |
+
---
|
| 10 |
+
|
| 11 |
+
# ProgressLM-3B-SFT
|
| 12 |
+
|
| 13 |
+
ProgressLM is a Vision-Language Model (VLM) specifically fine-tuned for **progress reasoning**—estimating how much of a task has been completed from partial observations. It is introduced in the paper [ProgressLM: Towards Progress Reasoning in Vision-Language Models](https://huggingface.co/papers/2601.15224).
|
| 14 |
+
|
| 15 |
+
This version is the 3B parameter model fine-tuned using Supervised Fine-Tuning (SFT) on the **ProgressLM-45K** dataset.
|
| 16 |
+
|
| 17 |
+
## Resources
|
| 18 |
+
- **Project Page:** [https://progresslm.github.io/ProgressLM/](https://progresslm.github.io/ProgressLM/)
|
| 19 |
+
- **GitHub Repository:** [https://github.com/ProgressLM/ProgressLM](https://github.com/ProgressLM/ProgressLM)
|
| 20 |
+
- **Paper:** [https://huggingface.co/papers/2601.15224](https://huggingface.co/papers/2601.15224)
|
| 21 |
+
- **Dataset:** [Raymond-Qiancx/ProgressLM-Dataset](https://huggingface.co/datasets/Raymond-Qiancx/ProgressLM-Dataset)
|
| 22 |
+
|
| 23 |
+
## Overview
|
| 24 |
+
Estimating task progress requires reasoning over long-horizon dynamics rather than recognizing static visual content. ProgressLM follows a human-inspired two-stage progress reasoning paradigm:
|
| 25 |
+
1. **Episodic Retrieval:** Coarsely locating the observation along the demonstrated task.
|
| 26 |
+
2. **Mental Simulation:** Imagining the transition from the retrieved anchor to the current observation for a fine-grained estimate.
|
| 27 |
+
|
| 28 |
+
ProgressLM-3B achieves consistent improvements in task progress estimation even at a small model scale, despite being trained on a task set fully disjoint from evaluation tasks.
|
| 29 |
+
|
| 30 |
## Citation
|
| 31 |
|
| 32 |
If you find this work useful, please cite our paper:
|