---
license: apache-2.0
language:
- en
tags:
- video-language-model
- long-video-understanding
- reinforcement-learning
- self-correction
- reflection
- qwen2.5-vl
---
# Reflect-R1
Model checkpoints for **Reflect-R1: Evidence-Driven Reflection for Self-Correction in Long Video Understanding**.
- Paper: https://arxiv.org/abs/2606.27922
- Code: https://github.com/ShuimuChen-hyq/Reflect-R1
- Data: https://huggingface.co/datasets/CSDDSFSFSAFSAF/Reflect-R1-data
## Checkpoints
```text
Reflect-R1-SFT-6000/ Cold-start SFT checkpoint.
Reflect-R1-GRPO-Final/ Final SD-GRPO checkpoint.
```
Both checkpoints are based on Qwen2.5-VL-7B and include sharded `safetensors` weights together with the corresponding tokenizer and processor configuration files.
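
The following is a minimal loading sketch, not an official usage recipe. It assumes that the two checkpoints are stored as subfolders of a single Hub repository, that the repository id is passed in by the caller (it is not stated in this card), and that a `transformers` release with Qwen2.5-VL support and `huggingface_hub` are installed. The helper names `checkpoint_patterns` and `load_checkpoint` are hypothetical.

```python
from pathlib import Path

# Folder names as listed in the Checkpoints section of this card.
CHECKPOINTS = {
    "sft": "Reflect-R1-SFT-6000",
    "grpo": "Reflect-R1-GRPO-Final",
}


def checkpoint_patterns(stage: str) -> list[str]:
    """Glob patterns that restrict a download to one checkpoint folder."""
    return [f"{CHECKPOINTS[stage]}/*"]


def load_checkpoint(repo_id: str, stage: str = "grpo"):
    """Download one checkpoint folder and load its model and processor.

    Assumption: `repo_id` is the Hub repository hosting this card and the
    checkpoints live in subfolders of it.
    """
    # Heavy imports are kept local so the helpers above stay importable
    # on machines without the model libraries installed.
    from huggingface_hub import snapshot_download
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

    local_root = snapshot_download(
        repo_id=repo_id,
        allow_patterns=checkpoint_patterns(stage),
    )
    ckpt_dir = str(Path(local_root) / CHECKPOINTS[stage])

    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        ckpt_dir, torch_dtype="auto"
    )
    processor = AutoProcessor.from_pretrained(ckpt_dir)
    return model, processor
```

Calling `load_checkpoint("<repo-id>", stage="sft")` would fetch only the cold-start SFT folder; replace `<repo-id>` with the actual repository id. Restricting the download with `allow_patterns` avoids pulling both sets of sharded weights when only one checkpoint is needed.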
## Citation
```bibtex
@article{chen2026reflectr1,
title = {Reflect-R1: Evidence-Driven Reflection for Self-Correction in Long Video Understanding},
author = {Shuimu Chen and Yuteng Chen and Yuanshen Guan and Zebang Cheng and Zeyu Zhang and Shengqian Qin and Bin Xia and Jiaran Li and Wenming Yang and Fei Ma},
journal = {arXiv preprint arXiv:2606.27922},
year = {2026}
}
```