---
license: apache-2.0
language:
- en
tags:
- video-language-model
- long-video-understanding
- reinforcement-learning
- self-correction
- reflection
- qwen2.5-vl
---

# Reflect-R1

Model checkpoints for Reflect-R1: Evidence-Driven Reflection for Self-Correction in Long Video Understanding.

- Paper: https://arxiv.org/abs/2606.27922
- Code: https://github.com/ShuimuChen-hyq/Reflect-R1
- Data: https://huggingface.co/datasets/CSDDSFSFSAFSAF/Reflect-R1-data

## Checkpoints

```text
Reflect-R1-SFT-6000/      Cold-start SFT checkpoint.
Reflect-R1-GRPO-Final/    Final SD-GRPO checkpoint.
```

Both checkpoints are based on Qwen2.5-VL-7B and include sharded `safetensors` weights together with the corresponding tokenizer and processor configuration files.
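
The following is a minimal sketch of how one checkpoint folder might be downloaded and loaded. It rests on assumptions not stated in this card: that the model repo id is `CSDDSFSFSAFSAF/Reflect-R1`, that each folder is a standard Hugging Face Qwen2.5-VL checkpoint loadable with `Qwen2_5_VLForConditionalGeneration`, and that `transformers` (a version with Qwen2.5-VL support), `huggingface_hub`, and `accelerate` are installed. The helper names `allow_patterns` and `load_checkpoint` are hypothetical and exist only for illustration.

```python
"""Sketch: fetch one Reflect-R1 checkpoint folder and load it with transformers."""

# ASSUMPTION: model repo id, not stated in the card; adjust if it differs.
REPO_ID = "CSDDSFSFSAFSAF/Reflect-R1"
# Folder names as listed in the Checkpoints section.
CHECKPOINTS = ("Reflect-R1-SFT-6000", "Reflect-R1-GRPO-Final")


def allow_patterns(checkpoint: str) -> list[str]:
    """Return the glob that restricts a download to a single checkpoint folder."""
    if checkpoint not in CHECKPOINTS:
        raise ValueError(
            f"unknown checkpoint {checkpoint!r}; expected one of {CHECKPOINTS}"
        )
    return [f"{checkpoint}/*"]


def load_checkpoint(checkpoint: str = "Reflect-R1-GRPO-Final"):
    """Download one checkpoint folder and load the model and processor from it."""
    # Heavy imports stay local so the helper above works without them installed.
    import os

    from huggingface_hub import snapshot_download
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

    # Only files under the chosen folder are downloaded.
    local_root = snapshot_download(
        repo_id=REPO_ID, allow_patterns=allow_patterns(checkpoint)
    )
    path = os.path.join(local_root, checkpoint)

    # ASSUMPTION: the folder holds a standard Qwen2.5-VL config plus weights.
    # device_map="auto" requires the accelerate package.
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        path, torch_dtype="auto", device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(path)
    return model, processor
```

Calling `load_checkpoint("Reflect-R1-SFT-6000")` would fetch the cold-start SFT weights instead of the final ones. Restricting the download with `allow_patterns` avoids pulling both sets of 7B weights when only one is needed.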

## Citation

```bibtex
@article{chen2026reflectr1,
  title   = {Reflect-R1: Evidence-Driven Reflection for Self-Correction in Long Video Understanding},
  author  = {Shuimu Chen and Yuteng Chen and Yuanshen Guan and Zebang Cheng and Zeyu Zhang and Shengqian Qin and Bin Xia and Jiaran Li and Wenming Yang and Fei Ma},
  journal = {arXiv preprint arXiv:2606.27922},
  year    = {2026}
}
```