jingyiZ00
/

R1-VL-2B

Image-Text-to-Text

text-generation-inference

Model card Files Files and versions

jingyiZ00 commited on Mar 21, 2025

Commit

85b9b9e

·

verified ·

1 Parent(s): daae4f3

Create README.md

Files changed (1) hide show

README.md +15 -0

README.md ADDED Viewed

	@@ -0,0 +1,15 @@

+---
+license: apache-2.0
+datasets:
+- HuanjinYao/Mulberry-SFT
+base_model:
+- Qwen/Qwen2-VL-2B-Instruct
+pipeline_tag: image-text-to-text
+library_name: transformers
+---
+# R1-VL-2B
+R1-VL-2B is a reasoning model trained with step-wise group relative policy optimization (StepGRPO).
+### Paper: https://arxiv.org/pdf/2503.12937
+### Github: https://github.com/jingyi0000/R1-VL
+### Base model: https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct