TIGER-Lab/VL-Reasoner-72B

Visual Question Answering

image-text-to-text

text-generation-inference


JasperHaozhe committed on Apr 2, 2025

Commit 392ad69 · verified · 1 Parent(s): 4ca9dbb

Create README.md

Files changed (1)

README.md +37 -0

README.md ADDED

	@@ -0,0 +1,37 @@

+---
+base_model:
+- Qwen/Qwen2.5-VL-72B-Instruct
+language:
+- en
+license: apache-2.0
+tags:
+- transformers
+- multimodal
+pipeline_tag: visual-question-answering
+---
+# VL-Rethinker-72B-Preview
+## Model Overview
+- **VL-Rethinker-72B-Preview** improves on the visual reasoning of the [Qwen2.5-VL-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct) model.
+- As of April 3rd, 2025, **VL-Rethinker-72B-Preview** achieves superior results on various visual reasoning benchmarks: [MathVision](https://mathllm.github.io/mathvision/), [MathVista](https://mathvista.github.io/), [MathVerse](https://mathverse-cuhk.github.io/), MMMU-Pro, EMMA, and MEGA.
+## Evaluation
+We will release a code repository for VLM evaluation. It supports RL training with simple rule-based rewards while staying aligned with LLM-judge results.
+Stay tuned!
+## Citation
+If you find our model useful, please consider citing:
+```
+@misc{VL-Rethinker-72B-Preview,
+	author       = {Wang, Haozhe and Lin, Fangzhen and Chen, Wuhu},
+	title        = {VL-Rethinker-72B-Preview},
+	year         = {2025},
+	url          = {https://huggingface.co/TIGER-Lab/VL-Rethinker-Preview},
+	publisher    = {Hugging Face}
+}
+```
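
The Evaluation section of the README mentions "simple rule-based rewards" but the evaluation repository is not yet released, so the actual reward is unknown. As a rough illustration only, the sketch below shows one common form such a reward can take: exact-match scoring of a final answer. Everything in it is an assumption rather than the authors' method, including the `\boxed{...}` answer convention, the normalization rules, and the hypothetical names `extract_boxed`, `normalize`, and `rule_based_reward`.

```python
import re
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in text, or None.

    Assumption: the model is prompted to wrap its final answer in \\boxed{}.
    Nested braces are not handled in this sketch.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


def normalize(answer: str) -> str:
    """Lowercase the answer and drop surrounding whitespace and a trailing period."""
    return answer.strip().rstrip(".").strip().lower()


def rule_based_reward(response: str, reference: str) -> float:
    """Hypothetical binary reward: 1.0 on exact match after normalization, else 0.0."""
    predicted = extract_boxed(response)
    if predicted is None:
        # No parsable final answer counts as incorrect.
        return 0.0
    return 1.0 if normalize(predicted) == normalize(reference) else 0.0
```

A reward like this is cheap to compute during RL training, but it can disagree with an LLM judge on answers that are equivalent yet formatted differently (for example `0.5` versus `1/2`), which is presumably the kind of gap the README's alignment claim refers to.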