Visual Question Answering
Transformers
Safetensors
English
qwen2_5_vl
image-text-to-text
multimodal
text-generation-inference
Instructions to use infly/INFRL-Qwen2.5-VL-72B-Preview with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use infly/INFRL-Qwen2.5-VL-72B-Preview with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("visual-question-answering", model="infly/INFRL-Qwen2.5-VL-72B-Preview")# Load model directly from transformers import AutoProcessor, AutoModelForImageTextToText processor = AutoProcessor.from_pretrained("infly/INFRL-Qwen2.5-VL-72B-Preview") model = AutoModelForImageTextToText.from_pretrained("infly/INFRL-Qwen2.5-VL-72B-Preview") - Notebooks
- Google Colab
- Kaggle
INFRL-Qwen2.5-VL-72B-Preview
Model Overview
INFRL-Qwen2.5-VL-72B-Preview improves visual reasoning upon Qwen2.5-VL-72B-Instruct model.
As of March 25th, 2025, INFRL-Qwen2.5-VL-72B-Preview is the best-performing open-sourced VL model on various visual reasoning benchmarks (MathVision,MathVista, MathVerse).
Evaluation
| Models | MathVision (test) | MathVista (testmini) | MathVerse (testmini) |
|---|---|---|---|
| GPT4o | 30.6 | 60 | 41.2 |
| Gemini-2.0-Flash | 41.3 | 70.1 | 50.6 |
| Claude 3.5 Sonnet | 33.5 | 67.7 | 47.8 |
| QvQ-72B | 35.9 | 71.4 | 48.6 |
| InternVL2.5-78B | 34.9 | 72.3 | 51.7 |
| Qwen-VL-2.5-72B | 38.1 | 74.8 | 57.18 |
| INFRL-VL-Preview | 41.9 | 77.8 | 58.84 |
We will release a code repository for VLM evaluation. It supports RL training with simple rule-based rewards, meanwhile aligning with LLM-Judge results.
Stay tuned!
Contributors
Supervisors
Wei Chu • Yuan Qi
VL Team
Haozhe Wang • Zuming Huang
RL Team
Haozhe Wang • Chao Qu • Long Li
Thanks
Thanks to Jiaran Hao, Liuyihan Song for supports in the RL infrastructure.
Citation
If you find our model useful, please consider citing:
@misc {INFRL_VL_Preview,
author = { {Wang, Haozhe and Huang, Zuming and Qu, Chao and Chu, Wei and Qi, Yuan} },
title = { INFRL-Qwen2.5-VL-72B-Preview },
year = 2025,
url = { https://huggingface.co/infly/INFRL-Qwen2.5-VL-72B-Preview},
publisher = { Hugging Face }
}
- Downloads last month
- 12
Model tree for infly/INFRL-Qwen2.5-VL-72B-Preview
Base model
Qwen/Qwen2.5-VL-72B-Instruct