| base_model: | |
| - Qwen/Qwen2.5-VL-7B-Instruct | |
| datasets: | |
| - TIGER-Lab/PixelReasoner-SFT-Data | |
| language: | |
| - en | |
| library_name: transformers | |
| license: apache-2.0 | |
| metrics: | |
| - accuracy | |
| pipeline_tag: image-text-to-text | |
| The model is trained with curiosity-driven RL described in [paper](https://arxiv.org/abs/2505.15966). | |
| We have released vllm based inference code at https://github.com/TIGER-AI-Lab/Pixel-Reasoner/. | |
| Project page: https://tiger-ai-lab.github.io/Pixel-Reasoner/ | |
| Github repository: https://github.com/TIGER-AI-Lab/Pixel-Reasoner/ | |
| We will release a simple hf.generate() based inference code. | |
| Please also play with the cool [interactive demo](https://huggingface.co/spaces/TIGER-Lab/Pixel-Reasoner) |