nielsr's picture
nielsr HF Staff
Add metadata and link to paper
1e0b23d verified
|
raw
history blame
4.09 kB
metadata
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model: Qwen/Qwen2.5-VL-7B-Instruct
tags:
  - visual-reasoning
  - tool-use
  - iterative-reasoning
  - grpo
Logo

Dynamic Tool Orchestration for Iterative Visual Reasoning

Paper Docs Data & Model Homepage Demo Video

This repository contains AdaReasoner-TC-7B-Randomized, a variant of the model presented in AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning.

πŸ”” Important Note on Model Status

The models released on this page belong to the AdaReasoner-TC series and are not the final RL-fine-tuned models. They are trained using Tool Cold Start (TC) supervised fine-tuning only, and are intended for analysis, ablation, and reproducibility purposes.

For RL fine-tuned version, please refer to Data & models

πŸ“‹ Model Description

AdaReasoner-7B is a vision-language model trained with dynamic tool orchestration capabilities for iterative visual reasoning.

AdaReasoner-TC series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.

This specific variant, AdaReasoner-TC-7B-Randomized, is trained with the adaptive learning method, enabling strong generalization to unseen tools and tasks. It is designed for open-ended and evolving tool environments where adaptability is required.

Key Differences between TC variants:

  • Randomized: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations.
  • Non-Randomized: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization.

πŸ“Š Performance

Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks. AdaReasoner improves the 7B base model by +24.9% on average and surpasses strong proprietary systems such as GPT-5 on multiple tasks, including VSP and Jigsaw.

πŸ“š Citation

If you use this model in your research, please cite:

@article{song2026adareasoner,
  title={AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning},
  author={Song, Mingyang and Sun, Haoyu and Gu, Jiawei and Li, Linjie and Xu, Luxin and Krishna, Ranjay and Cheng, Yu},
  journal={arXiv preprint arXiv:2601.18631},
  year={2026}
}

πŸ“„ License

Apache 2.0

🀝 Acknowledgments

This model is part of the AdaReasoner project. For more information, visit our GitHub repository.

πŸ“§ Contact

For questions and feedback, please open an issue in our GitHub repository.