nielsr HF Staff

Add metadata and link to paper

1e0b23d verified 2 months ago

4.09 kB

license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model: Qwen/Qwen2.5-VL-7B-Instruct
tags:
  - visual-reasoning
  - tool-use
  - iterative-reasoning
  - grpo

Dynamic Tool Orchestration for Iterative Visual Reasoning

This repository contains AdaReasoner-TC-7B-Randomized, a variant of the model presented in AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning.

🔔 Important Note on Model Status

The models released on this page belong to the AdaReasoner-TC series and are not the final RL-fine-tuned models. They are trained using Tool Cold Start (TC) supervised fine-tuning only, and are intended for analysis, ablation, and reproducibility purposes.

For RL fine-tuned version, please refer to Data & models

📋 Model Description

AdaReasoner-7B is a vision-language model trained with dynamic tool orchestration capabilities for iterative visual reasoning.

AdaReasoner-TC series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.

This specific variant, AdaReasoner-TC-7B-Randomized, is trained with the adaptive learning method, enabling strong generalization to unseen tools and tasks. It is designed for open-ended and evolving tool environments where adaptability is required.

Key Differences between TC variants:

Randomized: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations.
Non-Randomized: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization.

📊 Performance

Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks. AdaReasoner improves the 7B base model by +24.9% on average and surpasses strong proprietary systems such as GPT-5 on multiple tasks, including VSP and Jigsaw.

📚 Citation

If you use this model in your research, please cite:

@article{song2026adareasoner,
  title={AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning},
  author={Song, Mingyang and Sun, Haoyu and Gu, Jiawei and Li, Linjie and Xu, Luxin and Krishna, Ranjay and Cheng, Yu},
  journal={arXiv preprint arXiv:2601.18631},
  year={2026}
}

📄 License

Apache 2.0

🤝 Acknowledgments

This model is part of the AdaReasoner project. For more information, visit our GitHub repository.

📧 Contact

For questions and feedback, please open an issue in our GitHub repository.