AdaReasoner
/

AdaReasoner-TC-7B-Randomized

Safetensors

qwen2_5_vl

Model card Files Files and versions

xet

Community

Add metadata and link to paper

by nielsr HF Staff - opened Jan 29

base: refs/heads/main

←

from: refs/pr/1

Discussion Files changed

+21

-19

Files changed (1) hide show

README.md +21 -19

README.md CHANGED Viewed

@@ -1,8 +1,20 @@
 <div align="center">
-  <img src="docs/logo.png" alt="Logo" width="300">
   <h1 align="center">Dynamic Tool Orchestration for Iterative Visual Reasoning</h1>
-  <a href="#">
     <img src="https://img.shields.io/badge/Paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" alt="Paper">
   </a>
   <a href="https://github.com/ssmisya/AdaReasoner/tree/main/docs">
@@ -24,6 +36,7 @@
 </div>
 ## 🔔 Important Note on Model Status
@@ -38,26 +51,15 @@ For RL fine-tuned version, please refer to [Data & models](https://github.com/ss
 **AdaReasoner-TC** series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.
-We provide three variants of AdaReasoner-TC-7B, each optimized for different use cases:
-| Model | Description | Hugging Face |
-|------|-------------|--------------|
-| **AdaReasoner-TC-7B-Randomized** | Trained with the *adaptive learning* method, enabling strong generalization to **unseen tools and tasks**. Designed for open-ended and evolving tool environments where adaptability is required. | [🤗 Link](https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Randomized) |
-| **AdaReasoner-TC-7B-Non-Randomized** | Trained **without adaptive learning**, providing **more stable and reliable performance on known tools and tasks**, but limited generalization to unseen tools or task settings. | [🤗 Link](https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Non-Randomized) |
-**Key Differences:**
-- **Randomized**: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations
-- **Non-Randomized**: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization
 ## 📊 Performance
-Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks.
 ## 📚 Citation
@@ -82,4 +84,4 @@ This model is part of the AdaReasoner project. For more information, visit our [
 ## 📧 Contact
-For questions and feedback, please open an issue in our [GitHub repository](https://github.com/ssmisya/AdaReasoner).

+---
+license: apache-2.0
+library_name: transformers
+pipeline_tag: image-text-to-text
+base_model: Qwen/Qwen2.5-VL-7B-Instruct
+tags:
+- visual-reasoning
+- tool-use
+- iterative-reasoning
+- grpo
+---
 <div align="center">
+  <img src="https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Randomized/resolve/main/docs/logo.png" alt="Logo" width="300">
   <h1 align="center">Dynamic Tool Orchestration for Iterative Visual Reasoning</h1>
+  <a href="https://huggingface.co/papers/2601.18631">
     <img src="https://img.shields.io/badge/Paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" alt="Paper">
   </a>
   <a href="https://github.com/ssmisya/AdaReasoner/tree/main/docs">
 </div>
+This repository contains **AdaReasoner-TC-7B-Randomized**, a variant of the model presented in [AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning](https://huggingface.co/papers/2601.18631).
 ## 🔔 Important Note on Model Status
 **AdaReasoner-TC** series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.
+This specific variant, **AdaReasoner-TC-7B-Randomized**, is trained with the *adaptive learning* method, enabling strong generalization to **unseen tools and tasks**. It is designed for open-ended and evolving tool environments where adaptability is required.
+**Key Differences between TC variants:**
+- **Randomized**: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations.
+- **Non-Randomized**: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization.
 ## 📊 Performance
+Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks. AdaReasoner improves the 7B base model by +24.9% on average and surpasses strong proprietary systems such as GPT-5 on multiple tasks, including VSP and Jigsaw.
 ## 📚 Citation
 ## 📧 Contact
+For questions and feedback, please open an issue in our [GitHub repository](https://github.com/ssmisya/AdaReasoner).