Add metadata and link to paper

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +21 -19
README.md CHANGED
@@ -1,8 +1,20 @@
  <div align="center">
- <img src="docs/logo.png" alt="Logo" width="300">
  <h1 align="center">Dynamic Tool Orchestration for Iterative Visual Reasoning</h1>

- <a href="#">
  <img src="https://img.shields.io/badge/Paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" alt="Paper">
  </a>
  <a href="https://github.com/ssmisya/AdaReasoner/tree/main/docs">
@@ -24,6 +36,7 @@

  </div>


  ## 🔔 Important Note on Model Status

@@ -38,26 +51,15 @@ For RL fine-tuned version, please refer to [Data & models](https://github.com/ss

  **AdaReasoner-TC** series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.

- We provide three variants of AdaReasoner-TC-7B, each optimized for different use cases:
-
- | Model | Description | Hugging Face |
- |------|-------------|--------------|
- | **AdaReasoner-TC-7B-Randomized** | Trained with the *adaptive learning* method, enabling strong generalization to **unseen tools and tasks**. Designed for open-ended and evolving tool environments where adaptability is required. | [🤗 Link](https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Randomized) |
- | **AdaReasoner-TC-7B-Non-Randomized** | Trained **without adaptive learning**, providing **more stable and reliable performance on known tools and tasks**, but limited generalization to unseen tools or task settings. | [🤗 Link](https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Non-Randomized) |
-
-
-
-
- **Key Differences:**
- - **Randomized**: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations
- - **Non-Randomized**: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization
-


  ## 📊 Performance

- Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks.
-

  ## 📚 Citation

@@ -82,4 +84,4 @@ This model is part of the AdaReasoner project. For more information, visit our [

  ## 📧 Contact

- For questions and feedback, please open an issue in our [GitHub repository](https://github.com/ssmisya/AdaReasoner).
 
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: image-text-to-text
+ base_model: Qwen/Qwen2.5-VL-7B-Instruct
+ tags:
+ - visual-reasoning
+ - tool-use
+ - iterative-reasoning
+ - grpo
+ ---
+
  <div align="center">
+ <img src="https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Randomized/resolve/main/docs/logo.png" alt="Logo" width="300">
  <h1 align="center">Dynamic Tool Orchestration for Iterative Visual Reasoning</h1>

+ <a href="https://huggingface.co/papers/2601.18631">
  <img src="https://img.shields.io/badge/Paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" alt="Paper">
  </a>
  <a href="https://github.com/ssmisya/AdaReasoner/tree/main/docs">
 
  </div>

+ This repository contains **AdaReasoner-TC-7B-Randomized**, a variant of the model presented in [AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning](https://huggingface.co/papers/2601.18631).

  ## 🔔 Important Note on Model Status
 
 
  **AdaReasoner-TC** series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.

+ This specific variant, **AdaReasoner-TC-7B-Randomized**, is trained with the *adaptive learning* method, enabling strong generalization to **unseen tools and tasks**. It is designed for open-ended and evolving tool environments where adaptability is required.

+ **Key Differences between TC variants:**
+ - **Randomized**: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations.
+ - **Non-Randomized**: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization.

  ## 📊 Performance

+ Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks. AdaReasoner improves the 7B base model by +24.9% on average and surpasses strong proprietary systems such as GPT-5 on multiple tasks, including VSP and Jigsaw.

  ## 📚 Citation
 
  ## 📧 Contact

+ For questions and feedback, please open an issue in our [GitHub repository](https://github.com/ssmisya/AdaReasoner).