nielsr HF Staff committed on
Commit 1e0b23d · verified · 1 Parent(s): b8f0e5c

Add metadata and link to paper

Hi! I'm Niels from the community science team at Hugging Face.

This PR adds YAML metadata to the model card, including:
- `pipeline_tag: image-text-to-text`: ensures the model is correctly categorized on the Hub.
- `library_name: transformers`: enables the "Use in Transformers" button, since the model uses the Qwen2.5-VL architecture.
- `license: apache-2.0`: based on the information in the README.
- `base_model`: links the model to `Qwen/Qwen2.5-VL-7B-Instruct`, the Qwen2.5-VL model it was fine-tuned from.
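
The metadata fields listed above can be sanity-checked before merging. The following is a minimal sketch, not part of this PR: it assumes the front matter is flat (scalar values and simple `- item` lists only), and the helper names `parse_front_matter`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical. The sample text mirrors the YAML block added in the diff.

```python
# Hypothetical helper: checks that a model card's YAML front matter
# contains the fields this PR adds. Handles only flat keys and simple lists.

README_HEAD = """---
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model: Qwen/Qwen2.5-VL-7B-Instruct
tags:
- visual-reasoning
- tool-use
- iterative-reasoning
- grpo
---

<div align="center">
"""

# Assumption: these are the keys we want to require; the Hub itself does not
# mandate this exact set.
REQUIRED_KEYS = ("license", "library_name", "pipeline_tag", "base_model")


def parse_front_matter(text):
    """Return the front matter as a dict, or {} if none is found."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_list_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return meta  # closing delimiter reached
        if not stripped:
            continue
        if stripped.startswith("- ") and current_list_key is not None:
            meta[current_list_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue  # not a "key: value" line; ignore it
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            current_list_key = None
        else:
            meta[key] = []  # a list is expected on the following lines
            current_list_key = key
    return {}  # no closing delimiter: treat as missing front matter


def missing_keys(meta):
    """List the required keys that are absent from the parsed metadata."""
    return [key for key in REQUIRED_KEYS if key not in meta]


meta = parse_front_matter(README_HEAD)
print(missing_keys(meta))  # prints []
```

For real model cards with nested metadata, a full YAML parser would be a better fit than this hand-rolled one.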

I've also updated the paper link to point to the Hugging Face paper page, which allows users to easily find the associated paper and discussions.

Files changed (1)
  1. README.md +21 -19
README.md CHANGED
@@ -1,8 +1,20 @@
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: image-text-to-text
+ base_model: Qwen/Qwen2.5-VL-7B-Instruct
+ tags:
+ - visual-reasoning
+ - tool-use
+ - iterative-reasoning
+ - grpo
+ ---
+
  <div align="center">
- <img src="docs/logo.png" alt="Logo" width="300">
+ <img src="https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Randomized/resolve/main/docs/logo.png" alt="Logo" width="300">
  <h1 align="center">Dynamic Tool Orchestration for Iterative Visual Reasoning</h1>

- <a href="#">
+ <a href="https://huggingface.co/papers/2601.18631">
  <img src="https://img.shields.io/badge/Paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" alt="Paper">
  </a>
  <a href="https://github.com/ssmisya/AdaReasoner/tree/main/docs">
@@ -24,6 +36,7 @@

  </div>

+ This repository contains **AdaReasoner-TC-7B-Randomized**, a variant of the model presented in [AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning](https://huggingface.co/papers/2601.18631).

  ## 🔔 Important Note on Model Status

@@ -38,26 +51,15 @@ For RL fine-tuned version, please refer to [Data & models](https://github.com/ss

  **AdaReasoner-TC** series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.

- We provide three variants of AdaReasoner-TC-7B, each optimized for different use cases:
-
- | Model | Description | Hugging Face |
- |------|-------------|--------------|
- | **AdaReasoner-TC-7B-Randomized** | Trained with the *adaptive learning* method, enabling strong generalization to **unseen tools and tasks**. Designed for open-ended and evolving tool environments where adaptability is required. | [🤗 Link](https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Randomized) |
- | **AdaReasoner-TC-7B-Non-Randomized** | Trained **without adaptive learning**, providing **more stable and reliable performance on known tools and tasks**, but limited generalization to unseen tools or task settings. | [🤗 Link](https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Non-Randomized) |
-
-
-
-
- **Key Differences:**
- - **Randomized**: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations
- - **Non-Randomized**: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization
-
+ This specific variant, **AdaReasoner-TC-7B-Randomized**, is trained with the *adaptive learning* method, enabling strong generalization to **unseen tools and tasks**. It is designed for open-ended and evolving tool environments where adaptability is required.

+ **Key Differences between TC variants:**
+ - **Randomized**: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations.
+ - **Non-Randomized**: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization.

  ## 📊 Performance

- Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks.
-
+ Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks. AdaReasoner improves the 7B base model by +24.9% on average and surpasses strong proprietary systems such as GPT-5 on multiple tasks, including VSP and Jigsaw.

  ## 📚 Citation

@@ -82,4 +84,4 @@ This model is part of the AdaReasoner project. For more information, visit our [

  ## 📧 Contact

- For questions and feedback, please open an issue in our [GitHub repository](https://github.com/ssmisya/AdaReasoner).
+ For questions and feedback, please open an issue in our [GitHub repository](https://github.com/ssmisya/AdaReasoner).