Add metadata and link to paper

Hi! I'm Niels from the community science team at Hugging Face.

This PR adds YAML metadata to the model card, including:
- `pipeline_tag: image-text-to-text`: To ensure the model is correctly categorized on the Hub.
- `library_name: transformers`: To enable the "Use in Transformers" button, as the model uses the Qwen2.5-VL architecture.
- `license: apache-2.0`: Based on the information in the README.
- `base_model`: To link this model to the Qwen2.5-VL base model it was fine-tuned from.
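
For reviewers who want to sanity-check the new front matter locally, here is a minimal sketch using only the Python standard library. It is not how the Hub itself parses model cards: `parse_front_matter` and `REQUIRED_KEYS` are hypothetical names introduced for illustration, and the parser assumes a flat block of `key: value` pairs and simple `- item` lists, which is all this PR adds. A real YAML parser would be needed for anything more complex.

```python
# Hypothetical helper: a simplified front-matter reader, NOT a full YAML parser.
# It only understands flat "key: value" lines and "- item" list entries.
def parse_front_matter(readme_text: str) -> dict:
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    metadata = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return metadata  # closing delimiter reached
        if not stripped:
            continue
        if stripped.startswith("- "):
            # List item: attach it to the most recent key that opened a list.
            if isinstance(metadata.get(current_key), list):
                metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            metadata[current_key] = value if value else []
    return {}  # front matter was never closed


# The keys this PR description says it adds (assumed to be the ones worth checking).
REQUIRED_KEYS = ("license", "library_name", "pipeline_tag", "base_model")

# Front matter copied from the diff in this PR, followed by the first README line.
README = """---
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model: Qwen/Qwen2.5-VL-7B-Instruct
tags:
- visual-reasoning
- tool-use
- iterative-reasoning
- grpo
---
<div align="center">
"""

metadata = parse_front_matter(README)
missing = [key for key in REQUIRED_KEYS if key not in metadata]
print("metadata:", metadata)
print("missing keys:", missing)
```

Running the check against the updated `README.md` should report no missing keys. If any are listed, the corresponding Hub feature described above (categorization, the "Use in Transformers" button, or the base-model link) would likely not be enabled.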

I've also updated the paper link to point to the Hugging Face paper page, which allows users to easily find the associated paper and discussions.

Files changed (1)

README.md +21 -19

@@ -1,8 +1,20 @@
 <div align="center">
-  <img src="docs/logo.png" alt="Logo" width="300">
   <h1 align="center">Dynamic Tool Orchestration for Iterative Visual Reasoning</h1>
-  <a href="#">
     <img src="https://img.shields.io/badge/Paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" alt="Paper">
   </a>
   <a href="https://github.com/ssmisya/AdaReasoner/tree/main/docs">
@@ -24,6 +36,7 @@
 </div>
 ## 🔔 Important Note on Model Status
@@ -38,26 +51,15 @@ For RL fine-tuned version, please refer to [Data & models](https://github.com/ss
 **AdaReasoner-TC** series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.
-We provide three variants of AdaReasoner-TC-7B, each optimized for different use cases:
-| Model | Description | Hugging Face |
-|------|-------------|--------------|
-| **AdaReasoner-TC-7B-Randomized** | Trained with the *adaptive learning* method, enabling strong generalization to **unseen tools and tasks**. Designed for open-ended and evolving tool environments where adaptability is required. | [🤗 Link](https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Randomized) |
-| **AdaReasoner-TC-7B-Non-Randomized** | Trained **without adaptive learning**, providing **more stable and reliable performance on known tools and tasks**, but limited generalization to unseen tools or task settings. | [🤗 Link](https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Non-Randomized) |
-**Key Differences:**
-- **Randomized**: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations
-- **Non-Randomized**: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization
 ## 📊 Performance
-Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks.
 ## 📚 Citation
@@ -82,4 +84,4 @@ This model is part of the AdaReasoner project. For more information, visit our [
 ## 📧 Contact
-For questions and feedback, please open an issue in our [GitHub repository](https://github.com/ssmisya/AdaReasoner).

+---
+license: apache-2.0
+library_name: transformers
+pipeline_tag: image-text-to-text
+base_model: Qwen/Qwen2.5-VL-7B-Instruct
+tags:
+- visual-reasoning
+- tool-use
+- iterative-reasoning
+- grpo
+---
 <div align="center">
+  <img src="https://huggingface.co/AdaReasoner/AdaReasoner-TC-7B-Randomized/resolve/main/docs/logo.png" alt="Logo" width="300">
   <h1 align="center">Dynamic Tool Orchestration for Iterative Visual Reasoning</h1>
+  <a href="https://huggingface.co/papers/2601.18631">
     <img src="https://img.shields.io/badge/Paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" alt="Paper">
   </a>
   <a href="https://github.com/ssmisya/AdaReasoner/tree/main/docs">
 </div>
+This repository contains **AdaReasoner-TC-7B-Randomized**, a variant of the model presented in [AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning](https://huggingface.co/papers/2601.18631).
 ## 🔔 Important Note on Model Status
 **AdaReasoner-TC** series are trained through TC (Tool Cold Start) supervised fine-tuning only, without subsequent RL fine-tuning.
+This specific variant, **AdaReasoner-TC-7B-Randomized**, is trained with the *adaptive learning* method, enabling strong generalization to **unseen tools and tasks**. It is designed for open-ended and evolving tool environments where adaptability is required.
+**Key Differences between TC variants:**
+- **Randomized**: Trained with adaptive learning method, enabling zero-shot generalization to novel tools and task configurations.
+- **Non-Randomized**: Trained without adaptive learning, offering more predictable behavior on familiar tools but lacking generalization.
 ## 📊 Performance
+Please refer to our paper for detailed benchmark results across multiple visual reasoning tasks. AdaReasoner improves the 7B base model by +24.9% on average and surpasses strong proprietary systems such as GPT-5 on multiple tasks, including VSP and Jigsaw.
 ## 📚 Citation
 ## 📧 Contact
+For questions and feedback, please open an issue in our [GitHub repository](https://github.com/ssmisya/AdaReasoner).