Improve model card: Add pipeline tag, library name, and refine content
This PR enhances the model card for **Reason-RFT** by:
- Adding `pipeline_tag: image-text-to-text` to improve discoverability on the Hugging Face Hub, since the model performs visual reasoning over image-text inputs.
- Adding `library_name: transformers`, as evidenced by the model's configuration files (e.g., `config.json` referencing `Qwen2VLForConditionalGeneration`), which enables the automated "How to use" widget.
- Updating the introductory text to clearly state the model's purpose and link to the original paper.
- Refining the "Usage" section to point users to the GitHub repository for detailed instructions and examples.
README.md (CHANGED)

````diff
@@ -1,13 +1,15 @@
 ---
-license: apache-2.0
-language:
-- en
+base_model:
+- Qwen/Qwen2-VL-2B-Instruct
 datasets:
 - tanhuajie2001/Reason-RFT-CoT-Dataset
+language:
+- en
+license: apache-2.0
 metrics:
 - accuracy
-base_model:
-- Qwen/Qwen2-VL-2B-Instruct
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
 
 <div align="center">
@@ -15,7 +17,7 @@ base_model:
 </div>
 
 # 🤗 Reason-RFT CoT Dateset
-*
+*This repository hosts the model checkpoints for the project "Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models", as presented in the paper [Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models](https://arxiv.org/abs/2503.20752).*
 
 
 <p align="center">
@@ -63,7 +65,7 @@ Experimental results demonstrate Reasoning-RFT's three key advantages: **(1) Per
 
 ## ⭐️ Usage
 
-
+For detailed instructions on how to use the models, including inference code and setup, please refer to the [Reason-RFT GitHub repository](https://github.com/tanhuajie/Reason-RFT#--usage).
 
 ## 📑 Citation
 If you find this project useful, welcome to cite us.
@@ -74,4 +76,18 @@ If you find this project useful, welcome to cite us.
 journal={arXiv preprint arXiv:2503.20752},
 year={2025}
 }
+
+@article{team2025robobrain,
+  title={Robobrain 2.0 technical report},
+  author={Team, BAAI RoboBrain and Cao, Mingyu and Tan, Huajie and Ji, Yuheng and Lin, Minglan and Li, Zhiyu and Cao, Zhou and Wang, Pengwei and Zhou, Enshen and Han, Yi and others},
+  journal={arXiv preprint arXiv:2507.02029},
+  year={2025}
+}
+
+@article{ji2025robobrain,
+  title={RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete},
+  author={Ji, Yuheng and Tan, Huajie and Shi, Jiayu and Hao, Xiaoshuai and Zhang, Yuan and Zhang, Hengyuan and Wang, Pengwei and Zhao, Mengdi and Mu, Yao and An, Pengju and others},
+  journal={arXiv preprint arXiv:2502.21257},
+  year={2025}
+}
 ```
````