Improve model card: Add metadata (pipeline_tag, library_name, license) and update paper link
This PR enhances the model card for the `bytedance-research/DynamicCoT` model by:
- Adding `pipeline_tag: image-to-text` to improve discoverability for multimodal tasks on the Hub.
- Adding `library_name: transformers` to enable the automated "How to use" widget, as the model is compatible with the 🤗 Transformers library, evidenced by installation instructions and model configuration.
- Adding `license: other` to reflect the specific "Qwen RESEARCH LICENSE AGREEMENT" mentioned in the model card content, which is not a standard SPDX license.
- Updating the existing paper badge to link directly to the Hugging Face paper page: [Boosting Multi-modal Keyphrase Prediction with Dynamic Chain-of-Thought in Vision-Language Models](https://huggingface.co/papers/2510.09358).
Please review and merge if these improvements are in line with the model's details.
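As a quick illustration, the three metadata keys this PR adds are flat `key: value` pairs in the model card's YAML front matter, so they can be sanity-checked mechanically. A minimal Python sketch (the parser below is a deliberate simplification that only handles flat `key: value` lines, not general YAML):

```python
# Front matter proposed in this PR, copied verbatim.
FRONT_MATTER = """\
pipeline_tag: image-to-text
library_name: transformers
license: other
"""

# Toy parse: split each "key: value" line once on ": ".
# (Real model cards should be parsed with a proper YAML library.)
metadata = dict(
    line.split(": ", 1) for line in FRONT_MATTER.strip().splitlines()
)

# The Hub reads these keys to place the model under the right task
# filter, enable the library code widget, and surface the license.
assert metadata["pipeline_tag"] == "image-to-text"
assert metadata["library_name"] == "transformers"
assert metadata["license"] == "other"
print(metadata)
```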
````diff
@@ -1,8 +1,14 @@
+---
+pipeline_tag: image-to-text
+library_name: transformers
+license: other
+---
+
 # Boosting Multi-modal Keyphrase Prediction with Dynamic Chain-of-Thought in Vision-Language Models
 
 <div align="center">
 <p align="center">
-<a>
+<a href="https://huggingface.co/papers/2510.09358">
 <img
 src="https://img.shields.io/badge/ArXiv-Paper-red?logo=arxiv&logoColor=red"
 alt="Paper"
@@ -54,7 +60,7 @@ bash eval_full_sft.sh {/path/to/model} {/path/to/source_txt} --template {templat
 ```
 
 ## 🧾 License
-DynamicCoT are derived from [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct), which is subject to [Qwen RESEARCH LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct/blob/main/LICENSE). We retain ownership of all intellectual property rights in and to any derivative works and modifications that we made.
+DynamicCoT are derived from [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct/blob/main/LICENSE), which is subject to [Qwen RESEARCH LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct/blob/main/LICENSE). We retain ownership of all intellectual property rights in and to any derivative works and modifications that we made.
 
 
 ## 🙏 Acknowledgement
````