Update pipeline_tag and add library_name for R-4B
This PR improves the model card for R-4B by:
- Updating the `pipeline_tag` from `visual-question-answering` to `image-text-to-text`. This change better reflects the model's capabilities as a multimodal large language model that takes both image and text inputs to generate text, improving its discoverability on the Hugging Face Hub (e.g., at https://huggingface.co/models?pipeline_tag=image-text-to-text).
- Adding `library_name: transformers`. This is supported by the "Quickstart" section, which demonstrates usage with the `transformers` library, and it enables the automated "how to use" code snippet widget on the model page (an illustrative sketch of such usage follows below).
The existing links to the Arxiv paper and GitHub repository, as well as the sample usage code, are preserved as they are already well-documented.
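
For context on the `library_name` addition, here is a minimal, illustrative sketch of loading R-4B through `transformers` for image-text-to-text generation. It is not copied from the model's Quickstart; the use of `AutoModel`/`AutoProcessor` with `trust_remote_code=True`, the sample image URL, and the generation settings are assumptions for illustration only.

```python
# Hypothetical sketch: run R-4B as an image-text-to-text model via transformers.
# AutoModel/AutoProcessor with trust_remote_code=True and the sample image URL are
# assumptions; refer to the model card's Quickstart for the exact supported API.
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_id = "YannQi/R-4B"
model = AutoModel.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
).to("cuda")
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Pair one image with a text question in a chat-style message.
image_url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(image_url, stream=True).raw)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# Tokenize, generate, and decode only the newly generated tokens.
inputs = processor(images=image, text=prompt, return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=128)
answer = processor.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(answer)
```

With `library_name: transformers` set in the metadata, the Hub can surface a snippet along these lines automatically in the model page's "Use this model" widget.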
```diff
@@ -1,11 +1,13 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen3-4B
-pipeline_tag: visual-question-answering
+language:
+- en
+license: apache-2.0
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
+
 # R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
 
 [[📄 Arxiv Paper](https://arxiv.org/pdf/2508.21113)] [[🤗 Hugging Face](https://huggingface.co/YannQi/R-4B)] [[🤖️ ModelScope](https://huggingface.co/YannQi/R-4B)] [[💻 Code](https://github.com/yannqi/R-4B)]
@@ -225,4 +227,4 @@ print("Chat response:", chat_response)
 
 ## Acknowledgements
 
-R-4B is developed based on the codebases of the following projects: [LLaVA-Next](https://github.com/LLaVA-VL/LLaVA-NeXT), [SigLIP2](https://huggingface.co/google/siglip2-so400m-patch14-384), [Qwen3](https://github.com/QwenLM/Qwen3), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). We sincerely thank these projects for their outstanding work.
+R-4B is developed based on the codebases of the following projects: [LLaVA-Next](https://github.com/LLaVA-VL/LLaVA-NeXT), [SigLIP2](https://huggingface.co/google/siglip2-so400m-patch14-384), [Qwen3](https://github.com/QwenLM/Qwen3), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). We sincerely thank these projects for their outstanding work.
```