Update README.md
<a href="https://arxiv.org/abs/2509.22647">Paper</a> | <a href="https://github.com/InternLM/CapRL">Github</a> | 🤗 <a href="https://huggingface.co/internlm/CapRL-3B">CapRL-3B Model</a> | 🤗 <a href="https://huggingface.co/yuhangzang/CapRL-InternVL3.5-8B">CapRL-InternVL3.5-8B Model</a>

🤗 <a href="https://huggingface.co/datasets/internlm/CapRL-2M">CapRL-2M Dataset</a>

🤗 <a href="https://huggingface.co/collections/long-xing1/caprl-68d64ac32ded31596c36e189">CapRL Collection</a> | 🤗 <a href="https://huggingface.co/papers/2509.22647">Daily Paper</a> | 🤗 <a href="https://huggingface.co/mradermacher/CapRL-3B-GGUF">CapRL-3B-GGUF</a> | 🤗 <a href="https://huggingface.co/mradermacher/CapRL-3B-i1-GGUF">CapRL-3B-i1-GGUF</a>
Based on the same recipe as CapRL-3B, we used InternVL3.5-8B as the policy model and obtained **CapRL-InternVL3.5-8B** through CapRL. **Its performance significantly surpasses that of Qwen2.5-VL-72B**.
We are working on even stronger base models and upgrading our training recipe; stay tuned!
CapRL-3B-GGUF is the static-quants version and CapRL-3B-i1-GGUF is the weighted/imatrix-quants version, both contributed by mradermacher. Thanks for the contribution!
## Introduction
We are excited to introduce CapRL-3B, a lightweight 3B image captioner that achieves perception capabilities comparable to Qwen2.5-VL-72B.