Video-Text-to-Text
Transformers
Safetensors
English
Chinese
qwen2_5_vl
image-text-to-text
video-understanding
multimodal
SWIM
Qwen2.5-VL
fine-grained-understanding
Eval Results (legacy)
text-generation-inference
Instructions to use BBBBCHAN/SWIM-7B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use BBBBCHAN/SWIM-7B with Transformers:
# Load model directly from transformers import AutoProcessor, AutoModelForImageTextToText processor = AutoProcessor.from_pretrained("BBBBCHAN/SWIM-7B") model = AutoModelForImageTextToText.from_pretrained("BBBBCHAN/SWIM-7B") - Notebooks
- Google Colab
- Kaggle
Update README.md
Browse files
README.md
CHANGED
|
@@ -1,8 +1,8 @@
|
|
| 1 |
---
|
| 2 |
base_model:
|
|
|
|
| 3 |
- google/siglip-so400m-patch14-384
|
| 4 |
- Qwen/Qwen2.5-7B-Instruct
|
| 5 |
-
- Qwen/Qwen2.5-VL-7B-Instruct
|
| 6 |
datasets:
|
| 7 |
- lmms-lab/LLaVA-Video-178K
|
| 8 |
- DAMO-NLP-SG/VideoRefer-700K
|
|
|
|
| 1 |
---
|
| 2 |
base_model:
|
| 3 |
+
- Qwen/Qwen2.5-VL-7B-Instruct
|
| 4 |
- google/siglip-so400m-patch14-384
|
| 5 |
- Qwen/Qwen2.5-7B-Instruct
|
|
|
|
| 6 |
datasets:
|
| 7 |
- lmms-lab/LLaVA-Video-178K
|
| 8 |
- DAMO-NLP-SG/VideoRefer-700K
|