How to use OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448 with Transformers:

    # Load model directly
    from transformers import AutoModel
    model = AutoModel.from_pretrained("OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448", trust_remote_code=True, dtype="auto")
Update config.json
update to transformers==4.40.1
- config.json +2 -2
config.json CHANGED
@@ -196,7 +196,7 @@
   "mm_vision_select_layer": -2,
   "mm_vision_tower": "umt-hd-large",
   "mm_vision_tower_lr": 2e-06,
-  "model_type": "
+  "model_type": "videochat_flash_qwen",
   "num_attention_heads": 12,
   "num_hidden_layers": 28,
   "num_key_value_heads": 2,
@@ -209,7 +209,7 @@
   "tokenizer_model_max_length": 32768,
   "tokenizer_padding_side": "right",
   "torch_dtype": "bfloat16",
-  "transformers_version": "4.
+  "transformers_version": "4.40.1",
   "use_cache": true,
   "use_mm_proj": true,
   "use_pos_skipping": false,
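To confirm that a local copy of config.json carries the two values this commit sets, a small check along these lines may help. It is a minimal sketch using only the standard library: the helper name `config_mismatches` and the `sample` excerpt are hypothetical, and the expected values are copied from the "+" lines of the diff.

```python
import json

# Values taken from the "+" lines of the diff.
EXPECTED = {
    "model_type": "videochat_flash_qwen",
    "transformers_version": "4.40.1",
}


def config_mismatches(config: dict, expected: dict = EXPECTED) -> dict:
    """Return {key: (found, wanted)} for every expected key that differs.

    Hypothetical helper for illustration; not part of the repository.
    """
    return {
        key: (config.get(key), wanted)
        for key, wanted in expected.items()
        if config.get(key) != wanted
    }


# Hypothetical excerpt of config.json after the commit,
# limited to a few keys that appear in the diff.
sample = json.loads("""
{
  "mm_vision_tower": "umt-hd-large",
  "model_type": "videochat_flash_qwen",
  "num_attention_heads": 12,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.40.1",
  "use_cache": true
}
""")

# An empty dict means both fields match the committed values.
print(config_mismatches(sample))
```

For a real checkout, `sample` could be replaced with `json.load(open("config.json"))`, assuming the file sits in the working directory.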