---
license: apache-2.0
language:
- en
pipeline_tag: image-text-to-text
tags:
- multimodal
library_name: transformers
base_model:
- Qwen/Qwen2-VL-7B
new_version: Qwen/Qwen2.5-VL-7B-Instruct
---

<!-- header start -->
<p align="center">
  <img src="https://huggingface.co/datasets/FriendliAI/documentation-images/resolve/main/model-card-assets/friendliai.png" width="100%" alt="FriendliAI Logo">
</p>
<!-- header end -->

# Qwen/Qwen2-VL-7B-Instruct

* Model creator: [Qwen](https://huggingface.co/Qwen)
* Original model: [Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)

## Differences

* Added missing `<|image_pad|>` and `<|video_pad|>` tokens to tokenizer.json
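
One way to confirm that a local copy of `tokenizer.json` contains both tokens is sketched below. It is a minimal check, not part of the original model card: the helper names `missing_tokens` and `check_file` are hypothetical, and it assumes the usual Hugging Face `tokenizers` JSON layout, with a top-level `added_tokens` list and a `model.vocab` mapping for BPE models.

```python
import json
from pathlib import Path

# Tokens this repository reports adding to tokenizer.json.
REQUIRED_TOKENS = ("<|image_pad|>", "<|video_pad|>")


def missing_tokens(tokenizer_config: dict, required=REQUIRED_TOKENS) -> list:
    """Return the required tokens that the parsed tokenizer.json does not define."""
    # Tokens registered in the top-level "added_tokens" list.
    added = {entry["content"] for entry in tokenizer_config.get("added_tokens", [])}
    # Tokens in the base vocabulary (assumed to be a token -> id mapping, as in BPE models).
    vocab = tokenizer_config.get("model", {}).get("vocab", {})
    return [tok for tok in required if tok not in added and tok not in vocab]


def check_file(path: str) -> list:
    """Load a tokenizer.json from disk (hypothetical local path) and run the check."""
    return missing_tokens(json.loads(Path(path).read_text(encoding="utf-8")))


if __name__ == "__main__":
    # In-memory stand-in for a tokenizer.json; the ids are placeholders, not the real ones.
    example = {
        "added_tokens": [
            {"id": 1, "content": "<|image_pad|>"},
            {"id": 2, "content": "<|video_pad|>"},
        ],
        "model": {"vocab": {}},
    }
    print(missing_tokens(example))
```

An empty list means both tokens are present. For a downloaded copy of this repository, `check_file("tokenizer.json")` would run the same check against the file on disk.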

## License

Refer to the license stated in the original model card.