Update pipeline tag and add library_name

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -1,23 +1,25 @@
  ---
- license: apache-2.0
  base_model: Qwen/Qwen2.5-VL-3B-Instruct
+ license: apache-2.0
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  tags:
  - vision-language
  - multimodal
  - reasoning
  - visual-grounding
  - computer-vision
- pipeline_tag: visual-question-answering
  ---
 
  # LaViT-3B: Aligning Latent Visual Thoughts for Multi-modal Reasoning
 
- <div align="center">
+ <div align="center\">
 
  **LaViT** is a vision-language model that aligns latent visual thoughts for enhanced multi-modal reasoning.
 
  [![Paper](https://img.shields.io/badge/Paper-arXiv:2601.10129-b31b1b.svg)](https://arxiv.org/abs/2601.10129)
  [![Model](https://img.shields.io/badge/🤗%20HuggingFace-Model-yellow.svg)](https://huggingface.co/Svard/LaViT-3B)
+ [![GitHub](https://img.shields.io/badge/GitHub-Code-blue?logo=github)](https://github.com/Svardfox/LaViT)
 
  </div>
 
@@ -130,4 +132,4 @@ This model is built upon [Qwen2.5-VL](https://github.com/QwenLM/Qwen3-VL) and in
 
  - **Paper**: [arXiv:2601.10129](https://arxiv.org/abs/2601.10129)
  - **Code Repository**: [GitHub](https://github.com/Svardfox/LaViT)
- - **Base Model**: [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)
+ - **Base Model**: [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)
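
For reviewers who want to sanity-check the metadata after this change, the following is a minimal sketch that parses the new front matter and confirms the two fields named in the PR title. The `parse_front_matter` helper is hypothetical and written only for this illustration; it handles flat scalars and simple lists, and it is not the parser the Hub itself uses. A real check would likely rely on a full YAML library.

```python
import re

# The front matter as it reads on the "+" side of the diff above.
FRONT_MATTER = """\
---
base_model: Qwen/Qwen2.5-VL-3B-Instruct
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- vision-language
- multimodal
- reasoning
- visual-grounding
- computer-vision
---
"""


def parse_front_matter(text):
    """Parse a flat YAML front matter block (scalars and simple lists only).

    Hypothetical helper for illustration; not a general YAML parser.
    """
    match = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if match is None:
        raise ValueError("no front matter found")

    meta = {}
    list_key = None  # key of the list currently being filled, if any
    for line in match.group(1).splitlines():
        if line.startswith("- ") and list_key is not None:
            # List item belonging to the most recent "key:" with no value.
            meta[list_key].append(line[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                list_key = None
            else:
                meta[key] = []
                list_key = key
    return meta


meta = parse_front_matter(FRONT_MATTER)
print(meta["pipeline_tag"], meta["library_name"])
```

Running the same helper against the old front matter would report `visual-question-answering` for `pipeline_tag` and raise `KeyError` for `library_name`, which is the gap this PR closes.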