This repository contains the ViLaVT-7B model, as presented in the paper "Chatting with Images for Introspective Visual Thinking". For the code, see https://github.com/AntResearchNLP/ViLaVT.


If you find our work helpful, please consider citing our paper:

```
@misc{wu2026chattingimagesintrospectivevisual,
      title={Chatting with Images for Introspective Visual Thinking}, 
      author={Junfei Wu and Jian Guan and Qiang Liu and Shu Wu and Liang Wang and Wei Wu and Tieniu Tan},
      year={2026},
      eprint={2602.11073},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2602.11073}, 
}
```