| base_model: Qwen/Qwen2-VL-7B-Instruct | |
| library_name: peft | |
| pipeline_tag: image-text-to-text | |
| ## Model Details | |
| Pretrained adapter for ABC: Achieving Better Control of Multimodal Embeddings using VLMs. | |
| ### Model Sources | |
| This adapter is trained on top of Qwen2-VL-7B-Instruct (`Qwen/Qwen2-VL-7B-Instruct`). | |
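Since the card lists `peft` as the library and `Qwen/Qwen2-VL-7B-Instruct` as the base model, attaching the adapter might look roughly like the sketch below. This is a generic PEFT loading pattern, not necessarily the authors' own pipeline. The helper name `load_abc_adapter` and its `adapter_id` argument are hypothetical, and this card does not state the adapter's repository ID, so you must supply it yourself. The repository linked under Code may provide its own loading and embedding utilities, which should take precedence.

```python
# Base model ID taken from the `base_model` field of this card.
BASE_MODEL_ID = "Qwen/Qwen2-VL-7B-Instruct"


def load_abc_adapter(adapter_id, base_model_id=BASE_MODEL_ID):
    """Load the base VLM and attach a PEFT adapter to it.

    `adapter_id` is a Hub repo ID or local path for the adapter weights;
    it is an assumption here and is not specified by this card.
    """
    # Imported lazily so the heavy dependencies are only needed when loading.
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    # Load the frozen base model, then wrap it with the adapter weights.
    base = Qwen2VLForConditionalGeneration.from_pretrained(
        base_model_id, torch_dtype="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)

    # The processor handles both image preprocessing and text tokenization.
    processor = AutoProcessor.from_pretrained(base_model_id)
    return model.eval(), processor
```

Calling `load_abc_adapter("<adapter-repo-or-local-path>")` would return the adapted model in evaluation mode together with its processor. How embeddings are then extracted from the model is not described in this card.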
| ### Paper and Website | |
| For more information, see the [paper](https://arxiv.org/abs/2503.00329) and the [project website](https://tiger-ai-lab.github.io/ABC/). | |
| Code: https://github.com/TIGER-AI-Lab/ABC | |
| ## Citation | |
| ``` | |
| @misc{schneider2025abcachievingbettercontrol, | |
| title={ABC: Achieving Better Control of Multimodal Embeddings using VLMs}, | |
| author={Benjamin Schneider and Florian Kerschbaum and Wenhu Chen}, | |
| year={2025}, | |
| eprint={2503.00329}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CV}, | |
| url={https://arxiv.org/abs/2503.00329}, | |
| } | |
| ``` |