---
title: README
emoji: π
colorFrom: pink
colorTo: green
sdk: static
pinned: false
---

# VLM2Vec & MMEB: Benchmarking multimodal embeddings and adapting state-of-the-art multimodal large language models into embedding models.

- **Website**: https://tiger-ai-lab.github.io/VLM2Vec/
- **GitHub**: https://github.com/TIGER-AI-Lab/VLM2Vec

## List of Our Papers

### Main VLM2Vec / MMEB Series

- **[VLM2Vec / MMEB](https://arxiv.org/pdf/2410.05160)** – Image embedding benchmark and models. (ICLR 2025)
- **[VLM2Vec-V2 / MMEB-V2](https://arxiv.org/pdf/2507.04590)** – Extension of our previous work to video and visual document tasks. (TMLR 2026)

### Other Related Papers from Our Team

- **[GAE-Retriever](https://arxiv.org/pdf/2506.22056)** – Benchmark and model for trajectory modeling in GUI environments. (Computer-use Agents @ ICML 2025)
- **[B3](https://arxiv.org/pdf/2505.11293)** – A novel batch-mining strategy for contrastive learning. (NeurIPS 2025)