Explore More, Learn Better: Parallel VLM Embeddings with Mutual Information Regularization
This repository contains the official implementation of PDF-VLM2Vec, an efficient training framework for Vision-Language Model (VLM) embedding models.
Installation
Our code has been tested on Python 3.10 and PyTorch 2.6.0.
Create a Conda environment:
conda create -n pdf_vlm2vec python=3.10
conda activate pdf_vlm2vec
Install dependencies:
pip install -r requirements.txt
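After installing, it can help to confirm that the active environment matches the tested versions. The following is a minimal sketch, not part of the repository; it assumes `python3` resolves to the interpreter of the activated Conda environment, and the variable names are illustrative.

```shell
# Report the Python and PyTorch versions of the active environment so they can
# be compared with the tested setup (Python 3.10, PyTorch 2.6.0).
# Assumption: python3 is the interpreter of the activated environment.
py_version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null || echo "missing")"
torch_version="$(python3 -c 'import torch; print(torch.__version__)' 2>/dev/null || echo "missing")"

echo "python: ${py_version} (tested: 3.10)"
echo "torch:  ${torch_version} (tested: 2.6.0)"
```

A value of "missing" means the interpreter or the torch package could not be found or imported in the current environment.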
Pre-trained Models
We provide our PDF-VLM2Vec models fine-tuned from Qwen2-VL. You can download them from the following links.
Data
Our training and evaluation data are from the MMEB benchmark. For more details, please refer to the original VLM2Vec repository.
- Training Data: Hugging Face Datasets
- Evaluation Data: Hugging Face Datasets
Download the datasets and place them in your preferred data directory.
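One possible way to fetch a dataset is the Hugging Face Hub command-line tool. The sketch below assumes the `huggingface-cli` command is installed (for example via `pip install -U "huggingface_hub[cli]"`); `download_dataset` is a hypothetical helper, not something this repository provides, and the dataset repository ID has to be taken from the links above.

```shell
# Hypothetical helper: download one Hugging Face dataset repository into a
# local directory. The repo ID is supplied by the caller; none is assumed here.
download_dataset() {
    if [ "$#" -ne 2 ]; then
        echo "usage: download_dataset <repo_id> <local_dir>" >&2
        return 2
    fi
    # Create the target directory, then let the Hub CLI fetch the files into it.
    mkdir -p "$2"
    huggingface-cli download "$1" --repo-type dataset --local-dir "$2"
}
```

For example, `download_dataset <training-dataset-repo-id> ./data/train` would place the training data under `./data/train`; the directory name is only an illustration.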
Training
All training scripts are located in the scripts/train/ directory.
To train the PDF-VLM2Vec-Qwen2VL-2B model, follow these steps:
- Modify the script: Open scripts/train/train_qwen2vl_2b.sh.
- Update paths: Change the data path and model saving path to your local directories.
- Run the script:
source scripts/train/train_qwen2vl_2b.sh
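For long runs you may want to keep a log and protect your interactive shell. The variant below is a sketch under assumptions: it is run from the repository root, the log file name is illustrative, and the training script does not rely on leaving variables behind in the calling shell.

```shell
# Assumption: executed from the repository root; LOG_FILE is an illustrative name.
TRAIN_SCRIPT="scripts/train/train_qwen2vl_2b.sh"
LOG_FILE="train_qwen2vl_2b.log"

if [ -f "$TRAIN_SCRIPT" ]; then
    # Source the script inside a subshell so an `exit` in it cannot close the
    # current terminal, and copy stdout and stderr to the log file.
    ( . "$TRAIN_SCRIPT" ) 2>&1 | tee "$LOG_FILE"
else
    echo "not found: $TRAIN_SCRIPT (run this from the repository root)" >&2
fi
```

Because the script is sourced in a subshell, any environment variables it sets are not visible in your shell afterwards.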
Evaluation
Evaluation scripts are available in the scripts/eval/ directory.
For example, to evaluate a trained 2B model on the MMEB benchmark:
- Modify the script: Open scripts/eval/eval_qwen2vl_2b.sh and update the MODEL_PATH variable to point to your trained model checkpoint.
- Run the evaluation:
source scripts/eval/eval_qwen2vl_2b.sh
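If you evaluate several checkpoints, editing the script by hand gets repetitive. The helper below is a hypothetical sketch, not part of the repository; it assumes the evaluation script assigns the variable on a line that starts with `MODEL_PATH=`, which you should confirm by opening the script first.

```shell
# Hypothetical helper: point the MODEL_PATH assignment in a script at a new checkpoint.
# Assumptions: the script has a line starting with "MODEL_PATH=", and the
# checkpoint path contains no "|" or "&" characters (they are special to sed here).
set_model_path() {
    script="$1"
    model_path="$2"
    tmp="${script}.tmp"
    # Rewrite the assignment into a temporary file, then copy it back so the
    # original file keeps its permissions.
    sed "s|^MODEL_PATH=.*|MODEL_PATH=\"${model_path}\"|" "$script" > "$tmp" \
        && cat "$tmp" > "$script" \
        && rm -f "$tmp"
}
```

For example, `set_model_path scripts/eval/eval_qwen2vl_2b.sh /path/to/checkpoint` updates the script in place; lines other than the MODEL_PATH assignment are left unchanged.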
Results
For a comprehensive analysis, please refer to our paper.
Citation
If you find our work useful for your research, please consider citing our paper:
@article{wang2025explore,
title={Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization},
author={Wang, Zhicheng and Ju, Chen and Chen, Xu and Xiao, Shuai and Lan, Jinsong and Zhu, Xiaoyong and Chen, Ying and Cao, Zhiguo},
journal={arXiv preprint arXiv:2511.01588},
year={2025}
}