BurgerCheng/PDF-VLM2Vec-Qwen2VL-2B

BurgerCheng committed 18 days ago

Commit 3ef4e50 · verified · 1 Parent(s): e94415b

Create README.md

Files changed (1)

README.md +87 -0

	@@ -0,0 +1,87 @@

+# Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization
+[![arXiv](https://img.shields.io/badge/arXiv-2511.01588-b31b1b.svg)](https://arxiv.org/abs/2511.01588)
+This repository contains the official implementation of **PDF-VLM2Vec**, an efficient training framework for Vision-Language Model (VLM) embedding models.
+## Table of Contents
+- [Installation](#installation)
+- [Pre-trained Models](#pre-trained-models)
+- [Data](#data)
+- [Training](#training)
+- [Evaluation](#evaluation)
+- [Results](#results)
+- [Citation](#citation)
+## Installation
+Our code has been tested on Python 3.10 and PyTorch 2.6.0.
+1.  **Create a Conda environment:**
+    ```bash
+    conda create -n pdf_vlm2vec python=3.10
+    conda activate pdf_vlm2vec
+    ```
+2.  **Install dependencies:**
+    ```bash
+    pip install -r requirements.txt
+    ```
+## Pre-trained Models
+We provide PDF-VLM2Vec models fine-tuned from Qwen2-VL at two scales, available at the following links.
+- [**PDF-VLM2Vec-Qwen2VL-2B**](https://huggingface.co/BurgerCheng/PDF-VLM2Vec-Qwen2VL-2B)
+- [**PDF-VLM2Vec-Qwen2VL-7B**](https://huggingface.co/BurgerCheng/PDF-VLM2Vec-Qwen2VL-7B)
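The checkpoints listed above can also be fetched from the command line. The following is a minimal sketch, assuming the Hugging Face CLI from `huggingface_hub` is installed. The `download_model` helper, the `HF_CLI` override, and the `./checkpoints` default are illustrative names and are not part of this repository.

```shell
# Name of the Hugging Face CLI executable (assumption: installed via huggingface_hub).
# Newer releases ship it as `hf`; override with HF_CLI=hf if needed.
HF_CLI="${HF_CLI:-huggingface-cli}"

# Hypothetical helper: download one PDF-VLM2Vec checkpoint.
#   $1 = model size, 2B or 7B (default: 2B)
#   $2 = target directory (default: ./checkpoints, an assumed location)
download_model() {
  size="${1:-2B}"
  out_dir="${2:-./checkpoints}"
  "$HF_CLI" download "BurgerCheng/PDF-VLM2Vec-Qwen2VL-$size" \
    --local-dir "$out_dir/PDF-VLM2Vec-Qwen2VL-$size"
}
```

For example, `download_model 7B /path/to/models` would place the 7B checkpoint under `/path/to/models/PDF-VLM2Vec-Qwen2VL-7B`.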
+## Data
+Our training and evaluation data are from the **MMEB** benchmark. For more details, please refer to the original [VLM2Vec repository](https://github.com/TIGER-AI-Lab/VLM2Vec).
+- **Training Data**: [TIGER-Lab/MMEB-train](https://huggingface.co/datasets/TIGER-Lab/MMEB-train)
+- **Evaluation Data**: [TIGER-Lab/MMEB-eval](https://huggingface.co/datasets/TIGER-Lab/MMEB-eval)
+Download the datasets and place them in your preferred data directory.
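One possible way to carry out the download step is sketched below. It assumes the Hugging Face CLI from `huggingface_hub` is installed. The `download_mmeb` helper, the `HF_CLI` override, and the `./data` default are hypothetical names chosen for illustration; the directory layout the training and evaluation scripts expect is not specified here, so adjust the paths to match them.

```shell
# Name of the Hugging Face CLI executable (assumption: installed via huggingface_hub).
# Newer releases ship it as `hf`; override with HF_CLI=hf if needed.
HF_CLI="${HF_CLI:-huggingface-cli}"

# Hypothetical helper: download both MMEB dataset repos into one directory.
#   $1 = data directory (default: ./data, an assumed location)
download_mmeb() {
  data_dir="${1:-./data}"
  mkdir -p "$data_dir" || return 1
  for split in train eval; do
    # Dataset repos need --repo-type dataset; model repos are the default.
    "$HF_CLI" download "TIGER-Lab/MMEB-$split" \
      --repo-type dataset \
      --local-dir "$data_dir/MMEB-$split" || return 1
  done
}
```

Calling `download_mmeb /path/to/data` would then produce `/path/to/data/MMEB-train` and `/path/to/data/MMEB-eval`, which are the locations to enter when updating the paths in the scripts.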
+## Training
+All training scripts are located in the `scripts/train/` directory.
+To train the **PDF-VLM2Vec-Qwen2VL-2B** model, follow these steps:
+1.  **Open the script:** `scripts/train/train_qwen2vl_2b.sh`.
+2.  **Update paths:** Set the data path and the model save path to your local directories.
+3.  **Run the script:**
+    ```bash
+    source scripts/train/train_qwen2vl_2b.sh
+    ```
+## Evaluation
+Evaluation scripts are available in the `scripts/eval/` directory.
+To evaluate a trained model on the MMEB benchmark (the 2B model is used as the example here):
+1.  **Modify the script:** Open `scripts/eval/eval_qwen2vl_2b.sh` and update the `MODEL_PATH` variable to point to your trained model checkpoint.
+2.  **Run the evaluation:**
+    ```bash
+    source scripts/eval/eval_qwen2vl_2b.sh
+    ```
+## Results
+For full results and a comprehensive analysis, please refer to our [paper](https://arxiv.org/abs/2511.01588).
+## Citation
+If you find our work useful for your research, please consider citing our paper:
+```bibtex
+@article{wang2025explore,
+  title={Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization},
+  author={Wang, Zhicheng and Ju, Chen and Chen, Xu and Xiao, Shuai and Lan, Jinsong and Zhu, Xiaoyong and Chen, Ying and Cao, Zhiguo},
+  journal={arXiv preprint arXiv:2511.01588},
+  year={2025}
+}
+```