# [E5-V: Universal Embeddings with Multimodal Large Language Models](https://arxiv.org/abs/2407.12580)

E5-V is fine-tuned based on lmms-lab/llama3-llava-next-8b.

## Overview

We propose a framework, called E5-V, to adapt MLLMs for achieving multimodal embeddings. E5-V effectively bridges the modality gap between different types of inputs, demonstrating strong performance in multimodal embeddings even without fine-tuning. We also propose a single-modality training approach for E5-V, where the model is trained exclusively on text pairs, demonstrating better performance than multimodal training.
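To illustrate how such universal embeddings are used, the sketch below scores a pair of embeddings by cosine similarity after L2 normalization, which is the standard way to compare embeddings across modalities. The random tensors are purely hypothetical stand-ins; in practice the vectors would come from E5-V's hidden states for an image or text input.

``` python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical embeddings standing in for E5-V outputs.
text_emb = torch.randn(1, 4096)
image_emb = text_emb + 0.1 * torch.randn(1, 4096)  # a "matching" image

# L2-normalize, then score by cosine similarity (dot product of unit vectors).
text_emb = F.normalize(text_emb, dim=-1)
image_emb = F.normalize(image_emb, dim=-1)
score = (text_emb @ image_emb.T).item()  # close to 1.0 for a matching pair
```

Because both vectors are unit-normalized, the dot product directly gives a similarity in [-1, 1], so text-text, text-image, and image-image pairs can all be ranked on the same scale.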
More details can be found at https://github.com/kongds/E5-V

## Example

``` python
import torch