update figures and README
- .gitattributes +4 -0
- README.md +9 -5
- figures/global_pool.png +3 -0
- figures/local_pool.png +3 -0
- figures/zero_shot_image.png +3 -0
- figures/zero_shot_video.png +3 -0
.gitattributes
CHANGED
@@ -37,3 +37,7 @@ figures/figure1.png filter=lfs diff=lfs merge=lfs -text
 figures/figure2.png filter=lfs diff=lfs merge=lfs -text
 figures/figure3.png filter=lfs diff=lfs merge=lfs -text
 figures/figure4.png filter=lfs diff=lfs merge=lfs -text
+figures/global_pool.png filter=lfs diff=lfs merge=lfs -text
+figures/local_pool.png filter=lfs diff=lfs merge=lfs -text
+figures/zero_shot_image.png filter=lfs diff=lfs merge=lfs -text
+figures/zero_shot_video.png filter=lfs diff=lfs merge=lfs -text
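The four added entries use the same attribute set that `git lfs track <path>` writes to `.gitattributes`. As a minimal sketch, assuming paths without spaces, the lines could be generated like this; the helper name `lfs_attribute_line` is hypothetical and not part of this repository.

```python
# Hypothetical helper, not part of the repository: builds the .gitattributes
# entry that marks one path as stored in Git LFS.
# Assumption: paths contain no spaces, so no escaping is needed.
LFS_ATTRS = "filter=lfs diff=lfs merge=lfs -text"


def lfs_attribute_line(path: str) -> str:
    """Return the .gitattributes line for a single LFS-tracked path."""
    return f"{path} {LFS_ATTRS}"


# The four figures added in this commit.
new_figures = [
    "figures/global_pool.png",
    "figures/local_pool.png",
    "figures/zero_shot_image.png",
    "figures/zero_shot_video.png",
]

lfs_lines = [lfs_attribute_line(p) for p in new_figures]

if __name__ == "__main__":
    print("\n".join(lfs_lines))
```

In practice, running `git lfs track` on each path is the usual way to produce these entries; the sketch only shows the resulting line format.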
README.md
CHANGED
@@ -26,11 +26,13 @@ Building on these insights, we introduce U-MARVEL (Universal Multimodal Retrieva
 ├── checkpoints
 │   ├── hf_models
 │   │   ├── Qwen2-VL-7B-Instruct
+│   │   ├── Qwen3-VL-4B-Instruct
 │   ├── U-MARVEL-Qwen2VL-7B-Instruct
+│   ├── U-MARVEL-Qwen3VL-4B-Instruct
 ```

 - [U-MARVEL-Qwen2VL-7B-Instruct](https://huggingface.co/TencentBAC/U-MARVEL-Qwen2VL-7B-Instruct) 🤗
-- [
+- [U-MARVEL-Qwen3VL-4B-Instruct](https://huggingface.co/TencentBAC/U-MARVEL-Qwen3VL-4B-Instruct) 🤗
 - Inference code available at: [U-MARVEL-inference](https://github.com/chaxjli/U-MARVEL)
 ## Requirements

@@ -44,6 +46,8 @@ pip install -r requirements.txt

 Download Qwen2-VL-7B and place it in `./checkpoints/hf_models/Qwen2-VL-7B-Instruct`

+Download Qwen3-VL-4B and place it in `./checkpoints/hf_models/Qwen3-VL-4B-Instruct`
+
 For NLI dataset, please refer to [link](https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse)

 For multimodal instruction tuning datset, please refer to [M-BEIR](https://huggingface.co/datasets/TIGER-Lab/M-BEIR)
@@ -84,13 +88,13 @@ sh scripts/eval_zeroshot.sh # Evaluate zero-shot
 The proposed U-MARVEL framework establishes new state-of-the-art performance across both
 single-model architectures and recall-then-rerank approaches on M-BEIR benchmark.

-<img src="./figures/
+<img src="./figures/local_pool.png" alt="M-BEIR-Local" width="700" height="auto">

-<img src="./figures/
+<img src="./figures/global_pool.png" alt="M-BEIR-Global" width="700" height="auto">

-<img src="./figures/
+<img src="./figures/zero_shot_image.png" alt="M-BEIR-Zero-shot_image" width="700" height="auto">

-<img src="./figures/
+<img src="./figures/zero_shot_video.png" alt="M-BEIR-Zero-shot_video" width="700" height="auto">


 ## Acknowledgements
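The updated README expects four model directories under `checkpoints`. A minimal sketch of a pre-flight check for that layout follows; the function name `missing_checkpoints` is hypothetical and not part of this repository, and it assumes the script runs from the repository root.

```python
from pathlib import Path

# Directories named in the README tree and download instructions.
# Assumption: these paths are relative to the repository root.
EXPECTED_DIRS = [
    "checkpoints/hf_models/Qwen2-VL-7B-Instruct",
    "checkpoints/hf_models/Qwen3-VL-4B-Instruct",
    "checkpoints/U-MARVEL-Qwen2VL-7B-Instruct",
    "checkpoints/U-MARVEL-Qwen3VL-4B-Instruct",
]


def missing_checkpoints(repo_root: str = ".") -> list[str]:
    """Return the expected checkpoint directories that do not exist yet."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_checkpoints():
        print(f"missing: {d}")
```

The check only confirms that the directories exist; it does not validate the model files inside them.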
figures/global_pool.png
ADDED (Git LFS)
figures/local_pool.png
ADDED (Git LFS)
figures/zero_shot_image.png
ADDED (Git LFS)
figures/zero_shot_video.png
ADDED (Git LFS)