Update README.md
README.md
CHANGED
|
@@ -1,3 +1,38 @@
|
|
| 1 |
-
---
|
| 2 |
-
license:
|
| 3 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
# Probing Visual Language Priors in VLMs
|
| 6 |
+
|
| 7 |
+
|
| 8 |
+
## ImageDPO Finetuned Model
|
| 9 |
+
|
| 10 |
+
This page hosts the **ImageDPO**-finetuned LLaVA-v1.5-7B checkpoint used in [Probing Visual Language Priors in VLMs](https://arxiv.org/abs/2501.00569). We release the **merged model weights**, ready for direct use.
|
| 11 |
+
|
| 12 |
+
## Usage
|
| 13 |
+
|
| 14 |
+
First, install the [LLaVA-v1.5 codebase](https://github.com/LLaVA-VL/LLaVA-Plus-Codebase).
|
| 15 |
+
|
| 16 |
+
Then run the following command to try the model:
|
| 17 |
+
|
| 18 |
+
```bash
|
| 19 |
+
python -m llava.eval.run_llava \
|
| 20 |
+
--model-path ViLP/LLaVA-v1.5-7b-ImageDPO \
|
| 21 |
+
--image-file 'images/llava_logo.png' \
|
| 22 |
+
--query 'Please caption this image.' \
|
| 23 |
+
--conv-mode llava_v1
|
| 24 |
+
```
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
## Citation Information
|
| 28 |
+
|
| 29 |
+
Please cite the ***ViLP*** paper if you find this resource helpful!
|
| 30 |
+
|
| 31 |
+
```bibtex
|
| 32 |
+
@article{luo2024probing,
|
| 33 |
+
title={Probing Visual Language Priors in VLMs},
|
| 34 |
+
author={Luo, Tiange and Cao, Ang and Lee, Gunhee and Johnson, Justin and Lee, Honglak},
|
| 35 |
+
journal={arXiv preprint arXiv:2501.00569},
|
| 36 |
+
year={2024}
|
| 37 |
+
}
|
| 38 |
+
```
|