Update README.md
README.md
@@ -51,7 +51,8 @@ We release under the Apache 2.0 license 2 checkpoints:
 - **Resources for more information:**
 - Description of [OBELICS](https://huggingface.co/datasets/HuggingFaceM4/OBELICS): [OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
 ](https://huggingface.co/papers/2306.16527)
-- Paper:
+- Paper: [What matters when building vision-language models?
+](https://huggingface.co/papers/2405.02246)
 
 
 # Uses
@@ -439,6 +440,15 @@ The model is built on top of two pre-trained models: [google/siglip-so400m-patch
 archivePrefix={arXiv},
 primaryClass={cs.IR}
 }
+
+@misc{laurençon2024matters,
+title={What matters when building vision-language models?},
+author={Hugo Laurençon and Léo Tronchon and Matthieu Cord and Victor Sanh},
+year={2024},
+eprint={2405.02246},
+archivePrefix={arXiv},
+primaryClass={cs.CV}
+}
 ```
 
 # Acknowledgements