Spaces: vidore/README

HugSib committed on Jun 25, 2024

Commit 4d1e895 (verified) · 1 Parent(s): 826c353

Update README.md

Files changed (1)

README.md +16 -8

@@ -31,13 +31,15 @@ Combined with a late interaction matching mechanism, *ColPali* largely outperfor
 ## Organisation
-### Models [add description of released model]
-  - [*ColPali*](https://huggingface.co/vidore/colpali): TODO
-  - [*BiPali*](https://huggingface.co/vidore/bipali): TODO
-  - [*BiSigLip*](https://huggingface.co/vidore/bisiglip): TODO
 ### Datasets
@@ -61,7 +63,7 @@ You can either load a specific dataset using the standard `load_dataset` functio
   dataset = load_dataset(dataset_item.item_id)
 ```
-To use the whole benchmark you can list the datasets in the collection using the following snippet.
 ```python
   from datasets import load_dataset
@@ -81,12 +83,18 @@ To use the whole benchmark you can list the datasets in the collection using the
 ```
 ## Authorship + Citation
-TODO : Contact
-If you use any datasets or models from this organisation in your research, please cite the original dataset as follows:
 **BibTeX Citation**
 ```latex
     [include BibTeX]
 ```

 ## Organisation
+### Models
+  - [*ColPali*](https://huggingface.co/vidore/colpali): *ColPali* is our main contribution: a model built on a novel architecture and training strategy based on Vision Language Models (VLMs) that efficiently indexes documents from their visual features.
+  It is a [PaliGemma-3B](https://huggingface.co/google/paligemma-3b-mix-448) extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)-style multi-vector representations of text and images.
+  - [*BiPali*](https://huggingface.co/vidore/bipali): An extension of the original SigLIP architecture in which the SigLIP-generated patch embeddings are fed to a text language model, PaliGemma-3B, to obtain LLM-contextualized output patch embeddings.
+  These representations are average-pooled into a single vector representation, yielding a PaliGemma bi-encoder, *BiPali*.
+  - [*BiSigLip*](https://huggingface.co/vidore/bisiglip): A fine-tuned version of the original [SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384), a strong vision-language bi-encoder model.
 ### Datasets
   dataset = load_dataset(dataset_item.item_id)
 ```
+To use the whole benchmark, you can list the datasets in the collection using the following snippet.
 ```python
   from datasets import load_dataset
 ```
 ## Authorship + Citation
+**Contact**
+Please report any issues with the models or the benchmark, or contact us directly:
+- Manuel Faysse : [email?]()
+- Hugues Sibille : [email?]()
+- Tony Wu : [email?]()
 **BibTeX Citation**
+If you use any datasets or models from this organisation in your research, please cite the original dataset as follows:
 ```latex
     [include BibTeX]
 ```
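
The model descriptions added in this commit contrast ColBERT-style multi-vector representations (*ColPali*) with a single average-pooled vector (*BiPali*). The sketch below illustrates that difference on toy data. It is not the released implementation: the function names, the 2-dimensional vectors, and the use of a plain dot product are all assumptions made for illustration, and the real models may normalize or score differently.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors: Sequence[Vector]) -> List[float]:
    """Average a set of vectors into one vector (bi-encoder style pooling)."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def bi_encoder_score(query_tokens: Sequence[Vector], doc_patches: Sequence[Vector]) -> float:
    """Score with one pooled vector per side (hypothetical helper)."""
    return dot(mean_pool(query_tokens), mean_pool(doc_patches))


def late_interaction_score(query_tokens: Sequence[Vector], doc_patches: Sequence[Vector]) -> float:
    """ColBERT-style late interaction: for each query vector, take the best
    matching document vector, then sum those maxima (hypothetical helper)."""
    return sum(max(dot(q, d) for d in doc_patches) for q in query_tokens)


# Toy data (made up): two query token vectors and two candidate "documents",
# each represented by two patch vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]   # patches match the query tokens exactly
doc_b = [[0.5, 0.5], [0.5, 0.5]]   # patches are uniformly "blurry"

print(bi_encoder_score(query, doc_a), bi_encoder_score(query, doc_b))
print(late_interaction_score(query, doc_a), late_interaction_score(query, doc_b))
```

On this toy input the pooled score is identical for both documents, while the late interaction score ranks `doc_a` higher, because pooling discards the per-patch detail that the multi-vector comparison keeps.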