manu committed · Commit 24a3757 · verified · Parent(s): 9d38433

Update README.md

Files changed (1):
README.md +5 -8
README.md CHANGED
@@ -14,15 +14,7 @@ pinned: true
 
 This organization contains all artefacts released with our preprint [ColPali: Efficient Document Retrieval with Vision Language Models.]() [TODO add link],
 including the [ViDoRe](https://huggingface.co/collections/vidore/vidore-benchmark-667173f98e70a1c0fa4db00d) benchmark and our SOTA document retrieval model [*ColPali*](https://huggingface.co/vidore/colpali).
-On top of that, we release two GitHub repositories:
-- the 1st respo contains the **training** scripts used to train ColPali: https://github.com/ManuelFay/colpali
-- the 2nd repos is a Python package to **evaluate** to evalaute and reproduce our results: https://github.com/tonywu71/vidore-benchmark
 
-The `vidore-benchmark` package can also be installed using:
-
-```bash
-pip install -U vidore-benchmark
-```
 
 ### Abstract
 
@@ -61,6 +53,11 @@ We organized datasets into collections to constitute our benchmark ViDoRe and it
 
 - [*Captioning Baseline*](https://huggingface.co/collections/vidore/vidore-captioning-baseline-6658a2a62d857c7a345195fd): Datasets in this collection are the same as in ViDoRe but preprocessed for textual retrieval. The original ViDoRe benchmark was passed to Unstructured to partition each page into chunks. Visual chunks are captioned using Claude Sonnet.
 
+## Code
+
+
+- [*training*](https://github.com/ManuelFay/colpali): To train and use models with the ColPali architecture.
+- [*benchmarking*](https://github.com/tonywu71/vidore-benchmark): To evaluate document retrieval systems on the ViDoRe benchmark.
 
 ## Extra
 
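The README describes ColPali only at a high level. Its retrieval score is a ColBERT-style late-interaction (MaxSim) comparison between query-token embeddings and page-patch embeddings. A minimal NumPy sketch with toy random vectors (the `maxsim_score` helper, the embedding shapes, and the random inputs are illustrative assumptions, not the released implementation):

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction (MaxSim) score: for each query token, take the max
    cosine similarity to any document patch, then sum over query tokens."""
    # Normalize rows to unit length so dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T                        # (num_query_tokens, num_doc_patches)
    return float(sim.max(axis=1).sum())  # MaxSim per query token, then sum

# Toy stand-ins for embeddings a vision-language model would produce.
rng = np.random.default_rng(0)
query = rng.normal(size=(8, 128))    # 8 query-token embeddings
page = rng.normal(size=(1024, 128))  # patch embeddings for one document page
score = maxsim_score(query, page)
print(score)
```

Ranking pages by this score is what makes the approach a *retrieval* method: the per-page embeddings can be precomputed offline, and only the cheap MaxSim comparison runs at query time.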