## Organisation
### Models [add description of released model]

- [*ColPali*](https://huggingface.co/vidore/colpali): TODO
- [*BiPali*](https://huggingface.co/vidore/bipali): TODO
- [*BiSigLip*](https://huggingface.co/vidore/bisiglip): TODO
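ColPali matches queries against pages with a late-interaction (ColBERT-style MaxSim) mechanism over multi-vector embeddings. The following is a minimal sketch with random embeddings; the embedding dimension and token/patch counts are illustrative assumptions, not the models' actual shapes:

```python
import numpy as np

def late_interaction_score(query_emb: np.ndarray, page_emb: np.ndarray) -> float:
    """ColBERT-style MaxSim: for each query-token vector, take the maximum
    dot-product similarity over all page-patch vectors, then sum over tokens."""
    sims = query_emb @ page_emb.T  # (n_query_tokens, n_page_patches)
    return float(sims.max(axis=1).sum())

# Toy multi-vector embeddings (shapes are illustrative, not ColPali's real ones).
rng = np.random.default_rng(0)
query = rng.normal(size=(16, 128))                        # 16 query-token vectors
pages = [rng.normal(size=(1024, 128)) for _ in range(3)]  # 3 candidate pages

scores = [late_interaction_score(query, p) for p in pages]
best_page = int(np.argmax(scores))
```

Because each query token independently picks its best-matching page patch, MaxSim rewards pages that cover all parts of the query rather than those that are close on average.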
### Datasets
We organized the datasets into collections that constitute our benchmark ViDoRe and its derivatives (OCR and captioning). Below is a brief description of each of them.

- [*ViDoRe Benchmark*](https://huggingface.co/collections/vidore/vidore-benchmark-667173f98e70a1c0fa4db00d): collection grouping all datasets constituting the ViDoRe benchmark. It includes the test sets from several academic datasets ([ArXiVQA](https://huggingface.co/datasets/vidore/arxivqa_test_subsampled), [DocVQA](https://huggingface.co/datasets/vidore/docvqa_test_subsampled), [InfoVQA](https://huggingface.co/datasets/vidore/infovqa_test_subsampled), [TATDQA](https://huggingface.co/datasets/vidore/tatdqa_test), [TabFQuAD](https://huggingface.co/datasets/vidore/tabfquad_test_subsampled)) as well as synthetically generated datasets spanning various themes and industrial applications ([Artificial Intelligence](https://huggingface.co/datasets/vidore/syntheticDocQA_artificial_intelligence_test), [Government Reports](https://huggingface.co/datasets/vidore/syntheticDocQA_government_reports_test), [Healthcare Industry](https://huggingface.co/datasets/vidore/syntheticDocQA_healthcare_industry_test), [Energy](https://huggingface.co/datasets/vidore/syntheticDocQA_energy_test) and [Shift Project](https://huggingface.co/datasets/vidore/shiftproject_test)). Further details can be found on the corresponding dataset cards.
- [*OCR Baseline*](https://huggingface.co/collections/vidore/vidore-chunk-ocr-baseline-666acce88c294ef415548a56): the same datasets as in ViDoRe, preprocessed for text retrieval. Each page of the original benchmark was partitioned into chunks with Unstructured; visual chunks were OCRed with Tesseract.
- [*Captioning Baseline*](https://huggingface.co/collections/vidore/vidore-captioning-baseline-6658a2a62d857c7a345195fd): the same datasets as in ViDoRe, preprocessed for text retrieval. Each page of the original benchmark was partitioned into chunks with Unstructured; visual chunks were captioned with Claude Sonnet.
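Since the OCR and captioning baselines reduce each page to text chunks, any standard text retriever can be run on top of them. As a rough illustration only, here is a toy TF-IDF ranker over plain-string chunks; it is a sketch, not the retriever actually used in the benchmark:

```python
import math
from collections import Counter

def tfidf_rank(query: str, chunks: list[str]) -> list[int]:
    """Rank text chunks against a query with a toy TF-IDF cosine score.
    Illustrative only; a real baseline would use BM25 or a neural retriever."""
    docs = [c.lower().split() for c in chunks]
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term

    def vec(tokens: list[str]) -> dict[str, float]:
        tf = Counter(tokens)
        return {t: tf[t] * math.log(1 + n / df[t]) for t in tf if t in df}

    def cos(a: dict[str, float], b: dict[str, float]) -> float:
        num = sum(a[t] * b.get(t, 0.0) for t in a)
        den = (math.sqrt(sum(v * v for v in a.values()))
               * math.sqrt(sum(v * v for v in b.values()))) or 1.0
        return num / den

    q = vec(query.lower().split())
    return sorted(range(n), key=lambda i: cos(q, vec(docs[i])), reverse=True)

chunks = [
    "solar energy production report",
    "hospital staffing figures",
    "government budget tables",
]
ranking = tfidf_rank("energy report", chunks)
```

The chunk texts and query above are made-up examples; in practice the chunks would come from the OCR or captioning collections.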
**Intended use**
To use the whole benchmark, you can list the datasets in the collection and load each of them, for example as follows (a sketch assuming the `huggingface_hub` and `datasets` libraries; the split name is assumed to be `test`):

```python
from datasets import load_dataset
from huggingface_hub import get_collection

# List every dataset in the ViDoRe benchmark collection.
collection = get_collection("vidore/vidore-benchmark-667173f98e70a1c0fa4db00d")

datasets = []
for item in collection.items:
    if item.item_type != "dataset":
        continue
    dataset = load_dataset(item.item_id, split="test")
    datasets.append(dataset)
```
## Authorship + Citation
TODO: Contact