Update README.md
README.md
CHANGED
@@ -51,13 +51,14 @@ We organized datasets into collections to constitute our benchmark ViDoRe and it
 
 - [*Captioning Baseline*](https://huggingface.co/collections/vidore/vidore-captioning-baseline-6658a2a62d857c7a345195fd): Datasets in this collection are the same as in ViDoRe but preprocessed for textual retrieving. The original ViDoRe benchmark was passed to Unstructured to partition each page into chunks. Visual chunks are captioned using Claude Sonnet.
 
+
 ## Intended use
 
 You can either load a specific dataset using the standard `load_dataset` function from huggingface.
 
 ```python
 from datasets import load_dataset
-dataset = load_dataset(
+dataset = load_dataset(<dataset>)
 ```
 
 To use the whole benchmark, you can list the datasets in the collection using the following snippet.