---
license: mit
library_name: colpali
language:
- en
tags:
- colpali
- vidore-experimental
- vidore
pipeline_tag: visual-document-retrieval
---
# ColModernVBERT

## Usage
> [!WARNING]
> Do not use this version directly: it is only the base version, provided for deterministic LoRA initialization.

## Table of Contents
1. [Overview](#overview)
2. [Usage](#usage)
3. [Evaluation](#evaluation)
4. [License](#license)
5. [Citation](#citation)
## Overview
[ModernVBERT](https://arxiv.org/abs/2510.01149) is a suite of compact 250M-parameter vision-language encoders that achieve state-of-the-art performance in their size class and match the performance of models up to 10x larger.
For more information about ModernVBERT, please check the [arXiv](https://arxiv.org/abs/2510.01149) preprint.
### Models
- `ColModernVBERT` is the late-interaction version, fine-tuned for visual document retrieval. It is our most performant model on this task.
- `BiModernVBERT` is the bi-encoder version, fine-tuned for visual document retrieval.
- `ModernVBERT-embed` is the bi-encoder version after modality alignment (using an MLM objective) and contrastive learning, without document specialization.
- `ModernVBERT` is the base model after modality alignment (using an MLM objective).
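
To make the difference between the bi-encoder and late-interaction variants concrete, here is a minimal sketch of the two scoring schemes. It assumes the ColBERT-style MaxSim operator commonly used for late interaction; the function names and toy vectors are illustrative and do not come from the ModernVBERT codebase.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def bi_encoder_score(query_vec, doc_vec):
    """Bi-encoder scoring: one vector per query and per document,
    compared with cosine similarity."""
    return dot(normalize(query_vec), normalize(doc_vec))


def late_interaction_score(query_tokens, doc_tokens):
    """Late-interaction (MaxSim) scoring: one vector per token or patch.
    Each query token is matched to its most similar document token,
    and the per-token maxima are summed."""
    q = [normalize(t) for t in query_tokens]
    d = [normalize(t) for t in doc_tokens]
    return sum(max(dot(qt, dt) for dt in d) for qt in q)


# Toy 2-dimensional embeddings (hypothetical values, for illustration only).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # covers both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.1]]              # covers only the first query token

score_a = late_interaction_score(query_tokens, doc_a)
score_b = late_interaction_score(query_tokens, doc_b)
print(score_a > score_b)
```

Because every query token is matched independently, a document scores highly only if it covers all parts of the query. This comes at the cost of storing one vector per document token rather than a single vector per document.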
## Evaluation

ColModernVBERT matches the performance of models nearly 10x larger on visual document benchmarks. It also offers favorable CPU inference speed compared with models of similar performance.
## License
We release the ModernVBERT model architectures, model weights, and training codebase under the MIT license.
## Citation
If you use ModernVBERT in your work, please cite:
```
@misc{teiletche2025modernvbertsmallervisualdocument,
title={ModernVBERT: Towards Smaller Visual Document Retrievers},
author={Paul Teiletche and Quentin Macé and Max Conti and Antonio Loison and Gautier Viaud and Pierre Colombo and Manuel Faysse},
year={2025},
eprint={2510.01149},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2510.01149},
}
```