---
license: mit
library_name: colpali
language:
- en
tags:
- colpali
- vidore-experimental
- vidore
pipeline_tag: visual-document-retrieval
---
# ColModernVBERT
![bg](https://cdn-uploads.huggingface.co/production/uploads/661e945eebe3616a1b09e279/QfGYAqoq_TGcXRHh6UMaq.png)
## Usage
> [!WARNING]
> This checkpoint is not meant to be used directly: it is only the base version, provided for deterministic LoRA initialization.
## Table of Contents
1. [Overview](#overview)
2. [Usage](#usage)
3. [Evaluation](#evaluation)
4. [License](#license)
5. [Citation](#citation)
## Overview
The [ModernVBERT](https://arxiv.org/abs/2510.01149) suite is a family of compact 250M-parameter vision-language encoders that achieve state-of-the-art performance in this size class and match the performance of models up to 10x larger.
For more information about ModernVBERT, please check the [arXiv](https://arxiv.org/abs/2510.01149) preprint.
### Models
- `ColModernVBERT` is the late-interaction version, fine-tuned for visual document retrieval. It is our most performant model on this task.
- `BiModernVBERT` is the bi-encoder version, fine-tuned for visual document retrieval.
- `ModernVBERT-embed` is the bi-encoder version after modality alignment (using an MLM objective) and contrastive learning, without document specialization.
- `ModernVBERT` is the base model after modality alignment (using an MLM objective).
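
To make the difference between the late-interaction and bi-encoder variants concrete, here is a minimal sketch of the two scoring schemes as they are commonly defined (ColBERT-style MaxSim versus a single-vector cosine similarity). It is an illustration only, not the ModernVBERT implementation: the function names and the toy embeddings are hypothetical, and real models produce these vectors from text tokens and image patches.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_tokens, doc_tokens):
    """MaxSim scoring (hypothetical helper): each query token embedding is
    matched to its most similar document token embedding, and the per-token
    maxima are summed."""
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


def bi_encoder_score(query_vec, doc_vec):
    """Bi-encoder scoring (hypothetical helper): one pooled vector per query
    and per document, compared with cosine similarity."""
    return dot(normalize(query_vec), normalize(doc_vec))


# Toy multi-vector embeddings, made up for illustration.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # has a close match for each query token
page_b = [[1.0, 1.0]]                          # only a partial match for each query token

ranking = sorted(
    {"page_a": page_a, "page_b": page_b}.items(),
    key=lambda item: late_interaction_score(query_tokens, item[1]),
    reverse=True,
)
print([name for name, _ in ranking])
```

Late interaction keeps one embedding per token or patch, so it can reward fine-grained matches at the cost of storing more vectors per document. A bi-encoder stores a single vector per document, which is cheaper to index.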
## Evaluation
![table](https://cdn-uploads.huggingface.co/production/uploads/661e945eebe3616a1b09e279/NLB0bdE8tAAWXnCK6vjjS.png)
ColModernVBERT matches the performance of models nearly 10x larger on visual document retrieval benchmarks. It also offers competitive CPU inference speed compared with models of similar performance.
## License
We release the ModernVBERT model architectures, model weights, and training codebase under the MIT license.
## Citation
If you use ModernVBERT in your work, please cite:
```bibtex
@misc{teiletche2025modernvbertsmallervisualdocument,
      title={ModernVBERT: Towards Smaller Visual Document Retrievers},
      author={Paul Teiletche and Quentin Macé and Max Conti and Antonio Loison and Gautier Viaud and Pierre Colombo and Manuel Faysse},
      year={2025},
      eprint={2510.01149},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2510.01149},
}
```