	---
	license: mit
	library_name: colpali
	language:
	- en
	tags:
	- colpali
	- vidore-experimental
	- vidore
	pipeline_tag: visual-document-retrieval
	---


	# ColModernVBERT

	![bg](https://cdn-uploads.huggingface.co/production/uploads/661e945eebe3616a1b09e279/QfGYAqoq_TGcXRHh6UMaq.png)

	## Usage

	> [!WARNING]
	> Do not use this checkpoint directly: it is only the base version, provided for deterministic LoRA initialization.

	## Table of Contents
	1. [Usage](#usage)
	2. [Overview](#overview)
	3. [Evaluation](#evaluation)
	4. [License](#license)
	5. [Citation](#citation)

	## Overview

	[ModernVBERT](https://arxiv.org/abs/2510.01149) is a suite of compact 250M-parameter vision-language encoders that achieve state-of-the-art performance in their size class and match the performance of models up to 10x larger.

	For more information about ModernVBERT, please check the [arXiv](https://arxiv.org/abs/2510.01149) preprint.

	### Models
	- `ColModernVBERT` is the late-interaction version, fine-tuned for visual document retrieval. It is our best-performing model on this task.
	- `BiModernVBERT` is the bi-encoder version, fine-tuned for visual document retrieval.
	- `ModernVBERT-embed` is the bi-encoder version after modality alignment (using an MLM objective) and contrastive learning, without document specialization.
	- `ModernVBERT` is the base model after modality alignment (using an MLM objective).
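
The difference between the late-interaction and bi-encoder variants can be illustrated with a toy scoring sketch. It assumes the usual formulations: a bi-encoder compares one pooled vector per query and per page with cosine similarity, while late interaction keeps one vector per token or image patch and sums, for each query token, its best match on the page (MaxSim). The function names and the 2-D embeddings below are hypothetical and made up for illustration. This is not the model's API or its training code.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def bi_encoder_score(query_vec, doc_vec):
    """Bi-encoder style: cosine similarity between two single vectors."""
    return dot(normalize(query_vec), normalize(doc_vec))


def late_interaction_score(query_tokens, doc_tokens):
    """Late-interaction style (MaxSim): for each query token, take the
    highest cosine similarity over all document tokens, then sum."""
    q = [normalize(t) for t in query_tokens]
    d = [normalize(t) for t in doc_tokens]
    return sum(max(dot(qt, dt) for dt in d) for qt in q)


# Made-up 2-D token embeddings, purely illustrative.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # covers both query tokens
page_b = [[1.0, 0.0], [-1.0, 0.0]]             # covers only the first one

score_a = late_interaction_score(query_tokens, page_a)
score_b = late_interaction_score(query_tokens, page_b)
print(score_a, score_b)  # page_a should rank above page_b
```

In this sketch a page scores highly only if every query token finds a close match somewhere on it, which is the intuition behind token-level matching. The trade-off is that one vector per token must be stored for each page instead of a single vector.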

	## Evaluation

	![table](https://cdn-uploads.huggingface.co/production/uploads/661e945eebe3616a1b09e279/NLB0bdE8tAAWXnCK6vjjS.png)

	ColModernVBERT matches the performance of models nearly 10x larger on visual document benchmarks. It also offers favorable CPU inference speed compared with models of similar performance.

	## License

	We release the ModernVBERT model architectures, model weights, and training codebase under the MIT license.

	## Citation

	If you use ModernVBERT in your work, please cite:

	```
	@misc{teiletche2025modernvbertsmallervisualdocument,
	title={ModernVBERT: Towards Smaller Visual Document Retrievers},
	author={Paul Teiletche and Quentin Macé and Max Conti and Antonio Loison and Gautier Viaud and Pierre Colombo and Manuel Faysse},
	year={2025},
	eprint={2510.01149},
	archivePrefix={arXiv},
	primaryClass={cs.IR},
	url={https://arxiv.org/abs/2510.01149},
	}
	```