IndoColSmol
Part of the IndoColSmol collection: Lightweight Vision-Language Models for Multimodal Indonesian Document Search (2 items).
How to use ingenio/IndoColSmol-500M with Transformers:
# Load model directly
from transformers import AutoProcessor, ColIdefics3
processor = AutoProcessor.from_pretrained("ingenio/IndoColSmol-500M")
model = ColIdefics3.from_pretrained("ingenio/IndoColSmol-500M")
How to use ingenio/IndoColSmol-500M with ColPali:
No code snippet is available yet for this library. To use this model, check the repository files and the library's documentation.
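ColPali-style retrievers typically rank document pages by late interaction (MaxSim) between multi-vector query and page embeddings. The following is a minimal sketch of that scoring step in plain Python, not the library's API. It does not load the model: the embeddings are hypothetical toy values, the names `normalize` and `maxsim_score` are illustrative, and L2-normalizing the vectors is an assumption about how the embeddings are prepared.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, page_vecs):
    """Late-interaction score: for each query token vector, take the best
    dot product over all page patch vectors, then sum over query tokens."""
    q = [normalize(v) for v in query_vecs]
    p = [normalize(v) for v in page_vecs]
    return sum(
        max(sum(a * b for a, b in zip(qv, pv)) for pv in p)
        for qv in q
    )


# Hypothetical toy embeddings (3 dimensions), NOT real model output.
query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
pages = {
    "page_a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    "page_b": [[0.0, 0.0, 1.0], [0.0, 0.0, 2.0]],
}

# Score every page against the query and rank from best to worst.
scores = {name: maxsim_score(query, vecs) for name, vecs in pages.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In practice the query and page vectors would come from the model and processor, and the scoring would run on batched tensors rather than Python lists.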
This model is a fine-tuned version of vidore/ColSmolVLM-Instruct-500M-base on the ingenio/indodvqa_dataset dataset. On the evaluation set, the last logged validation loss is 0.3630 (step 200, epoch 1.98).
The training hyperparameters are not listed here. Training and validation loss were logged during training as follows:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 0.0099 | 1 | 0.4474 |
| 0.4523 | 0.3960 | 40 | 0.4055 |
| 0.3996 | 0.7921 | 80 | 0.3804 |
| 0.3637 | 1.1881 | 120 | 0.3687 |
| 0.345 | 1.5842 | 160 | 0.3627 |
| 0.3466 | 1.9802 | 200 | 0.3630 |
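A few quantities can be read off the table with simple arithmetic. The sketch below copies the rows verbatim and derives the checkpoint with the lowest validation loss, the approximate number of optimizer steps per epoch, and the relative drop in validation loss. The variable names are illustrative, and the steps-per-epoch figure is inferred from the logged step and epoch columns rather than stated by the source.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table;
# None stands for the "No log" entry.
rows = [
    (None, 0.0099, 1, 0.4474),
    (0.4523, 0.3960, 40, 0.4055),
    (0.3996, 0.7921, 80, 0.3804),
    (0.3637, 1.1881, 120, 0.3687),
    (0.345, 1.5842, 160, 0.3627),
    (0.3466, 1.9802, 200, 0.3630),
]

first, last = rows[0], rows[-1]

# Row with the lowest validation loss.
best = min(rows, key=lambda r: r[3])

# Steps per epoch, inferred from the last logged (step, epoch) pair.
steps_per_epoch = round(last[2] / last[1])

# Relative reduction in validation loss from the first to the last evaluation.
relative_drop = (first[3] - last[3]) / first[3]

print(f"best step: {best[2]} (validation loss {best[3]:.4f})")
print(f"steps per epoch: ~{steps_per_epoch}")
print(f"validation loss reduction: {relative_drop:.1%}")
```

The lowest validation loss occurs at step 160 (0.3627), and the value at step 200 (0.3630) is nearly identical, which suggests the loss had largely plateaued by the end of the second epoch.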
Base model: HuggingFaceTB/SmolLM2-360M