VAMOS: A Hierarchical Vision-Language-Action Model for Capab
This collection contains VLM planner checkpoints, affordance module checkpoints for Spot and HOUND, training datasets, and a demo.
This model is a merged LoRA fine-tune of google/paligemma2-3b-pt-224 on the mateoguaman/VAMOS_dataset dataset. It was trained with TRL.
Coming Soon
This model is a fine-tuned derivative of google/paligemma2-3b-pt-224 and is subject to the Gemma Terms of Use.
The training data includes content licensed under CC BY-NC 4.0, so this model and its outputs are provided for non-commercial use only.
Please see the accompanying LICENSE and NOTICE files for full details.