VAMOS: A Hierarchical Vision-Language-Action Model for Capab
This collection contains VLM planner checkpoints, affordance module checkpoints for Spot and HOUND, training datasets, and a demo.
This model is a merged LoRA fine-tune of google/paligemma2-3b-pt-224 on the mateoguaman/VAMOS_dataset dataset. It was trained with TRL.
Coming Soon
This model is a fine-tuned derivative of google/paligemma2-3b-pt-224 and is subject to the Gemma Terms of Use.
The training data includes content licensed under CC BY-NC 4.0, so this model and its outputs are provided for non-commercial use only.
Please see the accompanying LICENSE and NOTICE files for full details.