Update README.md
README.md (CHANGED)
@@ -1,23 +1,23 @@
 ---
 license: mit
 base_model:
 - google/paligemma2-3b-pt-224
 tags:
 - VLA
 - Foundation Vision-language-action Model
 - Generalist Robot Policy
 - robotics
 language:
 - en
 pipeline_tag: image-text-to-text
 library_name: transformers
 ---

 # SpatialVLA

 SpatialVLA is a spatial-enhanced vision-language-action model trained on 1.1 million real robot episodes. The code is purely HuggingFace-based and concise, with efficient performance.

-All SpatialVLA checkpoints, as well as our [training codebase](https://github.com/
+All SpatialVLA checkpoints, as well as our [training codebase](https://github.com/SpatialVLA/SpatialVLA), are released under the MIT License.

 For full details, please read [our paper](https://arxiv.org/abs/2501.15830) and see [our project page](https://spatialvla.github.io/).
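Since the card declares `pipeline_tag: image-text-to-text` and `library_name: transformers`, a minimal loading sketch might look like the following. This only illustrates the "purely HuggingFace-based" claim: the repo ID is a placeholder, and the `trust_remote_code` requirement and the use of the standard `generate` API are assumptions, not stated in this diff.

```python
# Minimal sketch, assuming the checkpoint ships a custom model class loadable
# via trust_remote_code and a PaliGemma-style processor. The repo ID below is
# a placeholder (hypothetical), not taken from this commit.
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

ckpt = "SpatialVLA/spatialvla-checkpoint"  # hypothetical repo ID
processor = AutoProcessor.from_pretrained(ckpt, trust_remote_code=True)
model = AutoModel.from_pretrained(
    ckpt, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval()

image = Image.open("observation.png")  # current camera frame
prompt = "What action should the robot take to pick up the cup?"
inputs = processor(images=[image], text=prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True))
```

How generated tokens map back to robot actions is model-specific and not covered by this commit; see the linked training codebase and paper for the actual interface.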