xintaozhen/MiniVLA

Image-Text-to-Text

vision-language-action

edge-deployment


xintaozhen committed on Sep 8, 2025

Commit 89266fc · verified · 1 Parent(s): 3ea13b0

Update README.md

Files changed (1)

README.md +3 -2

@@ -11,7 +11,7 @@ base_model: Stanford-ILIAD/minivla-vq-libero90-prismatic
 library_name: transformers
 datasets:
 - LIBERO
-pipeline_tag: text-to-action
+pipeline_tag: image-text-to-text
 ---
 # MiniVLA
@@ -122,7 +122,8 @@ generated_actions = response.json()
 - Exported the **vision encoder** to TensorRT, reducing perception latency and GPU memory usage.
 - Integrated **Qwen 2.5 0.5B** in Hugging Face and TensorRT-LLM formats.
 - Designed a **modular system architecture** with router & fallback for robustness.
-- Demonstrated efficient **edge-side VLA inference** on Jetson Orin Nano in LIBERO tasks.
+- Demonstrated efficient **edge-side VLA inference**
+ on Jetson Orin Nano in LIBERO tasks.
 ---