Running OpenVLA on multiple GPUs / distributed?
#1
by mmajek - opened
Hi! Thank you so much for the research and for open-sourcing your code.
I am currently running OpenVLA at 6 actions per second on a single RTX 4090 in a 2x4090 machine.
I've been tinkering with the code, trying to get it working with device_map="auto", but with no luck.
Have I missed something?
Hmm… I'm not sure exactly what should happen under the hood when parallelizing. When you load with device_map="auto", are you seeing the model actually split across the GPUs, or is it split but just not any faster?
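One way to answer that question is to look at the placement that was actually chosen. Below is a minimal sketch, assuming the loaded model exposes the `hf_device_map` dictionary that Transformers sets when a model is loaded with a `device_map`. The helper names and the module names in the example dictionary are hypothetical, and whether OpenVLA's custom code loads cleanly with `device_map="auto"` is not confirmed here.

```python
from collections import Counter


def _normalize(dev):
    # Device maps may hold GPU indices (0, 1), strings ("cpu", "disk",
    # "cuda:0"), or torch.device objects; map them all to plain strings.
    if isinstance(dev, int):
        return f"cuda:{dev}"
    return str(dev)


def summarize_device_map(device_map):
    """Count how many entries of the device map landed on each device."""
    return dict(Counter(_normalize(dev) for dev in device_map.values()))


def is_split_across_gpus(device_map):
    """Return True if entries are placed on more than one CUDA device."""
    gpus = {
        _normalize(dev)
        for dev in device_map.values()
        if _normalize(dev).startswith("cuda")
    }
    return len(gpus) > 1


# Hypothetical device map, only to illustrate the shape of the data.
example_map = {
    "vision_backbone": 0,
    "projector": 0,
    "language_model.model.layers.0": 0,
    "language_model.model.layers.1": 1,
}
print(summarize_device_map(example_map))
print(is_split_across_gpus(example_map))
```

With a real model this would be called as `is_split_across_gpus(model.hf_device_map)`. Even when the result is `True`, this kind of placement runs the layers on each GPU one after another, so it mainly helps fit the model in memory rather than speeding up a single request.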
Because our model always needs to encode the image features before generating actions, there's always an upfront cost. We also don't support generation with a batch size > 1 right now, so this could further limit any speedup.
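To see how a change in setup affects the end-to-end rate, it can help to time the full prediction call directly. This is a rough sketch: `predict_fn` is a hypothetical zero-argument wrapper around whatever call produces a single action, and the warmup and run counts are arbitrary.

```python
import time


def actions_per_second(predict_fn, n_warmup=2, n_runs=10):
    """Estimate how many single-action predictions complete per second.

    predict_fn is a hypothetical zero-argument callable that runs one full
    prediction (image encoding plus action generation) and returns only
    after the result is available on the host. If it returns while GPU work
    is still in flight, the measured rate will be too optimistic.
    """
    # Warmup runs absorb one-time costs such as kernel compilation and
    # memory allocation so they do not distort the measurement.
    for _ in range(n_warmup):
        predict_fn()

    start = time.perf_counter()
    for _ in range(n_runs):
        predict_fn()
    elapsed = time.perf_counter() - start

    return n_runs / elapsed
```

Comparing this number for single-GPU loading against `device_map="auto"` loading gives a direct answer to whether splitting the model changes the rate at batch size 1.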
skaramcheti changed discussion status to closed