Tags: Robotics, LeRobot, Safetensors, dynamicvla

Model Card for DynamicVLA

DynamicVLA is a vision-language-action model for dynamic object manipulation. It is designed to handle dynamic scenes that require fast perception, temporal anticipation, and continuous control.

This model is trained and evaluated using the official DynamicVLA codebase. For full setup, training, and benchmarking instructions, please refer to the repository README.
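The weights hosted with this model card can be pulled from the Hugging Face Hub before running anything locally. A minimal sketch, assuming huggingface_hub is installed; the local directory name is only an illustration, and passing the resulting directory to the inference script's -p flag assumes the script accepts a Hub-style checkpoint directory:

from huggingface_hub import snapshot_download

# Download the DynamicVLA checkpoint published as hzxie/dynamic-vla-DOM.
# local_dir is an arbitrary, illustrative path; omit it to use the default HF cache.
ckpt_dir = snapshot_download(
    repo_id="hzxie/dynamic-vla-DOM",
    local_dir="checkpoints/dynamic-vla-DOM",
)
print("Checkpoint downloaded to:", ckpt_dir)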


How to Get Started with the Model

For a complete walkthrough, see the official DynamicVLA repository. The commands below are the short version of the two common workflows: training from scratch and running inference/evaluation.

Train from scratch

From the PROJECT_ROOT/dynamic-vla directory, run:

torchrun --nnodes=1 --nproc_per_node=8 --standalone run.py \
  -c configs/dynamicvla.yaml \
  -d hzxie/DOM
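The -d flag points at the DOM dataset on the Hugging Face Hub. Whether the training code downloads it automatically on first use depends on the codebase; pre-fetching it is harmless either way. A minimal sketch, assuming hzxie/DOM is a Hub dataset repository, huggingface_hub is installed, and the local path is purely illustrative:

from huggingface_hub import snapshot_download

# Pre-fetch the DOM dataset referenced by the -d flag above.
dataset_dir = snapshot_download(
    repo_id="hzxie/DOM",
    repo_type="dataset",
    local_dir="data/DOM",  # illustrative local path
)
print("DOM dataset cached at:", dataset_dir)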

Evaluate the policy / run inference

# 1. start evaluation server
python3 simulations/evaluate.py \
  --scene_dir ../scenes \
  --output_dir ../output/evaluation \
  --env_cfg ../test-envs.txt \
  --enable_cameras --headless -n 20 --save

# 2. run policy inference
python3 scripts/inference.py \
  -p /path/to/vla-checkpoint \
  -r euler -d -s
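
Since the weights are stored in Safetensors format, the checkpoint can be sanity-checked before inference. A minimal sketch, assuming the checkpoint directory contains a model.safetensors file (the filename follows a common convention and is not confirmed by the repository):

from safetensors import safe_open

# Inspect tensor names and shapes without loading the full weights into memory.
# Replace the path with wherever your checkpoint actually lives.
with safe_open("checkpoints/dynamic-vla-DOM/model.safetensors",
               framework="pt", device="cpu") as f:
    for name in list(f.keys())[:10]:  # first ten entries are enough for a spot check
        print(name, tuple(f.get_slice(name).get_shape()))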
