# OpenVLA-OFT -- color_object Checkpoint

Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success.

Paper: https://arxiv.org/abs/2502.19645

Project: https://openvla-oft.github.io/

## Repository Structure

```
checkpoints/
  color_object/
    model-0000{1..4}-of-00004.safetensors    # merged LLM weights (step 50000)
    action_head--50000_checkpoint.pt         # MLP action head
    proprio_projector--50000_checkpoint.pt   # proprio projector
    config.json / tokenizer* / ...           # model config and tokenizer files
    lora_adapter/
      adapter_model.safetensors              # LoRA adapter weights
      adapter_config.json
prismatic/                  # model architecture, dataset, training code
vla-scripts/                # finetune.py, deploy.py, merge_lora_weights_and_save.py
experiments/                # eval scripts for LIBERO, ALOHA
slurm_scripts/              # SLURM finetune scripts for all conflict splits
finetune_color_object.sh    # exact script used to produce the checkpoint
finetune.md                 # step-by-step fine-tuning guide
SETUP.md / LIBERO.md / ALOHA.md
```

## Quick Inference

See `finetune.md` for the full loading example.

```python
from experiments.robot.openvla_utils import (
    get_action_head,
    get_processor,
    get_proprio_projector,
    get_vla,
    get_vla_action,
)
from experiments.robot.libero.run_libero_eval import GenerateConfig
from prismatic.vla.constants import NUM_ACTIONS_CHUNK, PROPRIO_DIM

cfg = GenerateConfig(
    pretrained_checkpoint="checkpoints/color_object",
    use_l1_regression=True,
    use_film=False,
    num_images_in_input=2,
    use_proprio=True,
    center_crop=True,
    num_open_loop_steps=NUM_ACTIONS_CHUNK,
    unnorm_key="conflict_maniskill",
)

# Load the model, input processor, action head, and proprio projector.
vla = get_vla(cfg)
processor = get_processor(cfg)
action_head = get_action_head(cfg, llm_dim=vla.llm_dim)
proprio_projector = get_proprio_projector(cfg, llm_dim=vla.llm_dim, proprio_dim=PROPRIO_DIM)

# `observation` is not defined in this snippet: supply your own observation
# dict, which must include a "task_description" entry.
actions = get_vla_action(
    cfg,
    vla,
    processor,
    observation,
    observation["task_description"],
    action_head,
    proprio_projector,
)
```

## Fine-tuning

See `finetune.md` for the complete fine-tuning guide.

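Before fine-tuning from or running inference on the checkpoint, it can help to confirm that the files listed under Repository Structure are actually present. The sketch below is one way to do that with the standard library only. `missing_checkpoint_files` and `EXPECTED_FILES` are hypothetical names introduced for illustration and are not part of the OpenVLA-OFT codebase; the filenames are taken from the listing above, and the tokenizer and LoRA adapter files are left out for brevity.

```python
from pathlib import Path

# Filenames copied from the Repository Structure listing (step 50000).
EXPECTED_FILES = (
    [f"model-{i:05d}-of-00004.safetensors" for i in range(1, 5)]
    + [
        "action_head--50000_checkpoint.pt",
        "proprio_projector--50000_checkpoint.pt",
        "config.json",
    ]
)


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected filenames that are absent from checkpoint_dir.

    Hypothetical helper for illustration; not part of OpenVLA-OFT.
    """
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoint_files("checkpoints/color_object")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected checkpoint files are present.")
```

Running the script from the repository root prints any missing filenames, which is usually quicker to diagnose than a loading error raised partway through model initialization.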
## Citation

```bibtex
@article{kim2025openvlaoft,
  title   = {Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success},
  author  = {Kim, Moo Jin and Pertsch, Karl and Ghosh, Dibya and Walke, Homer and Bahl, Shikhar and Levine, Sergey and Finn, Chelsea},
  journal = {arXiv preprint arXiv:2502.19645},
  year    = {2025}
}
```