OpenVLA Fine-tuned Model - LAMP Search Dataset

Model Information

  • Base Model: openvla/openvla-7b
  • Dataset: lampe_search_dataset/all
  • Fine-tuning Method: LoRA (Low-Rank Adaptation); a configuration sketch follows this list
  • Training Configuration:
    • Batch Size: 4 (gradient accumulation steps: 4)
    • Effective Batch Size: 16
    • Learning Rate: 5e-4
    • Max Steps: 3000
    • LoRA Rank: 32
    • LoRA Dropout: 0.0
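
For reference, the sketch below shows a PEFT LoraConfig matching the hyperparameters above. The lora_alpha, target_modules, and init_lora_weights values are assumptions carried over from OpenVLA's reference finetune.py, not confirmed values; check training_scripts/finetune.py for what was actually used.

from peft import LoraConfig, get_peft_model

# LoRA adapter config matching the training settings above.
# lora_alpha, target_modules, and init_lora_weights are assumptions
# (OpenVLA's reference script uses min(rank, 16), "all-linear", and
# "gaussian"); see training_scripts/finetune.py for the real values.
lora_config = LoraConfig(
    r=32,                          # LoRA rank
    lora_alpha=16,                 # assumed: min(rank, 16)
    lora_dropout=0.0,
    target_modules="all-linear",   # assumed: adapt all linear layers
    init_lora_weights="gaussian",  # assumed
)
# vla = get_peft_model(base_vla, lora_config)  # base_vla: the loaded openvla-7b model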

Dataset

This model was fine-tuned on the LAMP Search dataset:

  • Action Space: 4-DoF (Base, Joint2, Joint3, Joint4)
  • Action Type: Absolute joint positions (setpoints); an illustrative step layout follows this list
  • Task: Search for a person/lamp in the room by scanning
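
To make the data format concrete, here is a hypothetical sketch of a single step after RLDS conversion. The field names and the 224x224 image size are illustrative assumptions; see training_scripts/dataset_conversion/lampe_search_dataset.py for the actual schema.

import numpy as np

# Hypothetical RLDS step layout (illustrative field names only):
step = {
    "observation": {"image": np.zeros((224, 224, 3), dtype=np.uint8)},  # RGB camera frame
    "action": np.zeros(4, dtype=np.float32),  # (base, joint2, joint3, joint4) absolute setpoints
    "language_instruction": "search for a person in the room by scanning the room and stop when you find",
}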

Usage

from transformers import AutoModelForVision2Seq, AutoProcessor
from PIL import Image
import torch

# Load model and processor
processor = AutoProcessor.from_pretrained("kavinrajkrupsurge/lampe-sim-data-openvla", trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(
    "kavinrajkrupsurge/lampe-sim-data-openvla",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
).to("cuda")

# Prepare inputs (the instruction phrasing follows the training task)
instruction = "search for a person in the room by scanning the room and stop when you find"
image = Image.open("path/to/image.jpg")
prompt = f"In: What action should the robot take to {instruction.lower()}?\nOut:"

inputs = processor(prompt, image).to("cuda", dtype=torch.bfloat16)

# Predict action
with torch.inference_mode():
    action = model.predict_action(
        **inputs,
        unnorm_key="lampe_search_dataset/all",
        do_sample=False
    )
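
predict_action returns the un-normalized action as a NumPy array. The snippet below is a hypothetical sketch of consuming it as joint setpoints; send_joint_setpoints is a placeholder for your robot's control API, not a real function.

# The action is a 4-element array of absolute joint positions,
# ordered (base, joint2, joint3, joint4) per the dataset spec.
base, joint2, joint3, joint4 = action

# These are absolute setpoints, not deltas, so command the positions directly.
# send_joint_setpoints(base, joint2, joint3, joint4)  # placeholder robot API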

Files Included

  • model.safetensors - Merged model weights (LoRA adapters merged into the base model)
  • dataset_statistics.json - Dataset statistics for action un-normalization
  • config.json - Model configuration
  • All processor/tokenizer files
  • training_scripts/ - All scripts used for dataset conversion and fine-tuning

Training Scripts

This repository includes all scripts necessary to reproduce the training:

Dataset Conversion

  • training_scripts/dataset_conversion/convert_lampe_search.py - Convert raw dataset to RLDS format
  • training_scripts/dataset_conversion/lampe_search_dataset.py - RLDS dataset builder (skeleton sketched after this list)
  • training_scripts/update_lampe_search_instructions.py - Update metadata with instructions
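
As rough orientation, RLDS dataset builders follow the TFDS GeneratorBasedBuilder pattern; a minimal skeleton (method bodies elided) is sketched below, while lampe_search_dataset.py holds the real implementation.

import tensorflow_datasets as tfds

class LampeSearchDataset(tfds.core.GeneratorBasedBuilder):
    """Skeleton of an RLDS builder (illustrative, not the shipped code)."""

    VERSION = tfds.core.Version("1.0.0")

    def _info(self):
        ...  # declares the observation/action/instruction feature spec

    def _split_generators(self, dl_manager):
        ...  # maps the raw data to the "all" split

    def _generate_examples(self):
        ...  # yields (episode_id, {"steps": [...]}) pairs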

Fine-tuning

  • training_scripts/finetune.py - Main fine-tuning script (modified OpenVLA finetune.py)
  • training_scripts/train_lampe_search.sh - Training command script
  • training_scripts/convert_and_finetune_lampe_search.sh - Complete pipeline script

Dataset Configuration

  • dataset_configs/configs.py - Dataset configuration mappings (an illustrative entry is sketched after this list)
  • dataset_configs/transforms_note.txt - Dataset transform functions reference
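
The registry entry ties the dataset name to its observation keys and action encoding. The sketch below is a hypothetical OXE-style entry with illustrative keys and values; consult dataset_configs/configs.py for the real mapping.

# Hypothetical registry entry (illustrative keys, OXE-style format):
DATASET_CONFIGS = {
    "lampe_search_dataset": {
        "image_obs_keys": {"primary": "image", "secondary": None, "wrist": None},
        "state_obs_keys": [None],
        # 4-DoF absolute joint positions (base, joint2, joint3, joint4)
        "action_encoding": "JOINT_POS",
    },
}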

Reproducing Training

  1. Prepare Dataset: Convert the raw data with the scripts in training_scripts/dataset_conversion/
  2. Run Fine-tuning: Use training_scripts/train_lampe_search.sh or the Python scripts
  3. Check Configuration: See dataset_configs/ for dataset-specific settings

Notes

  • The model uses absolute joint positions (not deltas)
  • Action normalization is handled automatically using dataset_statistics.json (see the sketch after this list)
  • Use unnorm_key="lampe_search_dataset/all" when calling predict_action()
  • All training was done with LoRA (Low-Rank Adaptation) for efficient fine-tuning
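
For transparency, the sketch below mirrors the un-normalization that predict_action performs internally, assuming OpenVLA's convention of rescaling model outputs from [-1, 1] with the q01/q99 statistics in dataset_statistics.json; the key names are assumptions based on the OpenVLA statistics format.

import json
import numpy as np

# Assumed statistics layout: a per-dataset "action" entry holding
# q01/q99 quantiles that bound the un-normalization.
with open("dataset_statistics.json") as f:
    stats = json.load(f)["lampe_search_dataset/all"]["action"]

low, high = np.asarray(stats["q01"]), np.asarray(stats["q99"])
normalized = np.zeros(4)  # placeholder for a model output in [-1, 1]
action = 0.5 * (normalized + 1.0) * (high - low) + low  # absolute joint setpoints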