
SmolVLA-OMY Model Checkpoints

This repository contains training checkpoints for a SmolVLA (Small Vision-Language-Action) model trained on the ArrangeVegetables task.

Model Details

  • Model Type: SmolVLA (Vision-Language-Action model)
  • Task: ArrangeVegetables manipulation task
  • Training Steps: 20,000
  • Batch Size: 350
  • Chunk Size: 5 action steps
  • Input Features:
    • Visual observations: 256x256 RGB images from the main and wrist cameras
    • State observations: 6-dimensional state vector
  • Output Features: 12-dimensional action space
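The interface above can be sketched as follows. This is an illustrative shape summary only; the dictionary keys and function name are hypothetical, not the actual SmolVLA API.

```python
# Hypothetical sketch of the observation/action interface described above.
# Key names are illustrative placeholders, not the model's real feature names.
OBS_SHAPES = {
    "observation.images.main": (3, 256, 256),   # RGB main camera
    "observation.images.wrist": (3, 256, 256),  # RGB wrist camera
    "observation.state": (6,),                  # 6-dimensional state vector
}
ACTION_DIM = 12   # 12-dimensional action space
CHUNK_SIZE = 5    # action steps predicted per inference call

def action_chunk_shape():
    """Shape of one predicted action chunk: (chunk_size, action_dim)."""
    return (CHUNK_SIZE, ACTION_DIM)
```

With a chunk size of 5, each forward pass yields a (5, 12) block of actions that the controller can execute before querying the model again.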

Checkpoint Structure

The repository contains checkpoints saved at different training steps:

  • 000500/: Checkpoint at 500 steps
  • 001000/: Checkpoint at 1,000 steps
  • 001500/: Checkpoint at 1,500 steps
  • 002000/: Checkpoint at 2,000 steps

Each checkpoint contains:

  • pretrained_model/: Model weights and configuration
  • training_state/: Optimizer state, scheduler state, and training metadata
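Because the checkpoint directories are zero-padded step counts, the most recent one can be picked by numeric sort. A minimal sketch (the helper name is our own, not part of any framework):

```python
from pathlib import Path

def latest_checkpoint(root="."):
    """Return the highest-numbered checkpoint directory (e.g. '002000').

    Relies on the convention above: each checkpoint is a directory whose
    name is the zero-padded training step at which it was saved.
    """
    dirs = [d for d in Path(root).iterdir() if d.is_dir() and d.name.isdigit()]
    if not dirs:
        raise FileNotFoundError(f"no checkpoint directories found under {root}")
    return max(dirs, key=lambda d: int(d.name))
```

The model weights would then live under `latest_checkpoint()/pretrained_model/`.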

Training Configuration

  • Device: CUDA
  • Seed: 42
  • Workers: 24
  • Evaluation Frequency: Every 5 steps
  • Logging Frequency: Every step
  • Image Resize: 512x512 with padding
  • Normalization: Identity for visual, mean-std for state/action
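The normalization scheme above means images pass through unchanged while state and action vectors are standardized with per-dimension statistics. A minimal sketch of the mean-std step (plain Python, not the framework's implementation; the `eps` guard is our own assumption):

```python
def mean_std_normalize(x, mean, std, eps=1e-8):
    """Per-dimension mean-std normalization, as applied to state/action features.

    Visual observations use identity normalization, i.e. they are passed
    through unchanged. `eps` guards against zero standard deviations.
    """
    return [(xi - m) / (s + eps) for xi, m, s in zip(x, mean, std)]
```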

Usage

To load a checkpoint:

# Placeholder: substitute the loader from the framework used for training
# (e.g. the policy-loading API of your LeRobot-style training stack).
from your_training_framework import load_checkpoint

# Load the latest checkpoint (2,000 steps)
model = load_checkpoint("./002000/pretrained_model/")

Dataset

Trained on the ArrangeVegetables dataset available at: lava8888/ArrangeVegetables
