Upload folder using huggingface_hub

Files changed (5)

README.md +181 -0
evaluate_pi05.py +122 -0
so101_config.py +117 -0
so101_policy.py +109 -0
test_config_local.py +275 -0

README.md ADDED

	@@ -0,0 +1,181 @@

+# Pi0.5 Fine-tuning for SO-101
+Fine-tune Physical Intelligence's Pi0.5 on the SO-101 ball-in-cup task.
+## Overview
+| Item | Value |
+|------|-------|
+| **Base Model** | Pi0.5 (`gs://openpi-assets/checkpoints/pi05_base`) |
+| **Dataset** | `abdul004/so101_ball_in_cup_v5` (72 episodes) |
+| **GPU Required** | A100 80GB (~$1.50/hr on Vast.ai) |
+| **Training Time** | ~2-3 hours for 5K steps |
+## Files in This Directory
+```
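+pi0_so101/
+├── README.md             # This file
+├── so101_policy.py       # Input/output transforms (copy to openpi/src/openpi/policies/)
+├── so101_config.py       # Config template (add to openpi/src/openpi/training/config.py)
+├── evaluate_pi05.py      # Inference loop for the physical robot
+└── test_config_local.py  # Local checks of dataset keys, shapes, and transforms (no GPU needed)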
+```
+## Step-by-Step Setup on Vast.ai
+### 1. Rent GPU Instance
+On [Vast.ai](https://vast.ai), search for:
+- **GPU:** A100 80GB or H100
+- **Disk:** 100GB+
+- **Image:** Any with CUDA (PyTorch image works)
+### 2. SSH and Clone OpenPi
+```bash
+# Clone with submodules
+git clone --recurse-submodules https://github.com/Physical-Intelligence/openpi.git
+cd openpi
+# Install uv package manager
+curl -LsSf https://astral.sh/uv/install.sh | sh
+source $HOME/.local/bin/env
+# Install dependencies
+GIT_LFS_SKIP_SMUDGE=1 uv sync
+# Login to HuggingFace (for dataset access)
+huggingface-cli login
+```
+### 3. Add SO-101 Config
+```bash
+# Copy policy file
+# (upload so101_policy.py from your local machine, or create it)
+cp /path/to/so101_policy.py src/openpi/policies/so101_policy.py
+```
+Then edit `src/openpi/training/config.py`:
+**Add import at top:**
+```python
+import openpi.policies.so101_policy as so101_policy
+```
+**Add DataConfig class** (after `LeRobotLiberoDataConfig`):
+```python
+@dataclasses.dataclass(frozen=True)
+class LeRobotSO101DataConfig(DataConfigFactory):
+    @override
+    def create(self, assets_dirs: pathlib.Path, model_config: _model.BaseModelConfig) -> DataConfig:
+        repack_transform = _transforms.Group(
+            inputs=[
+                _transforms.RepackTransform({
+                    "observation/images/overhead": "observation.images.overhead",
+                    "observation/images/wrist": "observation.images.wrist",
+                    "observation/state": "observation.state",
+                    "action": "action",
+                    "prompt": "prompt",
+                })
+            ]
+        )
+        data_transforms = _transforms.Group(
+            inputs=[so101_policy.SO101Inputs(
+                action_dim=model_config.action_dim,
+                model_type=model_config.model_type
+            )],
+            outputs=[so101_policy.SO101Outputs()],
+        )
+        # Delta mask: 5 joints = delta, gripper = absolute
+        delta_action_mask = _transforms.make_bool_mask(5, -1)
+        data_transforms = data_transforms.push(
+            inputs=[_transforms.DeltaActions(delta_action_mask)],
+            outputs=[_transforms.AbsoluteActions(delta_action_mask)],
+        )
+        model_transforms = ModelTransformFactory()(model_config)
+        return dataclasses.replace(
+            self.create_base_config(assets_dirs, model_config),
+            repack_transforms=repack_transform,
+            data_transforms=data_transforms,
+            model_transforms=model_transforms,
+            action_sequence_keys=("action",),
+        )
+```
+**Add TrainConfig** to `_CONFIGS` list:
+```python
+TrainConfig(
+    name="pi05_so101",
+    model=pi0_config.Pi0Config(pi05=True, action_horizon=15),
+    data=LeRobotSO101DataConfig(
+        repo_id="abdul004/so101_ball_in_cup_v5",
+        base_config=DataConfig(prompt_from_task=True),
+    ),
+    weight_loader=weight_loaders.CheckpointWeightLoader(
+        "gs://openpi-assets/checkpoints/pi05_base/params"
+    ),
+    num_train_steps=5_000,
+    batch_size=32,
+),
+```
+### 4. Compute Normalization Stats
+```bash
+uv run scripts/compute_norm_stats.py --config-name pi05_so101
+```
+### 5. Train
+```bash
+XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run scripts/train.py pi05_so101 --exp-name=ball_in_cup
+```
+Training progress will be logged to console and Weights & Biases.
+### 6. Download Checkpoint
+After training, checkpoints are saved to `checkpoints/pi05_so101/ball_in_cup/`.
+Download to your local machine:
+```bash
+# On your local machine
+scp -r vast_instance:openpi/checkpoints/pi05_so101/ball_in_cup/5000 ./pi05_so101_checkpoint
+```
+## Inference on Robot
+`evaluate_pi05.py` contains a draft inference loop; it still needs to be adapted to the LeRobot inference script.
+## Key Adaptations from LeKiwi
+| Aspect | LeKiwi | SO-101 |
+|--------|--------|--------|
+| Action dim | 9 | 6 |
+| Cameras | 3 (top, wrist, front) | 2 (overhead, wrist) |
+| Camera keys | `observation.images.top` | `observation.images.overhead` |
+| Delta mask | `make_bool_mask(5, -4)` | `make_bool_mask(5, -1)` |
+## Troubleshooting
+### Out of Memory
+Raise the XLA memory fraction so that JAX can use more of the GPU's memory:
+```bash
+XLA_PYTHON_CLIENT_MEM_FRACTION=0.95 uv run scripts/train.py ...
+```
+### Dataset Not Found
+Make sure you're logged into HuggingFace:
+```bash
+huggingface-cli login
+```
+### Missing Norm Stats
+Run `compute_norm_stats.py` before training:
+```bash
+uv run scripts/compute_norm_stats.py --config-name pi05_so101
+```
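The delta mask that the README contrasts between the two robots (`make_bool_mask(5, -1)` for SO-101, `make_bool_mask(5, -4)` for LeKiwi) can be illustrated with a small NumPy sketch. The helper names `bool_mask`, `to_delta`, and `to_absolute` are hypothetical stand-ins written for this note, not OpenPi's own `make_bool_mask`, `DeltaActions`, or `AbsoluteActions`. The sketch assumes that a positive argument marks delta dimensions and a negative one marks absolute dimensions, as the comments in `so101_config.py` describe.

```python
import numpy as np

def bool_mask(*dims: int) -> np.ndarray:
    """Positive n -> n True entries, negative n -> |n| False entries (assumed convention)."""
    mask = []
    for n in dims:
        mask.extend([n > 0] * abs(n))
    return np.array(mask, dtype=bool)

def to_delta(actions: np.ndarray, state: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Subtract the current state from the masked dimensions of every action in the chunk."""
    return actions - np.where(mask, state, 0.0)

def to_absolute(deltas: np.ndarray, state: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Inverse of to_delta: add the state back onto the masked dimensions."""
    return deltas + np.where(mask, state, 0.0)

so101_mask = bool_mask(5, -1)    # 5 arm joints as deltas, gripper absolute
lekiwi_mask = bool_mask(5, -4)   # 5 delta dimensions, 4 absolute dimensions

state = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.9])
actions = np.tile(state, (15, 1)) + 0.05   # chunk of 15 absolute actions, shape (15, 6)

deltas = to_delta(actions, state, so101_mask)
recovered = to_absolute(deltas, state, so101_mask)
print(so101_mask.tolist())
print(np.allclose(recovered, actions))
```

The gripper column of `deltas` keeps its absolute value while the five joint columns become offsets from the current state, which is why the output transform has to add the state back before actions are sent to the robot.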

evaluate_pi05.py ADDED

	@@ -0,0 +1,122 @@

+#!/usr/bin/env python3
+"""
+Pi0.5 Inference for SO-101 Robot
+Adapted from Ilia Larchenko's LeKiwi evaluation script
+Usage:
+    python evaluate_pi05.py --checkpoint checkpoints/pi05_so101/ball_in_cup/5000
+"""
+import argparse
+import time
+DEFAULT_PROMPT = "pick up the orange ball and put it in the pink cup"
+def run_inference(checkpoint_path: str, robot_type: str = "so101", prompt: str = DEFAULT_PROMPT):
+    """Run Pi0.5 inference on the SO-101 robot."""
+    # Import OpenPi (only when running inference)
+    from openpi.policies import policy_config
+    from openpi.training import config as _config
+    # Import LeRobot for robot control
+    # NOTE: verify this import path and class name against the installed LeRobot version
+    from lerobot.common.robot_devices.robots.so101 import SO101Robot
+    print(f"Loading checkpoint from: {checkpoint_path}")
+    # Load the fine-tuned model. create_trained_policy needs the training config as well as
+    # the checkpoint step directory; "pi05_so101" is the config name registered in config.py.
+    train_config = _config.get_config("pi05_so101")
+    policy = policy_config.create_trained_policy(train_config, checkpoint_path)
+    # Connect to robot
+    print("Connecting to SO-101 robot...")
+    robot = SO101Robot()
+    robot.connect()
+    # Inference parameters
+    FPS = 30  # Match training FPS
+    # Actions executed per predicted chunk. This equals action_horizon=15 from the
+    # training config; lower it to replan more often.
+    ACTIONS_TO_EXECUTE = 15
+    print(f"Task: {prompt}")
+    print(f"FPS: {FPS}, Actions per chunk: {ACTIONS_TO_EXECUTE}")
+    print("Starting inference loop... Press Ctrl+C to stop")
+    try:
+        action_queue = []
+        step = 0
+        while True:
+            loop_start = time.perf_counter()
+            # Get current observation from robot
+            observation = robot.get_observation()
+            # If action queue is empty, get new predictions
+            if len(action_queue) == 0:
+                # Prepare observation using the keys that SO101Inputs expects
+                obs_dict = {
+                    "observation/state": observation["state"],
+                    "observation/images/overhead": observation["images"]["overhead"],
+                    "observation/images/wrist": observation["images"]["wrist"],
+                    "prompt": prompt,
+                }
+                # Run inference
+                inference_start = time.perf_counter()
+                predicted_actions = policy.infer(obs_dict)["actions"]
+                inference_time = time.perf_counter() - inference_start
+                # Queue at most ACTIONS_TO_EXECUTE actions from the predicted chunk
+                action_queue = list(predicted_actions[:ACTIONS_TO_EXECUTE])
+                print(f"Step {step}: Inference took {inference_time*1000:.0f}ms, queued {len(action_queue)} actions")
+            # Execute next action
+            action = action_queue.pop(0)
+            robot.send_action(action)
+            step += 1
+            # Maintain FPS
+            elapsed = time.perf_counter() - loop_start
+            sleep_time = max(0, (1.0 / FPS) - elapsed)
+            time.sleep(sleep_time)
+    except KeyboardInterrupt:
+        print("\nStopping...")
+    finally:
+        robot.disconnect()
+        print("Robot disconnected")
+def main():
+    parser = argparse.ArgumentParser(description="Run Pi0/Pi0.5 on SO-101 robot")
+    parser.add_argument(
+        "--checkpoint",
+        type=str,
+        required=True,
+        help="Path to fine-tuned checkpoint step directory (e.g., checkpoints/pi05_so101/ball_in_cup/5000)"
+    )
+    parser.add_argument(
+        "--robot",
+        type=str,
+        default="so101",
+        choices=["so101"],
+        help="Robot type"
+    )
+    parser.add_argument(
+        "--prompt",
+        type=str,
+        default=DEFAULT_PROMPT,
+        help="Task prompt for the model"
+    )
+    args = parser.parse_args()
+    # Forward the --prompt value so it is actually used by the inference loop
+    run_inference(args.checkpoint, args.robot, args.prompt)
+if __name__ == "__main__":
+    main()
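The action-queue logic in `evaluate_pi05.py` can be exercised without a robot or a checkpoint. The sketch below is a minimal illustration under stated assumptions: `FakePolicy` and `run_chunked` are hypothetical names, and the fake policy only mimics the `infer(obs)["actions"]` call shape used in the script. It shows how often the model is queried for a given number of control steps and chunk size.

```python
from collections import deque

import numpy as np

class FakePolicy:
    """Hypothetical stand-in for the trained policy; returns a fixed-size action chunk."""

    def __init__(self, horizon: int = 15, action_dim: int = 6):
        self.horizon = horizon
        self.action_dim = action_dim
        self.calls = 0

    def infer(self, obs: dict) -> dict:
        self.calls += 1
        # Fill chunk k with the value k so that chunks are easy to tell apart.
        return {"actions": np.full((self.horizon, self.action_dim), float(self.calls))}

def run_chunked(policy, num_steps: int, actions_to_execute: int) -> np.ndarray:
    """Execute num_steps actions, querying the policy only when the queue is empty."""
    queue = deque()
    executed = []
    for _ in range(num_steps):
        if not queue:
            chunk = policy.infer({"prompt": "demo"})["actions"]
            queue.extend(chunk[:actions_to_execute])
        executed.append(queue.popleft())
    return np.stack(executed)

policy = FakePolicy()
trajectory = run_chunked(policy, num_steps=35, actions_to_execute=15)
print(policy.calls, trajectory.shape)
```

A `deque` is used here instead of `list.pop(0)` only because popping from the left of a deque is constant-time; for queues of 15 actions either choice behaves the same.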

so101_config.py ADDED

	@@ -0,0 +1,117 @@

+# SO-101 Training Config for OpenPi Pi0.5
+# Adapted from Ilia Larchenko's LeKiwi config
+#
+# HOW TO USE:
+# 1. Copy so101_policy.py to openpi/src/openpi/policies/
+# 2. Add the imports and config class below to openpi/src/openpi/training/config.py
+# 3. Add the TrainConfig to the _CONFIGS list in config.py
+# 4. Run: uv run scripts/compute_norm_stats.py --config-name pi05_so101
+# 5. Run: XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run scripts/train.py pi05_so101 --exp-name=my_experiment
+# =============================================================================
+# ADD THIS IMPORT to the top of config.py:
+# =============================================================================
+# import openpi.policies.so101_policy as so101_policy
+# =============================================================================
+# ADD THIS CLASS to config.py (after the other DataConfig classes):
+# =============================================================================
+"""
+@dataclasses.dataclass(frozen=True)
+class LeRobotSO101DataConfig(DataConfigFactory):
+    '''
+    Data config for SO-101 ball-in-cup task.
+    Dataset: abdul004/so101_ball_in_cup_v5
+    - 72 episodes of teleoperated demonstrations
+    - 6 DOF actions (5 arm joints + 1 gripper)
+    - 2 cameras (overhead + wrist)
+    '''
+    @override
+    def create(self, assets_dirs: pathlib.Path, model_config: _model.BaseModelConfig) -> DataConfig:
+        # Remap LeRobot dataset keys to OpenPi format
+        # Left side = OpenPi expected keys, Right side = LeRobot dataset keys
+        repack_transform = _transforms.Group(
+            inputs=[
+                _transforms.RepackTransform(
+                    {
+                        "observation/images/overhead": "observation.images.overhead",
+                        "observation/images/wrist": "observation.images.wrist",
+                        "observation/state": "observation.state",
+                        "action": "action",
+                        "prompt": "prompt",
+                    }
+                )
+            ]
+        )
+        # Data transforms using SO-101 policy
+        data_transforms = _transforms.Group(
+            inputs=[so101_policy.SO101Inputs(
+                action_dim=model_config.action_dim,
+                model_type=model_config.model_type
+            )],
+            outputs=[so101_policy.SO101Outputs()],
+        )
+        # Delta action mask:
+        # - First 5 dimensions (arm joints): convert to delta actions
+        # - Last 1 dimension (gripper): keep absolute
+        # make_bool_mask(5, -1) = [True, True, True, True, True, False]
+        delta_action_mask = _transforms.make_bool_mask(5, -1)
+        data_transforms = data_transforms.push(
+            inputs=[_transforms.DeltaActions(delta_action_mask)],
+            outputs=[_transforms.AbsoluteActions(delta_action_mask)],
+        )
+        # Model transforms (tokenization, etc.) - standard, no changes needed
+        model_transforms = ModelTransformFactory()(model_config)
+        return dataclasses.replace(
+            self.create_base_config(assets_dirs, model_config),
+            repack_transforms=repack_transform,
+            data_transforms=data_transforms,
+            model_transforms=model_transforms,
+            action_sequence_keys=("action",),  # LeRobot uses "action" not "actions"
+        )
+"""
+# =============================================================================
+# ADD THIS TrainConfig to the _CONFIGS list in config.py:
+# =============================================================================
+"""
+    TrainConfig(
+        name="pi05_so101",
+        model=pi0_config.Pi0Config(
+            pi05=True,
+            action_horizon=15,  # Shorter horizon for Pi0.5
+        ),
+        data=LeRobotSO101DataConfig(
+            repo_id="abdul004/so101_ball_in_cup_v5",
+            base_config=DataConfig(prompt_from_task=True),
+        ),
+        weight_loader=weight_loaders.CheckpointWeightLoader(
+            "gs://openpi-assets/checkpoints/pi05_base/params"
+        ),
+        num_train_steps=5_000,  # Ilia found 5K sufficient for simple tasks
+        batch_size=32,
+    ),
+"""
+# =============================================================================
+# FULL EXAMPLE: What config.py changes look like
+# =============================================================================
+# Near the top of config.py, add:
+# import openpi.policies.so101_policy as so101_policy
+# After LeRobotLiberoDataConfig class, add the LeRobotSO101DataConfig class above
+# In the _CONFIGS list, add the TrainConfig above
+# Then run:
+# uv run scripts/compute_norm_stats.py --config-name pi05_so101
+# XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run scripts/train.py pi05_so101 --exp-name=ball_in_cup
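The key remapping declared in `LeRobotSO101DataConfig` (OpenPi key on the left, LeRobot dataset key on the right) can be sketched on a flat dictionary. This is a simplified illustration, not OpenPi's `RepackTransform`: the `repack` helper and the placeholder sample values are hypothetical, and only the key names are taken from the config above.

```python
REPACK_MAP = {
    "observation/images/overhead": "observation.images.overhead",
    "observation/images/wrist": "observation.images.wrist",
    "observation/state": "observation.state",
    "action": "action",
    "prompt": "prompt",
}

def repack(sample: dict, mapping: dict) -> dict:
    """Return a new dict with OpenPi-style keys; unmapped dataset keys are dropped."""
    missing = [old for old in mapping.values() if old not in sample]
    if missing:
        raise KeyError(f"dataset sample is missing keys: {missing}")
    return {new: sample[old] for new, old in mapping.items()}

# Placeholder sample: strings and lists stand in for image and tensor data.
sample = {
    "observation.images.overhead": "overhead-frame",
    "observation.images.wrist": "wrist-frame",
    "observation.state": [0.0] * 6,
    "action": [0.0] * 6,
    "prompt": "pick up the orange ball and put it in the pink cup",
    "timestamp": 0.0,
}
repacked = repack(sample, REPACK_MAP)
print(sorted(repacked))
```

Failing loudly on a missing source key is a deliberate choice in this sketch: a renamed camera in the dataset would otherwise only surface later as a shape or lookup error inside the policy transform.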

so101_policy.py ADDED

	@@ -0,0 +1,109 @@

+# SO-101 Policy transforms for OpenPi Pi0.5
+# Adapted from Ilia Larchenko's LeKiwi implementation
+# https://github.com/IliaLarchenko/lerobot_random/blob/main/vla/pi/lekiwi_policy.py
+#
+# Copy this file to: openpi/src/openpi/policies/so101_policy.py
+import dataclasses
+import einops
+import numpy as np
+from openpi import transforms
+from openpi.models import model as _model
+# SO-101 has 6 DOF: 5 arm joints + 1 gripper
+SO101_ACTION_DIM = 6
+def make_so101_example() -> dict:
+    """Creates a random input example for testing SO-101 policy."""
+    return {
+        "observation/state": np.random.rand(SO101_ACTION_DIM).astype(np.float32),
+        "observation/images/overhead": np.random.randint(256, size=(480, 640, 3), dtype=np.uint8),
+        "observation/images/wrist": np.random.randint(256, size=(480, 640, 3), dtype=np.uint8),
+        "prompt": "pick up the orange ball and put it in the pink cup",
+    }
+def _parse_image(image) -> np.ndarray:
+    """Convert image to HWC uint8 format expected by Pi0."""
+    image = np.asarray(image)
+    # LeRobot stores as float32 CHW, convert to uint8 HWC
+    if np.issubdtype(image.dtype, np.floating):
+        image = (255 * image).astype(np.uint8)
+    if image.shape[0] == 3:
+        image = einops.rearrange(image, "c h w -> h w c")
+    return image
+@dataclasses.dataclass(frozen=True)
+class SO101Inputs(transforms.DataTransformFn):
+    """
+    Convert SO-101 observations to Pi0 model input format.
+    SO-101 has:
+    - 6 DOF state (5 arm joints + 1 gripper)
+    - 2 cameras (overhead + wrist)
+    Pi0 expects 3 camera slots, so we duplicate overhead for the third slot.
+    """
+    # Model's action dimension (SO-101 actions will be padded to this)
+    action_dim: int
+    # Model type (PI0, PI05, PI0_FAST)
+    model_type: _model.ModelType = _model.ModelType.PI0
+    def __call__(self, data: dict) -> dict:
+        # Pad state from 6 DOF to model's action_dim
+        state = transforms.pad_to_dim(data["observation/state"], self.action_dim)
+        # Parse images from SO-101's camera keys
+        overhead_image = _parse_image(data["observation/images/overhead"])
+        wrist_image = _parse_image(data["observation/images/wrist"])
+        # Map to Pi0's expected camera slots:
+        # - base_0_rgb: overhead camera (top-down view)
+        # - left_wrist_0_rgb: wrist camera
+        # - right_wrist_0_rgb: duplicate overhead (we only have 2 cameras)
+        inputs = {
+            "state": state,
+            "image": {
+                "base_0_rgb": overhead_image,
+                "left_wrist_0_rgb": wrist_image,
+                "right_wrist_0_rgb": overhead_image,  # Duplicate overhead
+            },
+            "image_mask": {
+                "base_0_rgb": np.True_,
+                "left_wrist_0_rgb": np.True_,
+                # Mask the duplicated view for every model type except PI0_FAST
+                "right_wrist_0_rgb": np.True_ if self.model_type == _model.ModelType.PI0_FAST else np.False_,
+            },
+        }
+        # Pad actions during training
+        if "action" in data:
+            actions = transforms.pad_to_dim(data["action"], self.action_dim)
+            inputs["actions"] = actions
+        # Pass language prompt to model
+        if "prompt" in data:
+            inputs["prompt"] = data["prompt"]
+        return inputs
+@dataclasses.dataclass(frozen=True)
+class SO101Outputs(transforms.DataTransformFn):
+    """
+    Convert Pi0 model outputs back to SO-101 action format.
+    Only return the first 6 action dimensions (5 arm joints + 1 gripper)
+    of every step in the predicted chunk, discarding the padded dimensions.
+    """
+    def __call__(self, data: dict) -> dict:
+        # Keep all time steps, but only the first 6 action dimensions for SO-101
+        return {"actions": np.asarray(data["actions"][:, :SO101_ACTION_DIM])}
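The padding in `SO101Inputs` and the slicing in `SO101Outputs` are inverse operations on the last axis. The sketch below checks that round trip on an action chunk. `pad_last_dim` is a hypothetical helper written for this note rather than OpenPi's `transforms.pad_to_dim`, and the model width of 32 is an assumption taken from `test_config_local.py`.

```python
import numpy as np

MODEL_ACTION_DIM = 32   # assumed model width, as used in test_config_local.py
ROBOT_ACTION_DIM = 6    # 5 arm joints + 1 gripper

def pad_last_dim(x: np.ndarray, target: int) -> np.ndarray:
    """Zero-pad the last axis of x up to `target` entries (no-op if already wide enough)."""
    missing = target - x.shape[-1]
    if missing <= 0:
        return x
    pad_width = [(0, 0)] * (x.ndim - 1) + [(0, missing)]
    return np.pad(x, pad_width)

# A chunk of 15 robot actions, shape (15, 6)
robot_actions = np.arange(15 * ROBOT_ACTION_DIM, dtype=np.float32).reshape(15, ROBOT_ACTION_DIM)

model_actions = pad_last_dim(robot_actions, MODEL_ACTION_DIM)   # what the model trains on
restored = model_actions[:, :ROBOT_ACTION_DIM]                  # what SO101Outputs-style slicing keeps

padded_state = pad_last_dim(np.zeros(ROBOT_ACTION_DIM, dtype=np.float32), MODEL_ACTION_DIM)
print(model_actions.shape, restored.shape, padded_state.shape)
```

Because the padded columns are zeros and are sliced off again on output, the robot only ever receives the six dimensions it can execute.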

test_config_local.py ADDED

	@@ -0,0 +1,275 @@

+#!/usr/bin/env python3
+"""
+Test SO-101 Pi0.5 config locally without GPU.
+This verifies:
+1. Dataset loads correctly
+2. Keys match expected format
+3. Transforms work (simulated)
+4. Shapes are correct for Pi0.5
+Run: python test_config_local.py
+"""
+import numpy as np
+def test_dataset_structure():
+    """Test that dataset has expected structure."""
+    print("=" * 60)
+    print("1. Testing Dataset Structure")
+    print("=" * 60)
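+    # Use LeRobot's dataset loader, which handles videos properly.
+    # The path below is machine-specific: point it at your own LeRobot checkout,
+    # or remove these two lines if the lerobot package is already installed.
+    import sys
+    sys.path.insert(0, "/Users/abdul/repo/lerobot")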
+    from lerobot.datasets.lerobot_dataset import LeRobotDataset
+    # Load dataset (uses local cache)
+    ds = LeRobotDataset("abdul004/so101_ball_in_cup_v5")
+    sample = ds[0]  # Get first sample
+    print(f"\nDataset keys: {list(sample.keys())}")
+    print(f"Total samples: {len(ds)}")
+    # Check expected keys
+    expected_keys = [
+        "action",
+        "observation.state",
+        "observation.images.overhead",
+        "observation.images.wrist",
+        "timestamp",
+        "frame_index",
+        "episode_index",
+    ]
+    for key in expected_keys:
+        if key in sample:
+            val = sample[key]
+            if hasattr(val, 'shape'):
+                print(f"  ✅ {key}: shape={val.shape}, dtype={val.dtype}")
+            elif hasattr(val, '__len__') and not isinstance(val, (str, dict)):
+                print(f"  ✅ {key}: len={len(val)}")
+            else:
+                print(f"  ✅ {key}: {type(val).__name__}")
+        else:
+            print(f"  ❌ {key}: MISSING!")
+    return sample
+def test_image_parsing(sample):
+    """Test image format conversion."""
+    print("\n" + "=" * 60)
+    print("2. Testing Image Parsing")
+    print("=" * 60)
+    import einops
+    def _parse_image(image) -> np.ndarray:
+        """Convert image to HWC uint8 format expected by Pi0."""
+        image = np.asarray(image)
+        original_shape = image.shape
+        original_dtype = image.dtype
+        if np.issubdtype(image.dtype, np.floating):
+            image = (255 * image).astype(np.uint8)
+        if image.shape[0] == 3:
+            image = einops.rearrange(image, "c h w -> h w c")
+        print(f"  Input: shape={original_shape}, dtype={original_dtype}")
+        print(f"  Output: shape={image.shape}, dtype={image.dtype}")
+        return image
+    print("\nOverhead camera:")
+    overhead = _parse_image(sample["observation.images.overhead"])
+    print("\nWrist camera:")
+    wrist = _parse_image(sample["observation.images.wrist"])
+    # Verify final shapes
+    assert overhead.shape[2] == 3, f"Overhead should be HWC, got {overhead.shape}"
+    assert wrist.shape[2] == 3, f"Wrist should be HWC, got {wrist.shape}"
+    assert overhead.dtype == np.uint8, f"Overhead should be uint8, got {overhead.dtype}"
+    assert wrist.dtype == np.uint8, f"Wrist should be uint8, got {wrist.dtype}"
+    print("\n  ✅ Images correctly converted to HWC uint8 format")
+    return overhead, wrist
+def test_state_and_action(sample):
+    """Test state and action dimensions."""
+    print("\n" + "=" * 60)
+    print("3. Testing State and Action Dimensions")
+    print("=" * 60)
+    state = np.asarray(sample["observation.state"])
+    action = np.asarray(sample["action"])
+    print(f"\n  State: shape={state.shape}, values={state}")
+    print(f"  Action: shape={action.shape}, values={action}")
+    # SO-101 should have 6 DOF
+    assert len(state) == 6, f"State should be 6 DOF, got {len(state)}"
+    assert len(action) == 6, f"Action should be 6 DOF, got {len(action)}"
+    print("\n  ✅ State and Action are 6 DOF as expected")
+    # Test padding to model action_dim (Pi0.5 uses 32 by default, but we can use smaller)
+    def pad_to_dim(arr, target_dim):
+        """Pad array to target dimension."""
+        arr = np.asarray(arr)
+        if len(arr) >= target_dim:
+            return arr[:target_dim]
+        return np.pad(arr, (0, target_dim - len(arr)), mode='constant')
+    model_action_dim = 32  # Pi0.5 default
+    padded_state = pad_to_dim(state, model_action_dim)
+    padded_action = pad_to_dim(action, model_action_dim)
+    print(f"\n  Padded state: shape={padded_state.shape}")
+    print(f"  Padded action: shape={padded_action.shape}")
+    print(f"  ✅ Padding to model action_dim={model_action_dim} works")
+    return state, action
+def test_delta_transform(state, action):
+    """Test delta action transformation."""
+    print("\n" + "=" * 60)
+    print("4. Testing Delta Action Transform")
+    print("=" * 60)
+    # Delta mask: first 5 joints = delta, gripper = absolute
+    # make_bool_mask(5, -1) = [True, True, True, True, True, False]
+    delta_mask = [True, True, True, True, True, False]
+    print(f"\n  Delta mask: {delta_mask}")
+    print(f"  (5 joints use delta, gripper stays absolute)")
+    # Simulate delta transform
+    delta_action = np.zeros_like(action)
+    for i, use_delta in enumerate(delta_mask):
+        if use_delta:
+            delta_action[i] = action[i] - state[i]  # Convert to delta
+        else:
+            delta_action[i] = action[i]  # Keep absolute (gripper)
+    print(f"\n  Original action: {action}")
+    print(f"  Current state:   {state}")
+    print(f"  Delta action:    {delta_action}")
+    # Verify we can convert back
+    recovered_action = np.zeros_like(delta_action)
+    for i, use_delta in enumerate(delta_mask):
+        if use_delta:
+            recovered_action[i] = state[i] + delta_action[i]  # Delta to absolute
+        else:
+            recovered_action[i] = delta_action[i]  # Already absolute
+    np.testing.assert_array_almost_equal(action, recovered_action)
+    print(f"  Recovered:       {recovered_action}")
+    print("\n  ✅ Delta transform is reversible")
+def test_repack_transform():
+    """Test the repack transform key mapping."""
+    print("\n" + "=" * 60)
+    print("5. Testing Repack Transform (Key Mapping)")
+    print("=" * 60)
+    # This is what OpenPi's RepackTransform does
+    repack_map = {
+        "observation/images/overhead": "observation.images.overhead",
+        "observation/images/wrist": "observation.images.wrist",
+        "observation/state": "observation.state",
+        "action": "action",
+        "prompt": "prompt",
+    }
+    print("\n  LeRobot key → OpenPi key:")
+    for openpi_key, lerobot_key in repack_map.items():
+        print(f"    {lerobot_key} → {openpi_key}")
+    print("\n  ✅ Key mapping defined correctly")
+def test_pi0_input_format(overhead, wrist, state, action):
+    """Test the final Pi0 input format."""
+    print("\n" + "=" * 60)
+    print("6. Testing Pi0.5 Input Format")
+    print("=" * 60)
+    # Simulate what SO101Inputs produces
+    model_action_dim = 32
+    def pad_to_dim(arr, target_dim):
+        arr = np.asarray(arr)
+        if len(arr) >= target_dim:
+            return arr[:target_dim]
+        return np.pad(arr, (0, target_dim - len(arr)), mode='constant')
+    inputs = {
+        "state": pad_to_dim(state, model_action_dim),
+        "image": {
+            "base_0_rgb": overhead,           # Overhead → base
+            "left_wrist_0_rgb": wrist,        # Wrist → left_wrist
+            "right_wrist_0_rgb": overhead,    # Duplicate overhead
+        },
+        "image_mask": {
+            "base_0_rgb": True,
+            "left_wrist_0_rgb": True,
+            "right_wrist_0_rgb": False,  # Masked for Pi0 (not FAST)
+        },
+        "actions": pad_to_dim(action, model_action_dim),
+        "prompt": "pick up the orange ball and put it in the pink cup",
+    }
+    print("\n  Pi0.5 input structure:")
+    print(f"    state: shape={inputs['state'].shape}")
+    print(f"    image.base_0_rgb: shape={inputs['image']['base_0_rgb'].shape}")
+    print(f"    image.left_wrist_0_rgb: shape={inputs['image']['left_wrist_0_rgb'].shape}")
+    print(f"    image.right_wrist_0_rgb: shape={inputs['image']['right_wrist_0_rgb'].shape}")
+    print(f"    image_mask: {inputs['image_mask']}")
+    print(f"    actions: shape={inputs['actions'].shape}")
+    print(f"    prompt: '{inputs['prompt']}'")
+    print("\n  ✅ Pi0.5 input format is correct!")
+def main():
+    print("\n🧪 Testing SO-101 Pi0.5 Config Locally\n")
+    try:
+        # Test 1: Dataset structure
+        sample = test_dataset_structure()
+        # Test 2: Image parsing
+        overhead, wrist = test_image_parsing(sample)
+        # Test 3: State and action
+        state, action = test_state_and_action(sample)
+        # Test 4: Delta transform
+        test_delta_transform(state, action)
+        # Test 5: Repack transform
+        test_repack_transform()
+        # Test 6: Final Pi0 format
+        test_pi0_input_format(overhead, wrist, state, action)
+        print("\n" + "=" * 60)
+        print("✅ ALL TESTS PASSED!")
+        print("=" * 60)
+        print("\nConfig should work on Vast.ai. Ready to train!")
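+    except Exception as e:
+        print(f"\n❌ TEST FAILED: {e}")
+        import traceback
+        traceback.print_exc()
+        # Exit with a non-zero status so that a failure is visible to scripts and CI
+        raise SystemExit(1)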
+if __name__ == "__main__":
+    main()