How to configure pi0_base to train with a single-camera dataset
Hi,
I'm trying to train pi0_base with the "lerobot/aloha_sim_transfer_cube_human" dataset, which has only one camera input, "observation.images.top". However, pi0 seems to expect three camera inputs:
"observation.images.base_0_rgb",
"observation.images.left_wrist_0_rgb",
"observation.images.right_wrist_0_rgb"
"ValueError: All image features are missing from the batch. At least one expected. (batch: dict_keys(['action', 'next.reward', 'next.done', 'next.truncated', 'info', 'action_is_pad', 'task', 'index', 'task_index', 'observation.images.top', 'observation.state', 'observation.language.tokens', 'observation.language.attention_mask'])) (image_features: {'observation.images.base_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224)), 'observation.images.left_wrist_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224)), 'observation.images.right_wrist_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224))}) Exception in thread Thread-2 (_pin_memory_loop): Traceback (most recent call last): File "/root/.local/share/mamba/envs/lerobot/lib/python3.10/threading.py", line 1016, in _bootstrap_inner"
Is there a command-line argument I can use to set the single camera input to train with the pi0_base model?
Try changing the input features in config.json.
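One possible way to act on this suggestion is to rename one of the expected camera keys to the key your dataset actually provides. The sketch below is only an illustration: it assumes config.json has an "input_features" mapping keyed by feature name, with "type" and "shape" entries matching what the error message prints. The helper name `rename_input_feature` is hypothetical and not part of lerobot, and the example works on an in-memory dict rather than a real file.

```python
import json


def rename_input_feature(config: dict, old_key: str, new_key: str) -> dict:
    """Return a copy of `config` with one input feature key renamed.

    Assumes (not verified against lerobot) that `config["input_features"]`
    maps feature names to {"type": ..., "shape": ...} entries.
    """
    features = config["input_features"]
    if old_key not in features:
        raise KeyError(f"{old_key!r} is not in input_features")
    # Rebuild the mapping so the renamed key keeps its original position.
    renamed = {(new_key if k == old_key else k): v for k, v in features.items()}
    return {**config, "input_features": renamed}


# Illustrative config fragment; shapes are taken from the error message above.
example_config = {
    "input_features": {
        "observation.images.base_0_rgb": {"type": "VISUAL", "shape": [3, 224, 224]},
        "observation.images.left_wrist_0_rgb": {"type": "VISUAL", "shape": [3, 224, 224]},
        "observation.images.right_wrist_0_rgb": {"type": "VISUAL", "shape": [3, 224, 224]},
    }
}

patched = rename_input_feature(
    example_config,
    old_key="observation.images.base_0_rgb",
    new_key="observation.images.top",
)
print(json.dumps(patched["input_features"], indent=2))
```

With a real checkpoint you would load config.json with `json.load`, apply the same change, and write it back with `json.dump`. Whether the base camera slot is the best one to map the top camera to is a judgment call.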
Hi, I'm curious: are the input features in config.json the ones expected by the model, or the ones that are actually fed into the model?
Expected. Some of the image features can be missing from the batch, but not all of them. The input features defined in meta/info.json in your dataset are used as the model's input features only if the input features in your model config are none. But you are using pi0_base, which defines its own input features, so it just uses the default input features from the pi0_base model config.
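To make those two rules concrete, here is a minimal sketch of the logic as described in this reply. It is not lerobot's actual code: the function names `resolve_input_features` and `split_image_keys` are hypothetical, and features are represented as plain dicts and lists of keys.

```python
def resolve_input_features(model_features, dataset_features):
    """Dataset features are used only when the model config defines none."""
    return dataset_features if model_features is None else model_features


def split_image_keys(expected_keys, batch_keys):
    """Split expected image keys into those present in the batch and those missing.

    Raises if none of the expected keys is present, mirroring the rule that
    some image features may be missing but not all of them.
    """
    present = [k for k in expected_keys if k in batch_keys]
    missing = [k for k in expected_keys if k not in batch_keys]
    if not present:
        raise ValueError(
            "All image features are missing from the batch. At least one expected."
        )
    return present, missing


expected = [
    "observation.images.base_0_rgb",
    "observation.images.left_wrist_0_rgb",
    "observation.images.right_wrist_0_rgb",
]

# The batch from the question only has "observation.images.top",
# so none of the expected keys match and the check fails.
try:
    split_image_keys(expected, {"observation.images.top", "observation.state"})
except ValueError as err:
    print(err)

# If one expected key is present, the other two are merely reported as missing.
present, missing = split_image_keys(
    expected, {"observation.images.base_0_rgb", "observation.state"}
)
print(present, missing)
```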
pi0_base expects 3 cameras in its config.json. What if I feed it a single-camera dataset? How does the model handle that case?
The missing cameras are filled with images padded with value -1 and masks padded with value 0, so that the number of image features matches what the model expects. For example, if the model needs 3 cameras but only 1 is provided, it will create 2 blank images.
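The padding described above can be sketched roughly as follows. This is an illustration only, using NumPy arrays instead of the model's real tensors. The function name `pad_missing_cameras` is hypothetical, and placing the real images first with the blank ones appended afterwards is an assumption.

```python
import numpy as np


def pad_missing_cameras(batch, expected_keys, fill_value=-1.0):
    """Return (images, masks) with one entry per expected camera.

    Cameras present in the batch keep their images and get a mask of ones.
    Each missing camera gets a blank image filled with `fill_value` and a
    mask of zeros, so the model can ignore it.
    """
    present = [k for k in expected_keys if k in batch]
    if not present:
        raise ValueError("All image features are missing from the batch.")

    reference = batch[present[0]]  # shape: (batch, channels, height, width)
    batch_size = reference.shape[0]

    images = [batch[k] for k in present]
    masks = [np.ones(batch_size, dtype=bool) for _ in present]

    # One blank image (and zero mask) per expected camera that is absent.
    for _ in range(len(expected_keys) - len(present)):
        images.append(np.full_like(reference, fill_value))
        masks.append(np.zeros(batch_size, dtype=bool))
    return images, masks


expected = [
    "observation.images.base_0_rgb",
    "observation.images.left_wrist_0_rgb",
    "observation.images.right_wrist_0_rgb",
]
# A batch of 2 samples with only the base camera available.
batch = {"observation.images.base_0_rgb": np.zeros((2, 3, 224, 224), dtype=np.float32)}

images, masks = pad_missing_cameras(batch, expected)
print(len(images), [bool(m.any()) for m in masks])
```

Note that this only works once at least one of the expected keys exists in the batch, which is why the dataset's "observation.images.top" key has to be matched to one of the model's expected keys first.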