| license: apache-2.0 | |
| base_model: lerobot/act | |
| tags: | |
| - lerobot | |
| - act | |
| - robotics | |
| - manipulation | |
| - real-robot | |
| - so101 | |
| - visuomotor | |
| datasets: | |
| - ShubhamK32/so101_declutter_v1 | |
| pipeline_tag: robotics | |
| # ACT: SO-101 Space Decluttering | |
| ACT (Action Chunking with Transformers) policy trained on the [SO-101 Space Decluttering Dataset v1](https://huggingface.co/datasets/ShubhamK32/so101_declutter_v1) for pick-and-place decluttering tasks on a 6-DoF SO-101 robotic arm. Trained with [LeRobot](https://github.com/huggingface/lerobot). | |
| ## Training Details | |
| - **Policy:** ACT (Action Chunking with Transformers) | |
| - **Steps:** 100,000 | |
| - **Robot:** SO-101 6-DoF leader-follower | |
| - **Cameras:** Dual-view (fixed top view + wrist-mounted egocentric view) | |
| - **Framework:** LeRobot | |
| ## Dataset | |
| Trained on [ShubhamK32/so101_declutter_v1](https://huggingface.co/datasets/ShubhamK32/so101_declutter_v1), a multi-view teleoperation dataset with spatial distractors injected to prevent visual shortcut learning. | |
| ## Usage | |
| ```python | |
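| from lerobot.policies.act.modeling_act import ACTPolicy | |
| # Download the checkpoint from the Hugging Face Hub and build the policy. | |
| policy = ACTPolicy.from_pretrained("ShubhamK32/act_so101_declutter") | |
| policy.eval()  # inference mode: disables dropout | |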
| ``` | |
| ## Camera Views | |
| - `observation.images.topview`: fixed overhead camera. Better suited to unoccluded pick-and-place tasks. | |
| - `observation.images.wristview`: egocentric wrist-mounted camera. Better suited to overlapping and cluttered scenes. | |
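Since the policy was trained on both views, it can help to confirm that a frame carries both image keys before it is passed on. The following is a minimal sketch, assuming observations are held in a plain dictionary keyed by the names listed above. The helper `missing_camera_keys` is hypothetical and not part of the LeRobot API.

```python
# Camera keys as listed in this model card.
CAMERA_KEYS = (
    "observation.images.topview",
    "observation.images.wristview",
)


def missing_camera_keys(observation):
    """Return the keys in CAMERA_KEYS that are absent from `observation`.

    Hypothetical helper for illustration; not part of the LeRobot API.
    """
    return [key for key in CAMERA_KEYS if key not in observation]


# Example: a frame that only carries the overhead view (placeholder value).
frame = {"observation.images.topview": "top_image_placeholder"}
print(missing_camera_keys(frame))
```

For the example frame, the helper reports that the wrist view is missing. An empty list means both views are present.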
| ## Related | |
| - Dataset: [ShubhamK32/so101_declutter_v1](https://huggingface.co/datasets/ShubhamK32/so101_declutter_v1) | |
| - SmolVLA checkpoint: [ShubhamK32/smolvla_so101_declutter](https://huggingface.co/ShubhamK32/smolvla_so101_declutter) | |