---
license: mit
tags:
- robotics
- imitation-learning
- diffusion-policy
- manipulation
- hiro-robot
- lerobot
- goal-conditioned
- sensor-diffusion
datasets:
- roboset_20260112_225816
- roboset_20260113_001336
---

# Proximity Sensor Goal-Conditioned Diffusion Policy

## Model Description

A goal-conditioned Diffusion Policy trained on the sensor datasets listed above. The model predicts the next joint positions along the trajectory, conditioned on the current observation (joint positions and a table camera image) and a goal Cartesian position.

## Model Architecture

- **Policy Type**: Diffusion Policy
- **Framework**: LeRobot
- **Horizon**: 16 steps
- **Observation Steps**: 1 (single timestep)

## Inputs

- **`observation.state`**: Shape `(batch, 1, 7)` - Joint positions (7-DOF arm)
- **`observation.goal`**: Shape `(batch, 1, 3)` - Goal Cartesian position (x, y, z)
- **`observation.images.table_camera`**: Shape `(batch, 1, 3, 480, 640)` - Table camera RGB images

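For reference, a raw (pre-normalization) batch with the shapes above can be assembled as follows. This is a minimal sketch: the zero-valued tensors are placeholders for real sensor readings.

```python
import torch

batch_size = 1
batch = {
    # Current joint positions, 7-DOF arm
    "observation.state": torch.zeros(batch_size, 1, 7),
    # Goal Cartesian position (x, y, z)
    "observation.goal": torch.zeros(batch_size, 1, 3),
    # Table camera RGB frame, uint8 in [0, 255]
    "observation.images.table_camera": torch.zeros(
        batch_size, 1, 3, 480, 640, dtype=torch.uint8
    ),
}

for key, tensor in batch.items():
    print(key, tuple(tensor.shape))
```
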
## Outputs

- **`action`**: Shape `(batch, 16, 7)` - Joint positions (7 DOF) over the 16-step horizon (next positions along the trajectory)

**Note**: The model outputs a full 16-step horizon. Use `select_action()` to get the first step `(batch, 7)`, or `predict_action_chunk()` to get the full horizon `(batch, 16, 7)`.

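The relationship between the two access patterns can be illustrated on a dummy action chunk. This is a sketch only: `action_chunk` stands in for the policy output, and no model is loaded.

```python
import torch

horizon, dof = 16, 7
# Stand-in for the policy's full output chunk, shape (batch, 16, 7)
action_chunk = torch.randn(1, horizon, dof)

# select_action() corresponds to taking the first step of the chunk
first_step = action_chunk[:, 0, :]  # shape (1, 7)

# predict_action_chunk() returns the whole 16-step horizon
full_chunk = action_chunk           # shape (1, 16, 7)
```
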
## Normalization

### Input Normalization

**Images** (`observation.images.table_camera`):
- Normalize from `[0, 255]` to `[0, 1]` by dividing by `255.0`
- Then apply mean-std normalization using dataset statistics (handled by the preprocessor)

**State** (`observation.state`):
- Apply min-max normalization: `(state - min) / (max - min)` using dataset statistics (handled by the preprocessor)

**Goal** (`observation.goal`):
- Apply min-max normalization: `(goal - min) / (max - min)` using dataset statistics (handled by the preprocessor)

### Output Unnormalization

**Actions** (`action`):
- Apply inverse min-max normalization: `action * (max - min) + min` using dataset statistics (handled by the postprocessor)
- **Note**: Actions are joint positions (not velocities); they are the next positions the robot should move to along the trajectory

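The min-max normalize/unnormalize pair above is a simple round trip. A minimal sketch, using placeholder statistics (the real per-dimension `min`/`max` come from the dataset and are applied by the pre/postprocessors):

```python
import torch

# Placeholder per-dimension dataset statistics (illustrative only)
stat_min = torch.full((7,), -1.0)
stat_max = torch.full((7,), 1.0)

def normalize(x, lo, hi):
    # Min-max normalization to [0, 1], as applied by the preprocessor
    return (x - lo) / (hi - lo)

def unnormalize(x, lo, hi):
    # Inverse min-max, as applied by the postprocessor to actions
    return x * (hi - lo) + lo

state = torch.tensor([0.5, -0.25, 0.0, 0.75, -0.5, 0.1, 0.9])
roundtrip = unnormalize(normalize(state, stat_min, stat_max), stat_min, stat_max)
assert torch.allclose(roundtrip, state)
```
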
## Usage

```python
import torch

from lerobot.policies.diffusion.modeling_diffusion import DiffusionPolicy
from lerobot.policies.factory import make_pre_post_processors

# Load model
policy = DiffusionPolicy.from_pretrained("calebescobedo/sensor-diffusion-policy-v1")

# Load preprocessor and postprocessor from the same repo
preprocessor, postprocessor = make_pre_post_processors(
    policy_cfg=policy.config,
    pretrained_path="calebescobedo/sensor-diffusion-policy-v1",
)

# Prepare inputs
batch = {
    "observation.state": state_tensor,            # (batch, 1, 7) - raw joint positions
    "observation.goal": goal_tensor,              # (batch, 1, 3) - raw goal xyz
    "observation.images.table_camera": table_img, # (batch, 1, 3, 480, 640) - uint8 [0, 255] or float [0, 1]
}

# Inference
policy.eval()
with torch.no_grad():
    batch = preprocessor(batch)            # Normalizes inputs
    actions = policy.select_action(batch)  # Returns normalized actions
    actions = postprocessor(actions)       # Unnormalizes to raw joint positions
```

## Training Details

- **Training**: Epoch-based (ensures all trajectories are seen)
- **Batch Size**: 64
- **Optimizer**: Adam
- **Mixed Precision**: Enabled (AMP)
- **Data Loading**: Optimized with persistent file handles
- **Datasets**:
  - roboset_20260112_225816.h5 (20 trajectories)
  - roboset_20260113_001336.h5 (50 trajectories)

## Dataset Notes

- All trajectories share the same start and end positions
- A single demonstration repeated 70 times
- Goal: Final Cartesian position `[0.454, -0.133, 0.522]` (constant)

## License

MIT License