Diffusion Policy: VAC Fridge (input/output ablation)

Eight robomimic Diffusion Policy checkpoints for the variable-impedance pick ("fridge") task on a Doosan M0609 + Inspire RH56 hand under a Variable Admittance Controller (VAC). Trained on aleleanza/vac-fridge-single-cam (200 episodes, 54,833 frames @ 20 Hz, single RGB camera).

This is the fridge-task counterpart of aleleanza/diffusion-policy-vac-pipe: an ablation over two output representations (defined relative to the admittance filter) × three input modalities, plus a larger-UNet variant per side.

Variants (subfolders)

| Subfolder | Output (action) | Inputs (obs) | Action dim | Best val loss | Epoch |
| --- | --- | --- | --- | --- | --- |
| vac_preimg | pre: user_cmd[6] + K + ζ + hand_binary | vision | 9 | 0.11807 | 240 |
| vac_preimg_state | pre | vision + state | 9 | 0.09940 | 240 |
| vac_preimg_state_wrench | pre | vision + state + wrench | 9 | 0.10013 | 597 |
| vac_preimg_state_wrench_big | pre (larger UNet) | vision + state + wrench | 9 | 0.10529 | 275 |
| vac_postimg | post: vel_cmd[6] + hand_binary | vision | 7 | 0.10254 | 240 |
| vac_postimg_state | post | vision + state | 7 | 0.09851 | 193 |
| vac_postimg_state_wrench | post | vision + state + wrench | 7 | 0.09467 | 193 |
| vac_postimg_state_wrench_big | post (larger UNet) | vision + state + wrench | 7 | 0.10136 | 263 |

  • pre = predict the operator command before admittance, including the compliance command itself (stiffness_cmd K + damping ζ) → the policy learns to set compliance.
  • post = predict the Cartesian velocity executed after admittance → bypasses the admittance filter and drives velocity directly.
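
As a minimal sketch of the two action contracts, the snippet below splits one de-normalized action vector into named parts. The index layout follows the table above; the function name `split_action` and the dictionary keys are hypothetical and are not taken from the robot_learning code.

```python
import numpy as np


def split_action(action, mode):
    """Split one de-normalized action vector into named components.

    mode="pre":  9-D = user_cmd[6] + K + zeta + hand_binary
    mode="post": 7-D = vel_cmd[6] + hand_binary
    """
    a = np.asarray(action, dtype=float)
    if mode == "pre":
        if a.shape != (9,):
            raise ValueError(f"pre expects 9 values, got shape {a.shape}")
        return {
            "user_cmd": a[:6],        # operator command before admittance
            "K": float(a[6]),         # stiffness command
            "zeta": float(a[7]),      # damping command
            "hand_binary": float(a[8]),
        }
    if mode == "post":
        if a.shape != (7,):
            raise ValueError(f"post expects 7 values, got shape {a.shape}")
        return {
            "vel_cmd": a[:6],         # Cartesian velocity after admittance
            "hand_binary": float(a[6]),
        }
    raise ValueError(f"unknown mode: {mode!r}")


pre_parts = split_action(np.arange(9), "pre")
post_parts = split_action(np.arange(7), "post")
```

Keeping the split in one place makes a 7D/9D mismatch between a checkpoint and its runner fail loudly instead of silently shifting indices.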

Each subfolder contains: best.pth, last.pth, config.json, action_stats.json (action min-max normalization + components + hand binarization), dataset_summary.json (train/valid split).
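
The following sketch shows how min-max normalization could be undone with statistics like those in action_stats.json. It rests on several assumptions: the normalized range is taken to be [-1, 1], the keys `"min"` and `"max"` are hypothetical (check the actual JSON), the 0.5 hand threshold is illustrative, and the numbers in `stats` are made up for the example.

```python
import numpy as np


def unnormalize_minmax(a_norm, a_min, a_max, lo=-1.0, hi=1.0):
    """Map values from the normalized range [lo, hi] back to [a_min, a_max]."""
    a_norm = np.asarray(a_norm, dtype=float)
    a_min = np.asarray(a_min, dtype=float)
    a_max = np.asarray(a_max, dtype=float)
    scale = (a_max - a_min) / (hi - lo)
    return (a_norm - lo) * scale + a_min


def binarize_hand(value, threshold=0.5):
    """Turn a continuous hand output into 0.0 or 1.0 (threshold is an assumption)."""
    return 1.0 if value >= threshold else 0.0


# Made-up stats for a 7-D "post" action; real values come from action_stats.json.
stats = {"min": [-0.1] * 6 + [0.0], "max": [0.1] * 6 + [1.0]}

a_norm = np.zeros(7)                      # one normalized policy output
a = unnormalize_minmax(a_norm, stats["min"], stats["max"])
a[6] = binarize_hand(a[6])                # hand channel becomes 0.0 or 1.0
```

In practice the statistics would be read with `json.load` from the chosen subfolder rather than defined inline.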

Architecture & training

  • Algorithm: robomimic Diffusion Policy (DDPM noise-prediction loss).
  • Backbone: conditional UNet [128, 256, 512]; the _big variants use [256, 512, 1024] (~91.8M parameters vs ~89.4M for the base model).
  • Horizons: observation 2, action 4, prediction 8 (frame stack 2, seq length 8).
  • Image: single camera → front_rgb, 84×84.
  • Diffusion: DDPM 50 train/infer steps (DDIM 10-step configurable for faster inference).
  • Schedule: up to 600 epochs, batch 16, lr 1e-4.
  • state is built at runtime from the current TCP and the Inspire hand joints; wrench is read from /bota_ft_sensor/wrench. The hand output head is binarized (hand_open_binary).
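
The horizons listed above imply receding-horizon execution: each inference predicts 8 steps, of which 4 are executed before re-planning. The sketch below illustrates that slicing. The start index `obs_horizon - 1` follows the convention of the original Diffusion Policy implementation; whether the runners for these checkpoints index the same way is an assumption, and `select_executed_actions` is a hypothetical helper name.

```python
import numpy as np

OBS_HORIZON = 2      # observation frames stacked as conditioning
ACTION_HORIZON = 4   # actions executed per inference
PRED_HORIZON = 8     # actions predicted per inference


def select_executed_actions(pred_seq, obs_horizon=OBS_HORIZON,
                            action_horizon=ACTION_HORIZON):
    """Pick the slice of a predicted sequence that is actually executed."""
    pred_seq = np.asarray(pred_seq)
    start = obs_horizon - 1           # first action aligned with the latest observation
    end = start + action_horizon
    if end > pred_seq.shape[0]:
        raise ValueError("prediction horizon too short for this slice")
    return pred_seq[start:end]


# Dummy prediction for a 7-D "post" policy: row i is filled from 7*i upward.
pred = np.arange(PRED_HORIZON * 7).reshape(PRED_HORIZON, 7)
chunk = select_executed_actions(pred)   # rows 1..4 of the prediction

# If the controller runs at the dataset rate of 20 Hz (an assumption),
# 4 executed actions span 4 / 20 = 0.2 s between inferences.
replan_period_s = ACTION_HORIZON / 20.0
```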

Reading the results

Validation losses are tightly clustered (~0.095–0.118). Adding state helps both pre (0.1181 → 0.0994) and post (0.1025 → 0.0985); adding wrench on top helps post (0.0985 → 0.0947) but not pre (0.0994 → 0.1001). The best overall is vac_postimg_state_wrench (0.0947). The larger-UNet (_big) variants did not improve validation loss at this dataset size. Post (vel_cmd) targets are marginally easier than pre (user_cmd + K + ζ) at every input configuration, consistent with the pipe ablation, though the gap here is much smaller.
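
To make the comparison reproducible, the short script below recomputes the best variant and the pre-versus-post gap from the numbers in the variants table. The values are copied from that table; the variable names are only illustrative.

```python
# (best validation loss, epoch) per subfolder, copied from the variants table
results = {
    "vac_preimg": (0.11807, 240),
    "vac_preimg_state": (0.09940, 240),
    "vac_preimg_state_wrench": (0.10013, 597),
    "vac_preimg_state_wrench_big": (0.10529, 275),
    "vac_postimg": (0.10254, 240),
    "vac_postimg_state": (0.09851, 193),
    "vac_postimg_state_wrench": (0.09467, 193),
    "vac_postimg_state_wrench_big": (0.10136, 263),
}

# Variant with the lowest validation loss
best = min(results, key=lambda name: results[name][0])

# Gap between pre and post for each input configuration (positive = post is lower)
suffixes = ["", "_state", "_state_wrench", "_state_wrench_big"]
gap = {
    (s or "_vision_only"): round(
        results["vac_preimg" + s][0] - results["vac_postimg" + s][0], 5
    )
    for s in suffixes
}

for name, value in gap.items():
    print(f"pre - post{name}: {value:+.5f}")
print("best:", best)
```

Note that these are validation losses on the normalized prediction target, so they rank variants on the held-out split rather than on task success.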

Inference (ROS 2)

Run with the project's robot_learning real-time inference nodes (Doosan M0609 + Inspire hand). The action contract determines the runner:

  • pre* variants (9D) → diffusion_policy_vac_preimg_runner: publishes action[:6] → /delta_pose_cmd, [6] → /predicted_K, [7] → /predicted_zeta, [8] → /inspire_hand/left/cmd; consumed by variable_admittance_node (variable_K:=true).
  • post* variants (7D) → diffusion_policy_fixed_k_runner in vel_cmd mode: streams velocity directly via the DSR speedl interface (no admittance node).

Pre-wired launchers exist under robot_learning/scripts/ (e.g. launch_vac_inference.sh preimg_state_wrench); *_wrench* variants additionally need the Bota driver. action_stats.json provides the exact normalization to undo at inference.
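
The rules above can be summarized as a small lookup. The sketch below derives the runner and its dependencies from a subfolder name; the runner and node names come from this card, while the function `inference_requirements` and its return keys are hypothetical and not part of robot_learning.

```python
def inference_requirements(subfolder):
    """Return the action contract and runtime dependencies for one variant."""
    if "preimg" in subfolder:
        req = {
            "action_dim": 9,
            "runner": "diffusion_policy_vac_preimg_runner",
            "needs_admittance_node": True,   # variable_admittance_node, variable_K:=true
        }
    elif "postimg" in subfolder:
        req = {
            "action_dim": 7,
            "runner": "diffusion_policy_fixed_k_runner",  # run in vel_cmd mode
            "needs_admittance_node": False,  # velocity streamed via DSR speedl
        }
    else:
        raise ValueError(f"unrecognized variant: {subfolder!r}")
    # Variants with wrench inputs additionally need the Bota driver.
    req["needs_bota_driver"] = "wrench" in subfolder
    return req


pre_req = inference_requirements("vac_preimg_state_wrench")
post_req = inference_requirements("vac_postimg")
```

A check like this can run before launch to catch a missing force/torque driver or a runner that does not match the checkpoint's action dimension.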

Intended use & limitations

  • Use: research on force-aware / compliance-predicting imitation learning; VAC baselines.
  • Limitations: single task, single embodiment (M0609 + RH56), single 84×84 camera, binary hand. Validation differences between variants are small. Not validated for safety-critical or autonomous deployment.

