Push-F Diffusion Policy

A visuomotor diffusion policy trained to push an F-shaped block into a target orientation, adapted from the Diffusion Policy codebase (Chi et al., 2023).

Model Description

  • Architecture: Diffusion UNet with ResNet18 image encoder
  • Parameters: 278M
  • Observations: 96x96 RGB image + 2D agent position
  • Actions: 2D target position for the agent
  • Training data: 101 human demonstrations (~29,800 timesteps)
  • Training: 250 epochs on NVIDIA H100, ~3.5 hours
  • Framework: PyTorch 2.0.1
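The observation/action interface above can be sketched as array shapes. A hypothetical illustration (names are mine, not from the codebase; `n_obs`, `n_action`, and the horizon come from the Training Details section below):

```python
import numpy as np

# Hypothetical shape sketch of the policy's I/O, based on this card.
n_obs, n_action, horizon = 2, 8, 16

obs = {
    # last n_obs RGB frames, channels-first, normalized to [0, 1]
    "image": np.zeros((n_obs, 3, 96, 96), dtype=np.float32),
    # last n_obs 2D agent positions
    "agent_pos": np.zeros((n_obs, 2), dtype=np.float32),
}

# the policy denoises a full action horizon, then executes the first n_action steps
predicted_actions = np.zeros((horizon, 2), dtype=np.float32)
executed = predicted_actions[:n_action]
print(executed.shape)  # (8, 2)
```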

Performance

Evaluated on 50 held-out environment seeds:

Time Limit   Mean Score   Perfect Seeds (1.0)
30s          0.837        19/50
45s          0.945        38/50
60s          0.961        45/50
90s          1.000        50/50
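For clarity, the two columns are derived from per-seed scores like this (illustrative aggregation over made-up values, not the real eval data):

```python
# Hypothetical per-seed scores for 50 seeds; the real values come from eval.py.
scores = [1.0] * 19 + [0.5] * 31

mean_score = sum(scores) / len(scores)           # "Mean Score" column
perfect = sum(1 for s in scores if s == 1.0)     # "Perfect Seeds" column
print(f"mean={mean_score:.3f}, perfect={perfect}/{len(scores)}")
```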

Usage

git clone https://github.com/bryandong24/reu_adaptation.git
cd reu_adaptation

# Set up environment
mamba env create -f conda_environment.yaml
conda activate robodiff
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
pip install -e .

# Download checkpoint and evaluate
python eval.py --checkpoint epoch=0250-test_mean_score=0.880.ckpt -o eval_output
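Under the hood, evaluation runs the policy in a receding-horizon loop: condition on the last two observations, denoise a 16-step action trajectory, execute the first 8 actions, then re-plan. A minimal sketch with a dummy environment and dummy policy (this is not the actual eval.py logic):

```python
import numpy as np

rng = np.random.default_rng(0)

def dummy_policy(obs_stack):
    # stand-in for the diffusion policy: returns a (horizon, 2) action trajectory
    return rng.standard_normal((16, 2)).astype(np.float32)

def dummy_env_step(action):
    # stand-in for the Push-F environment: returns the next 2D agent position
    return rng.standard_normal(2).astype(np.float32)

n_obs, n_action = 2, 8
obs_history = [np.zeros(2, dtype=np.float32)] * n_obs

total_steps = 0
for _ in range(5):  # re-plan 5 times
    traj = dummy_policy(np.stack(obs_history[-n_obs:]))
    for a in traj[:n_action]:  # execute only the first n_action steps
        obs_history.append(dummy_env_step(a))
        total_steps += 1
print(total_steps)  # 40
```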

Training Details

  • Loss: MSE denoising loss (DDPM)
  • Optimizer: AdamW (lr=1e-4, weight_decay=1e-6)
  • LR Schedule: Cosine with 500-step warmup
  • Batch size: 64
  • Horizon: 16 steps (n_obs=2, n_action=8)
  • Diffusion steps: 100 (training), 100 (inference)
  • EMA: Enabled
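The DDPM denoising objective above trains the network to predict the noise added to a clean action sequence. A minimal NumPy sketch with a stand-in noise predictor and a simple linear beta schedule (illustrative only; the actual training code and schedule live in the repo):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100  # diffusion steps, as listed above

# simple linear beta schedule for illustration; the real schedule may differ
betas = np.linspace(1e-4, 2e-2, T)
alphas_bar = np.cumprod(1.0 - betas)

def eps_model(x_t, t):
    # stand-in for the conditional diffusion UNet: predicts the added noise
    return np.zeros_like(x_t)

x0 = rng.standard_normal((8, 16, 2))   # (batch, horizon, action_dim)
t = rng.integers(0, T, size=8)         # random timestep per sample
eps = rng.standard_normal(x0.shape)    # target noise

a = np.sqrt(alphas_bar[t])[:, None, None]
b = np.sqrt(1.0 - alphas_bar[t])[:, None, None]
x_t = a * x0 + b * eps                 # forward diffusion q(x_t | x_0)

loss = np.mean((eps_model(x_t, t) - eps) ** 2)  # MSE denoising loss
```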

Citation

Based on:

@inproceedings{chi2023diffusionpolicy,
    title={Diffusion Policy: Visuomotor Policy Learning via Action Diffusion},
    author={Chi, Cheng and Feng, Siyuan and Du, Yilun and Xu, Zhenjia and Cousineau, Eric and Burchfiel, Benjamin and Song, Shuran},
    booktitle={Proceedings of Robotics: Science and Systems (RSS)},
    year={2023}
}
