OGBench Checkpoints
Scene-Play-v0 (Manipulation)
Best result: 48% GSC+Resampling+Pruning (50 episodes, seed=0)
Checkpoints
| Component | File | Training Steps | Notes |
|---|---|---|---|
| Planner (original) | scene-play/planner/state_1495000.pt | 1.5M | energy_based_compdfu, batch=170, 8 GPUs |
| Planner (ogb_v1) | scene-play/planner/ogb_v1_state_1495000.pt | 1.5M | Re-trained, batch=128, 4 GPUs. Better: 52% with same invdyn |
| InvDyn | scene-play/invdyn/state_1600000.pt | 1.6M | invdyn_scene_h150, batch=32, horizon=150, uniform goal sampling |
Eval Configs (50 episodes, seed=0)
| Config | Overall | T1 (open) | T2 (unlock) | T3 (rearrange) | T4 (drawer) | T5 (hard) |
|---|---|---|---|---|---|---|
| GSC | 36% | 70% | 40% | 50% | 10% | 10% |
| GSC+Resampling (U=10,min=10) | 40% | 70% | 20% | 60% | 50% | 0% |
| GSC+Resamp+Pruning | 48% | 80% | 50% | 70% | 30% | 10% |
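The Overall column is consistent with the unweighted mean of the five per-task success rates (10 trials per task, 50 episodes in total). A minimal Python check, with the numbers copied from the table above; the dictionary and function names are illustrative and not part of the codebase:

```python
# Per-task success rates (%) copied from the eval table, in order T1..T5.
PER_TASK = {
    "GSC": [70, 40, 50, 10, 10],
    "GSC+Resampling": [70, 20, 60, 50, 0],
    "GSC+Resamp+Pruning": [80, 50, 70, 30, 10],
}


def overall(rates):
    """Unweighted mean over tasks; valid because every task has the same number of trials."""
    return sum(rates) / len(rates)


for name, rates in PER_TASK.items():
    print(f"{name}: {overall(rates):.0f}%")
```

Because every task contributes the same number of episodes, the mean of the per-task rates equals the pooled success rate over all 50 episodes.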
Reproduction Commands
# GSC (baseline)
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master_port=29590 \
-m src.compdiffuser.eval_sceneplay --env scene-play-v0 \
--planner_name energy_based_compdfu --planner_epoch 1495000 \
--invdyn_name invdyn_scene_h150 --invdyn_epoch 1600000 \
--n_trials_per_task 10 --seed 0 \
--ev_cp_infer_t_type gsc --ddim_steps 50 --cond_w 2.0 \
--b_size_per_prob 40 --n_max_steps 1500
# GSC + Resampling (uniform U=10)
# Same command as the baseline; change --ev_cp_infer_t_type and add the resampling flags:
#   --ev_cp_infer_t_type gsc_resampling --num_resampling_steps 10 --min_resampling_steps 10
# GSC + Resampling + Pruning (best)
# Same command as the baseline; change --ev_cp_infer_t_type and add the resampling and pruning flags:
#   --ev_cp_infer_t_type gsc_resampling_pruning --num_resampling_steps 10 --min_resampling_steps 10 \
#   --pruning_start 0.5 --cv_threshold 0.01 --undo_eta 0.5 --use_gradient_ovlp --pruning_score_type inversion
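The three configurations differ only in `--ev_cp_infer_t_type` and the extra resampling and pruning flags, so the full command lines can be generated from one shared base. The sketch below is a hypothetical helper (the names `BASE_ARGS`, `VARIANTS`, and `build_command` are not part of the repository); it only assembles the flags listed above and prints the commands, it does not launch anything:

```python
import shlex

# Flags shared by all three configurations, copied from the baseline command.
BASE_ARGS = [
    "torchrun", "--nproc_per_node=4", "--master_port=29590",
    "-m", "src.compdiffuser.eval_sceneplay", "--env", "scene-play-v0",
    "--planner_name", "energy_based_compdfu", "--planner_epoch", "1495000",
    "--invdyn_name", "invdyn_scene_h150", "--invdyn_epoch", "1600000",
    "--n_trials_per_task", "10", "--seed", "0",
    "--ddim_steps", "50", "--cond_w", "2.0",
    "--b_size_per_prob", "40", "--n_max_steps", "1500",
]

RESAMPLING = ["--num_resampling_steps", "10", "--min_resampling_steps", "10"]
PRUNING = [
    "--pruning_start", "0.5", "--cv_threshold", "0.01", "--undo_eta", "0.5",
    "--use_gradient_ovlp", "--pruning_score_type", "inversion",
]

# Extra flags per configuration; keys are the --ev_cp_infer_t_type values.
VARIANTS = {
    "gsc": [],
    "gsc_resampling": RESAMPLING,
    "gsc_resampling_pruning": RESAMPLING + PRUNING,
}


def build_command(variant):
    """Return the argument list for one eval configuration."""
    return BASE_ARGS + ["--ev_cp_infer_t_type", variant] + VARIANTS[variant]


if __name__ == "__main__":
    for variant in VARIANTS:
        # Print only; prefix matches the GPU selection used in the baseline command.
        print("CUDA_VISIBLE_DEVICES=0,1,2,3", shlex.join(build_command(variant)))
```

Each generated command contains `--ev_cp_infer_t_type` exactly once, which avoids relying on how the argument parser treats a repeated flag.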
Critical Training Notes
- InvDyn batch_size=32 is essential. Batch=1024 gives 0-8%. The original invdyn_scene_h150 used batch=32.
- InvDyn horizon=150 enables multi-step pick-place. Horizon=12 (default) gives 0%.
- goal_sel_idxs must match plan_obs_select_dim: 12 13 14 19 20 21 26 27 28 29 32 33 36 38
- Planner ogb_v1 (re-trained) gets 52% with the same invdyn, better than the original's 46%.
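A mismatch between `goal_sel_idxs` and `plan_obs_select_dim` is easy to miss, so it may be worth checking before training or evaluation. The sketch below is a hypothetical guard: the function name is made up, it assumes both settings are available as plain sequences of integers, and it assumes that order matters as well as content. How the actual configs store these values is not shown here.

```python
def check_goal_indices(goal_sel_idxs, plan_obs_select_dim):
    """Raise ValueError if the two index lists differ in content or order."""
    goal, plan = list(goal_sel_idxs), list(plan_obs_select_dim)
    if goal != plan:
        missing = sorted(set(plan) - set(goal))  # in the planner dims but not in the goal dims
        extra = sorted(set(goal) - set(plan))    # in the goal dims but not in the planner dims
        raise ValueError(
            f"goal_sel_idxs != plan_obs_select_dim (missing={missing}, extra={extra})"
        )


# Scene-play indices copied from the note above.
SCENE_IDXS = [12, 13, 14, 19, 20, 21, 26, 27, 28, 29, 32, 33, 36, 38]
check_goal_indices(SCENE_IDXS, SCENE_IDXS)  # passes silently
```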
Cube-Single-Play-v0
Best result: 28% GSC+Resampling (50 episodes, planner at 1.5M)
| Component | File | Training Steps |
|---|---|---|
| Planner | cube-single/planner/state_1495000.pt | 1.5M |
| InvDyn | cube-single/invdyn/state_1800000.pt | 1.8M |
Cube-single invdyn was trained with batch=1024 (needs retraining with batch=32 for better results).