
Model Index

2026-03-25/26 Additions

Handoff Proxy Checkpoints

| Run | Checkpoint | Summary | Report |
| --- | --- | --- | --- |
| spatial handoff | `artifacts/outputs/r3d_handoff/proxy_interaction_r3d_stage3_clip_rgbd_handoff_spatial_seed17/checkpoint_best.pt` | `artifacts/outputs/r3d_handoff/proxy_interaction_r3d_stage3_clip_rgbd_handoff_spatial_seed17/summary.json` | `artifacts/reports/reveal_handoff_compare_serious/reveal_benchmark.json` |
| compact handoff | `artifacts/outputs/r3d_handoff/proxy_interaction_r3d_stage3_clip_rgbd_handoff_compact_seed17/checkpoint_best.pt` | `artifacts/outputs/r3d_handoff/proxy_interaction_r3d_stage3_clip_rgbd_handoff_compact_seed17/summary.json` | `artifacts/reports/reveal_handoff_compact_train_probe/reveal_benchmark.json` |
| compact-phase handoff | `artifacts/outputs/r3d_handoff_phase/proxy_interaction_r3d_stage3_clip_rgbd_handoff_compact_phase_seed17/checkpoint_best.pt` | `artifacts/outputs/r3d_handoff_phase/proxy_interaction_r3d_stage3_clip_rgbd_handoff_compact_phase_seed17/summary.json` | `artifacts/reports/reveal_phase_compare_serious_compact/reveal_benchmark.json` |
| spatial-phase handoff | `artifacts/outputs/r3d_handoff_phase/proxy_interaction_r3d_stage3_clip_rgbd_handoff_spatial_phase_seed17/checkpoint_best.pt` | `artifacts/outputs/r3d_handoff_phase/proxy_interaction_r3d_stage3_clip_rgbd_handoff_spatial_phase_seed17/summary.json` | `artifacts/reports/reveal_phase_compare_serious_spatial_compactwm/reveal_benchmark.json` |

RLBench Current Checkpoints

| Run | Checkpoint | Related files |
| --- | --- | --- |
| subset3 valid9 | `artifacts/outputs/rlbench_current/rlbench_subset3_backbone_only_clip_current_valid9/checkpoint_best.pt` | `artifacts/outputs/rlbench_current/rlbench_subset3_backbone_only_clip_current_valid9/checkpoint_stable.pt` |
| subset3 common23 | `artifacts/outputs/rlbench_current/rlbench_subset3_backbone_only_clip_current_common23/checkpoint_best.pt` | `artifacts/outputs/rlbench_current/rlbench_subset3_backbone_only_clip_current_common23/checkpoint_stable.pt` |
| lift-ball wide | `artifacts/outputs/rlbench_current/rlbench_lift_ball_backbone_only_clip_current_wide/checkpoint_best.pt` | `artifacts/outputs/rlbench_current/rlbench_lift_ball_backbone_only_clip_current_wide/checkpoint_stable.pt` |
| push-box step1 | `artifacts/outputs/rlbench_current/rlbench_push_box_backbone_only_clip_step1/checkpoint_best.pt` | `artifacts/reports/rlbench_push_box_step1_ep1_ik_c1/rollout_eval.json`, `artifacts/reports/rlbench_push_box_knn_step1_ep5_top1_dense/rollout_eval.json` |

RLBench Result Files

| Artifact | File |
| --- | --- |
| lift-ball wide, one-step replanning | `artifacts/reports/rlbench_lift_ball_wide_len160_ep1_ik_c1/rollout_eval.json` |
| push-box step1, one-step replanning | `artifacts/reports/rlbench_push_box_step1_ep1_ik_c1/rollout_eval.json` |
| push-box step1, one-step replanning, `delta_scale=0.05` | `artifacts/reports/rlbench_push_box_step1_ep1_ik_c1_s005/rollout_eval.json` |
| push-box kNN, `episodes=1` | `artifacts/reports/rlbench_push_box_knn_step1_ep1/rollout_eval.json` |
| push-box kNN, `episodes=5`, `top_k=5` | `artifacts/reports/rlbench_push_box_knn_step1_ep5/rollout_eval.json` |
| push-box kNN, `episodes=5`, `top_k=1`, dense bank | `artifacts/reports/rlbench_push_box_knn_step1_ep5_top1_dense/rollout_eval.json` |
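
The result files above are plain JSON, so they can be inspected without any project code. The sketch below is an illustration only: `load_reports` is a hypothetical helper, not part of this repository, and it makes no assumption about the keys inside `rollout_eval.json`. It loads whichever listed files exist and prints their top-level keys.

```python
import json
from pathlib import Path


def load_reports(paths):
    """Load each JSON report that exists on disk; silently skip missing files."""
    loaded = {}
    for p in map(Path, paths):
        if p.is_file():
            with p.open(encoding="utf-8") as fh:
                loaded[p.as_posix()] = json.load(fh)
    return loaded


# Two of the paths listed in the table above.
reports = load_reports([
    "artifacts/reports/rlbench_push_box_step1_ep1_ik_c1/rollout_eval.json",
    "artifacts/reports/rlbench_push_box_knn_step1_ep5_top1_dense/rollout_eval.json",
])
for path, data in reports.items():
    # The schema is not documented here, so only report the top-level shape.
    shape = sorted(data) if isinstance(data, dict) else type(data).__name__
    print(path, shape)
```

Run from the repository root so the relative paths resolve; if no file is found the loop prints nothing.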

R3D Proxy Runs

| Run | Config | Seed | Checkpoint | Summary | Benchmark | Diagnostics |
| --- | --- | --- | --- | --- | --- | --- |
| stage1 dummy | `proxy_interaction_r3d_stage1_dummy.yaml` | 13 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed13/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed13/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed13/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed13/diagnostics_full/proxy_diagnostics.json` |
| stage1 dummy | `proxy_interaction_r3d_stage1_dummy.yaml` | 14 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed14/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed14/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed14/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed14/diagnostics_full/proxy_diagnostics.json` |
| stage1 dummy | `proxy_interaction_r3d_stage1_dummy.yaml` | 15 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed15/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed15/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed15/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed15/diagnostics_full/proxy_diagnostics.json` |
| stage2 dummy | `proxy_interaction_r3d_stage2_dummy.yaml` | 21 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed21/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed21/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed21/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed21/diagnostics_full/proxy_diagnostics.json` |
| stage2 dummy | `proxy_interaction_r3d_stage2_dummy.yaml` | 22 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed22/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed22/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed22/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed22/diagnostics_full/proxy_diagnostics.json` |
| stage2 dummy | `proxy_interaction_r3d_stage2_dummy.yaml` | 23 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed23/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed23/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed23/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed23/diagnostics_full/proxy_diagnostics.json` |
| stage1 clip | `proxy_interaction_r3d_stage1_clip.yaml` | 7 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed7/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed7/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed7/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed7/diagnostics_full/proxy_diagnostics.json` |
| stage1 clip | `proxy_interaction_r3d_stage1_clip.yaml` | 8 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed8/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed8/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed8/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed8/diagnostics_full/proxy_diagnostics.json` |
| stage1 clip | `proxy_interaction_r3d_stage1_clip.yaml` | 9 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed9/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed9/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed9/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_clip_seed9/diagnostics_full/proxy_diagnostics.json` |
| stage2 clip | `proxy_interaction_r3d_stage2_clip.yaml` | 11 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed11/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed11/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed11/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed11/diagnostics_full/proxy_diagnostics.json` |
| stage2 clip | `proxy_interaction_r3d_stage2_clip.yaml` | 12 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed12/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed12/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed12/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed12/diagnostics_full/proxy_diagnostics.json` |
| stage2 clip | `proxy_interaction_r3d_stage2_clip.yaml` | 13 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed13/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed13/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed13/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_clip_seed13/diagnostics_full/proxy_diagnostics.json` |
| stage3 clip rgbd | `proxy_interaction_r3d_stage3_clip_rgbd.yaml` | 17 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed17/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed17/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed17/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed17/diagnostics_full/proxy_diagnostics.json` |
| stage3 clip rgbd | `proxy_interaction_r3d_stage3_clip_rgbd.yaml` | 18 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed18/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed18/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed18/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed18/diagnostics_full/proxy_diagnostics.json` |
| stage3 clip rgbd | `proxy_interaction_r3d_stage3_clip_rgbd.yaml` | 19 | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed19/checkpoint_best.pt` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed19/summary.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed19/benchmark_full/reveal_benchmark.json` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed19/diagnostics_full/proxy_diagnostics.json` |
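
Every row above follows one naming pattern: the run directory is the config file's stem plus `_seed<N>`, with four fixed file names beneath it. The sketch below rebuilds those paths from a config name and seed. `r3d_run_artifacts` and `R3D_ROOT` are hypothetical names introduced for illustration, and the layout is inferred from the listed paths rather than from any project code.

```python
from pathlib import Path

# Assumed root, taken from the paths listed in the table above.
R3D_ROOT = Path("artifacts/outputs/r3d")


def r3d_run_artifacts(config_name: str, seed: int) -> dict[str, Path]:
    """Return the four artifact paths for one R3D proxy run.

    `config_name` is the YAML file name from the Config column,
    e.g. "proxy_interaction_r3d_stage1_dummy.yaml".
    """
    run_dir = R3D_ROOT / f"{Path(config_name).stem}_seed{seed}"
    return {
        "checkpoint": run_dir / "checkpoint_best.pt",
        "summary": run_dir / "summary.json",
        "benchmark": run_dir / "benchmark_full" / "reveal_benchmark.json",
        "diagnostics": run_dir / "diagnostics_full" / "proxy_diagnostics.json",
    }


paths = r3d_run_artifacts("proxy_interaction_r3d_stage3_clip_rgbd.yaml", 17)
print(paths["benchmark"].as_posix())
```

The helper only constructs paths; it does not check that the files exist.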

Ablation Benchmark Files

| Ablation | File |
| --- | --- |
| stage1 dummy `no_planner` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed13/benchmark_no_planner/reveal_benchmark.json` |
| stage1 dummy `no_role_symmetry` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage1_dummy_seed13/benchmark_no_role_symmetry/reveal_benchmark.json` |
| stage2 dummy `no_world_model` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed21/benchmark_no_world_model/reveal_benchmark.json` |
| stage2 dummy `no_world_model` pre-fix backup | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed21/benchmark_no_world_model/reveal_benchmark_pre_null_rollout_fix.json` |
| stage2 dummy `short_history` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage2_dummy_seed21/benchmark_short_history/reveal_benchmark.json` |
| stage3 clip RGB-D `no_depth` | `artifacts/outputs/r3d/proxy_interaction_r3d_stage3_clip_rgbd_seed17/benchmark_no_depth/reveal_benchmark.json` |

Equivalent files exist under the other seed directories.
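
As a rough sketch of what "equivalent files" means here, the ablation path for another seed can be derived by swapping the seed suffix in the run directory. The function name `ablation_benchmark` is hypothetical. The `benchmark_<ablation>` directory pattern is inferred from the listed paths, and the seed values (13, 14, 15) are the stage1 dummy seeds from the R3D Proxy Runs section.

```python
from pathlib import Path

# Assumed root, taken from the ablation paths listed above.
R3D_ROOT = Path("artifacts/outputs/r3d")


def ablation_benchmark(run_name: str, seed: int, ablation: str) -> Path:
    """Build the benchmark path for one ablation of one seeded run."""
    return (
        R3D_ROOT
        / f"{run_name}_seed{seed}"
        / f"benchmark_{ablation}"
        / "reveal_benchmark.json"
    )


# Stage1 dummy runs use seeds 13, 14, and 15.
for seed in (13, 14, 15):
    path = ablation_benchmark("proxy_interaction_r3d_stage1_dummy", seed, "no_planner")
    print(path.as_posix())
```

The pre-fix backup file (`reveal_benchmark_pre_null_rollout_fix.json`) does not follow this pattern and is listed only for seed 21.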

Integration Artifacts

| Artifact | File |
| --- | --- |
| RLBench import/config smoke | `artifacts/outputs/r3d/rlbench_smokes/smoke_test_output.txt` |
| RLBench open_drawer launch smoke | `artifacts/outputs/r3d/rlbench_smokes/launch_smoke_open_drawer.txt` |
| RLBench open_drawer rollout | `artifacts/outputs/r3d/rlbench_open_drawer_r3d_rollout/rollout_eval.json` |
| PerAct2 13-task launch smoke summary | `artifacts/outputs/r3d/peract2_13_launch_smoke/launch_smoke_summary.json` |

Historical References

| File | Purpose |
| --- | --- |
| `regression/baselines.md` | historical baseline metrics from the downloaded snapshot |
| `results/phase_tracking.md` | phase-by-phase acceptance tracking |