# Reveal Proxy Benchmark

## interaction
- checkpoint: /workspace/VLAarchtests/artifacts/outputs/interaction_debug/proxy_interaction_state_actionhist/checkpoint_best.pt
- mean_success: 0.514
- visibility_integral: 32.360
- corridor_availability: 0.880
- reocclusion_rate: 0.000
- persistence_horizon_mae: 1.142
- disturbance_cost: 0.495
- foliage_proxy_success: 0.417
- bag_proxy_success: 0.542
- cloth_proxy_success: 0.583

## backbone
- checkpoint: /workspace/VLAarchtests/artifacts/outputs/reveal_runs/proxy_backbone_only/checkpoint_best.pt
- mean_success: 0.542
- visibility_integral: 30.581
- corridor_availability: 0.868
- reocclusion_rate: 0.000
- persistence_horizon_mae: 0.000
- disturbance_cost: 0.474
- foliage_proxy_success: 0.417
- bag_proxy_success: 0.583
- cloth_proxy_success: 0.625

## reveal
- checkpoint: /workspace/VLAarchtests/artifacts/outputs/reveal_runs/proxy_reveal_state/checkpoint_best.pt
- mean_success: 0.556
- visibility_integral: 29.509
- corridor_availability: 0.861
- reocclusion_rate: 0.000
- persistence_horizon_mae: 2.366
- disturbance_cost: 0.470
- foliage_proxy_success: 0.417
- bag_proxy_success: 0.583
- cloth_proxy_success: 0.667
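
The reported `mean_success` values appear consistent with an unweighted mean of the three per-proxy success rates (foliage, bag, cloth). The sketch below recomputes that mean from the numbers listed above as a sanity check. The dictionary layout and the `proxy_mean` helper are illustrative assumptions, not part of the benchmark code, and the equal weighting is inferred from the figures rather than confirmed by the source.

```python
# Per-proxy success rates and reported means, copied from the report above.
runs = {
    "interaction": {"foliage": 0.417, "bag": 0.542, "cloth": 0.583, "mean_success": 0.514},
    "backbone": {"foliage": 0.417, "bag": 0.583, "cloth": 0.625, "mean_success": 0.542},
    "reveal": {"foliage": 0.417, "bag": 0.583, "cloth": 0.667, "mean_success": 0.556},
}


def proxy_mean(run):
    """Unweighted mean over the three proxies (assumed aggregation)."""
    return (run["foliage"] + run["bag"] + run["cloth"]) / 3


for name, run in runs.items():
    recomputed = proxy_mean(run)
    # Inputs are rounded to three decimals, so allow a matching tolerance.
    assert abs(recomputed - run["mean_success"]) < 1e-3
    print(f"{name}: reported={run['mean_success']:.3f} recomputed={recomputed:.3f}")
```

Under this reading, the three runs tie on the foliage proxy, so the differences in `mean_success` come from the bag and cloth proxies.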