Reveal Proxy Benchmark
interaction
- checkpoint: /workspace/VLAarchtests/artifacts/outputs/interaction_debug/proxy_interaction_state_actionhist/checkpoint_best.pt
- mean_success: 0.514
- visibility_integral: 24.378
- corridor_availability: 0.719
- reocclusion_rate: 0.036
- persistence_horizon_mae: 1.526
- disturbance_cost: 0.338
- foliage_proxy_success: 0.292
- bag_proxy_success: 0.542
- cloth_proxy_success: 0.708
backbone
- checkpoint: /workspace/VLAarchtests/artifacts/outputs/reveal_runs/proxy_backbone_only/checkpoint_best.pt
- mean_success: 0.417
- visibility_integral: 16.212
- corridor_availability: 0.510
- reocclusion_rate: 0.036
- persistence_horizon_mae: 0.000
- disturbance_cost: 0.141
- foliage_proxy_success: 0.292
- bag_proxy_success: 0.333
- cloth_proxy_success: 0.625
reveal
- checkpoint: /workspace/VLAarchtests/artifacts/outputs/reveal_runs/proxy_reveal_state/checkpoint_best.pt
- mean_success: 0.556
- visibility_integral: 32.113
- corridor_availability: 0.806
- reocclusion_rate: 0.058
- persistence_horizon_mae: 1.963
- disturbance_cost: 0.227
- foliage_proxy_success: 0.417
- bag_proxy_success: 0.583
- cloth_proxy_success: 0.667
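
The reported numbers are consistent with `mean_success` being the unweighted mean of the three per-proxy success rates (foliage, bag, cloth). That relationship is inferred from the values above, not stated by the benchmark itself, so treat the following as a minimal sanity-check sketch under that assumption. The helper names (`unweighted_mean`, `check_means`) are hypothetical and not part of the evaluation code.

```python
# Per-proxy success rates copied from the report above.
proxy_success = {
    "interaction": {"foliage": 0.292, "bag": 0.542, "cloth": 0.708},
    "backbone": {"foliage": 0.292, "bag": 0.333, "cloth": 0.625},
    "reveal": {"foliage": 0.417, "bag": 0.583, "cloth": 0.667},
}

# mean_success values copied from the report above.
reported_mean_success = {"interaction": 0.514, "backbone": 0.417, "reveal": 0.556}


def unweighted_mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def check_means(tolerance=1e-3):
    """Recompute mean_success per model and compare it with the reported value.

    Assumption (not confirmed by the report): mean_success is the unweighted
    mean over the three proxies. The tolerance allows for the inputs having
    been rounded to three decimals.
    Returns {model: (recomputed, reported, matches)}.
    """
    results = {}
    for model, per_proxy in proxy_success.items():
        recomputed = unweighted_mean(per_proxy.values())
        reported = reported_mean_success[model]
        matches = abs(recomputed - reported) <= tolerance
        results[model] = (round(recomputed, 3), reported, matches)
    return results


if __name__ == "__main__":
    for model, (recomputed, reported, ok) in check_means().items():
        print(f"{model}: recomputed={recomputed:.3f} reported={reported:.3f} match={ok}")
```

If a future run reports a `mean_success` that fails this check, either the per-proxy episode counts are no longer equal or the aggregate is computed differently than assumed here.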