# Task-Routed Proxy Controller

- routing rule: `foliage -> iter6`, `bag -> iter8`, `cloth -> iter8`
- fixed benchmark slices: 100 episodes per task from the standard sprint spec
- mean success: `0.4867`

## Per-Task Success

- foliage: `0.46`
- bag: `0.41`
- cloth: `0.59`

## Why This Matters

- This is the strongest current controller for the three custom proxy tasks.
- The routing rule is fair: it depends only on explicit task metadata, not on per-episode oracle information.
- Compared with the best single checkpoint (`iter7`/`iter8` at `0.4667`), the routed controller keeps the cloth gain while recovering the stronger foliage checkpoint.

## Source Runs

- foliage source: `/workspace/VLAarchtests/artifacts/reports/selector_finetune_v7_iter6/default/reveal_benchmark.json`
- bag source: `/workspace/VLAarchtests/artifacts/reports/selector_finetune_v7_iter8/bag_fixed_default/reveal_benchmark.json`
- cloth source: `/workspace/VLAarchtests/artifacts/reports/selector_finetune_v7_iter8/cloth_fixed_default/reveal_benchmark.json`

## Reproduction

- benchmark wrapper: `/workspace/VLAarchtests/code/reveal_vla_bimanual/scripts/run_task_routed_proxy_eval.sh`
- routing is now supported directly in `run_reveal_benchmark.py` through `--task-routed-model`