Model Card for Asymmetric-Executor-Nav-Thinking
Model Summary
This model is a fine-tuned version of Qwen3-VL-8B-Thinking, the reasoning-optimized variant of Qwen3-VL-8B. It serves as the Low-Level Executor for the Cookie and 2-Keys navigation domains. Compared to the Instruct version, it targets the more complex logical dependencies in spatial reasoning, and the memory retention, that backtracking tasks require.
Model Details
Task: Complex Visual Navigation (Cookie & 2-Keys Domains)
Methodology: Event-triggered planning with a Reflection-Action Loop.
Training: 9-Step Curriculum Learning with Key Episode Rebalancing.
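The event-triggered methodology listed above can be illustrated with a toy control loop. This is only a sketch under stated assumptions, not the training or inference code behind this model: `ToyCorridor`, `ToyPlanner`, `toy_executor`, `StepResult`, and the `"subgoal_done"` event name are all hypothetical, and a one-dimensional corridor stands in for the real navigation domains. The point it shows is that the high-level planner is consulted only when the executor raises an event, while the executor verifies the subgoal against the current observation before every action.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StepResult:
    """Hypothetical executor output: an action, plus an optional event."""
    action: str
    event: Optional[str] = None  # None means "keep executing the current subgoal"


class ToyCorridor:
    """Hypothetical 1-D environment; the observation is just the position."""

    def __init__(self) -> None:
        self.pos = 0

    def step(self, action: str) -> int:
        if action == "right":
            self.pos += 1
        elif action == "left":
            self.pos -= 1
        return self.pos


class ToyPlanner:
    """Hypothetical high-level planner that hands out target positions in order."""

    def __init__(self, targets: List[int]) -> None:
        self.targets = list(targets)
        self.calls = 0

    def next_subgoal(self, obs: int, event: str) -> Optional[int]:
        self.calls += 1
        return self.targets.pop(0) if self.targets else None


def toy_executor(obs: int, subgoal: int) -> StepResult:
    # Reflection: check whether the subgoal already holds in the current observation.
    if obs == subgoal:
        return StepResult(action="stay", event="subgoal_done")
    # Action: otherwise move one cell toward the subgoal.
    return StepResult(action="right" if obs < subgoal else "left")


def run_episode(env: ToyCorridor, planner: ToyPlanner,
                max_steps: int = 50) -> Tuple[List[str], int]:
    obs = env.pos
    subgoal = planner.next_subgoal(obs, "start")
    actions: List[str] = []
    for _ in range(max_steps):
        if subgoal is None:  # planner has nothing left to assign
            break
        result = toy_executor(obs, subgoal)
        if result.event is not None:
            # Event-triggered: the planner is called only here, not on every step.
            subgoal = planner.next_subgoal(obs, result.event)
            continue
        actions.append(result.action)
        obs = env.step(result.action)
    return actions, planner.calls


# Go to cell 3 (e.g. a key), then backtrack to cell 0 (e.g. a door).
actions, planner_calls = run_episode(ToyCorridor(), ToyPlanner([3, 0]))
```

In this toy run the executor takes six movement steps while the planner is called only three times (once at the start and once per completed subgoal), which is the cost asymmetry that event-triggered planning is meant to exploit.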
Intended Use
Designed for scenarios where the Executor must reason about whether a state change has already occurred in a partially observable environment (e.g., "Did I already press the button in the hallway?").
Input: RGB Image + Semantic Subgoal.
Output: Structured CoT (Perception -> Verification -> Action).
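A downstream controller usually needs the three stages of the output as separate fields. The sketch below shows one way to split them, assuming the model writes each stage on its own line behind a `Perception:`, `Verification:`, or `Action:` label. That label format is an assumption made for illustration; this card does not specify the exact output template, so the pattern should be adjusted to whatever the model actually emits. The function name `parse_executor_output` and the sample text are likewise hypothetical.

```python
import re
from typing import Dict

# ASSUMPTION: each stage starts a line with one of these labels followed by a colon.
_LABELS = "Perception|Verification|Action"
SECTION_PATTERN = re.compile(
    rf"^({_LABELS})\s*:\s*(.*?)(?=^(?:{_LABELS})\s*:|\Z)",
    re.MULTILINE | re.DOTALL,
)


def parse_executor_output(text: str) -> Dict[str, str]:
    """Split a structured CoT string into perception / verification / action."""
    sections = {
        match.group(1).lower(): match.group(2).strip()
        for match in SECTION_PATTERN.finditer(text)
    }
    missing = {"perception", "verification", "action"} - sections.keys()
    if missing:
        # Fail loudly rather than act on a partially parsed response.
        raise ValueError(f"missing sections: {sorted(missing)}")
    return sections


# Hypothetical example response, written by hand for this sketch.
sample = (
    "Perception: A closed door is ahead and a button is on the left wall.\n"
    "Verification: The button has not been pressed yet, so the door is still locked.\n"
    "Action: turn_left\n"
)
parsed = parse_executor_output(sample)
```

Raising on a missing section, rather than defaulting to some action, keeps a malformed response from being executed silently; a caller could catch the `ValueError` and re-query the model.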
Performance
Cookie Domain Success Rate: ~100%.
2-Keys Domain Success Rate: ~98%.
Note: In complex memory tasks, the "Thinking" backbone maintains trajectory consistency better than the standard Instruct backbone.
Model tree for Wuduandaun/curr_cookie_8T
Base model
Qwen/Qwen3-VL-8B-Thinking