Model Card for Asymmetric-Executor-Nav-Thinking

Model Summary

This model is a fine-tuned version of Qwen3-VL-8B-Thinking (a reasoning-optimized variant). It serves as the Low-Level Executor for the Cookie and 2-Keys navigation domains. Compared to the Instruct version, this model is optimized to handle more complex logical dependencies in spatial reasoning and memory retention required for backtracking tasks.

Model Details

Task: Complex Visual Navigation (Cookie & 2-Keys Domains)

Methodology: Event-triggered planning with a Reflection-Action Loop.

Training: 9-Step Curriculum Learning with Key Episode Rebalancing.

Intended Use

Designed for scenarios where the "Executor" requires deeper logical reasoning capability to verify state changes in partially observable environments (e.g., "Did I already press the button in the hallway?").

Input: RGB Image + Semantic Subgoal.

Output: Structured CoT (Perception -> Verification -> Action).

Performance

Cookie Domain Success Rate: ~100%.

2-Keys Domain Success Rate: ~98%.

Note: The "Thinking" backbone demonstrates superior performance in maintaining trajectory consistency compared to the standard Instruct backbone in complex memory tasks.

Downloads last month: 1

Safetensors

Model size

9B params

Tensor type

BF16

Model tree for Wuduandaun/curr_cookie_8T

Base model

Qwen/Qwen3-VL-8B-Thinking

Finetuned

(70)

this model