Model Card for Asymmetric-Executor-Nav-Thinking

Model Summary

This model is a fine-tuned version of Qwen3-VL-8B-Thinking, a reasoning-optimized variant. It serves as the Low-Level Executor for the Cookie and 2-Keys navigation domains. Compared to the Instruct-based version, it is tuned to handle the more complex logical dependencies in spatial reasoning, and the memory retention, that backtracking tasks require.

Model Details

Task: Complex Visual Navigation (Cookie & 2-Keys Domains)

Methodology: Event-triggered planning with a Reflection-Action Loop.
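
The control flow behind event-triggered planning can be sketched roughly as follows. This is a minimal illustration, not the training or inference code of this model: `StepResult`, `run_episode`, `plan_fn`, and `execute_fn` are hypothetical names, and the two trigger events (subgoal verified as done, or executor stuck) are assumptions about what counts as an "event". The idea shown is that the planner is consulted only when an event fires, while the executor otherwise keeps acting on the current subgoal.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    """Outcome of one executor step (hypothetical structure)."""
    action: str          # low-level action chosen by the executor
    subgoal_done: bool   # reflection says the current subgoal is achieved
    stuck: bool          # reflection says no progress is possible


def run_episode(
    plan_fn: Callable[[Optional[StepResult]], Optional[str]],
    execute_fn: Callable[[str], StepResult],
    max_steps: int = 50,
) -> Tuple[List[str], int]:
    """Run one episode, calling the planner only when an event fires."""
    actions: List[str] = []
    subgoal = plan_fn(None)  # initial plan, no triggering event yet
    planner_calls = 1
    while subgoal is not None and len(actions) < max_steps:
        result = execute_fn(subgoal)  # one reflection-action step
        actions.append(result.action)
        # Event trigger: replan only on completion or failure.
        if result.subgoal_done or result.stuck:
            subgoal = plan_fn(result)
            planner_calls += 1
    return actions, planner_calls
```

In this sketch the planner returns `None` to signal that no subgoals remain, so the number of planner calls grows with the number of events rather than with the number of steps.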

Training: 9-Step Curriculum Learning with Key Episode Rebalancing.

Intended Use

This model is designed for scenarios where the Executor needs deeper logical reasoning to verify state changes in partially observable environments (e.g., "Did I already press the button in the hallway?").

Input: RGB Image + Semantic Subgoal.

Output: Structured CoT (Perception -> Verification -> Action).
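
Downstream code typically needs to split the structured output into its three stages before acting on it. The sketch below assumes each stage appears on its own line with a `Perception:`, `Verification:`, or `Action:` label. That layout is an assumption for illustration, since this card does not specify the exact output format, and `parse_cot` is a hypothetical helper.

```python
import re
from typing import Dict

# Assumed section labels; the real output format may differ.
SECTION_RE = re.compile(r"^(Perception|Verification|Action)\s*:\s*", re.MULTILINE)
REQUIRED = {"perception", "verification", "action"}


def parse_cot(text: str) -> Dict[str, str]:
    """Split a labeled Perception/Verification/Action response into fields."""
    # With one capturing group, re.split yields:
    # [preamble, label1, body1, label2, body2, ...]
    parts = SECTION_RE.split(text)
    fields: Dict[str, str] = {}
    for label, body in zip(parts[1::2], parts[2::2]):
        fields[label.lower()] = body.strip()
    missing = REQUIRED - fields.keys()
    if missing:
        raise ValueError(f"missing sections: {sorted(missing)}")
    return fields
```

Raising on a missing section, rather than returning a partial result, keeps a malformed response from being executed as an action.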

Performance

Cookie Domain Success Rate: ~100%.

2-Keys Domain Success Rate: ~98%.

Note: Compared to the standard Instruct backbone, the "Thinking" backbone maintains trajectory consistency better in complex memory tasks.

Model size: 9B params (Safetensors, BF16)

Repository: Wuduandaun/curr_cookie_8T