---
license: apache-2.0
pipeline_tag: robotics
library_name: transformers
base_model: Qwen/Qwen2.5-VL-7B-Instruct
datasets:
- Chrono666/Nav-R2-OVON-CoT-Dataset
tags:
- robotics
- navigation
- object-goal-navigation
- vision-language-model
- qwen
---

# Nav-$R^2$: Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation

This repository contains the official implementation of the paper [Nav-$R^2$: Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation](https://huggingface.co/papers/2512.02400).

Object-goal navigation in open-vocabulary settings requires agents to locate novel objects in unseen environments. Nav-$R^2$ proposes a framework that explicitly models target-environment and environment-action relationships through structured Chain-of-Thought (CoT) reasoning and a Similarity-Aware Memory. This approach achieves state-of-the-art performance in efficiently localizing unseen objects while maintaining real-time inference.

For details on the code, installation, training, and evaluation, please refer to the [GitHub repository](https://github.com/AMAP-EAI/Nav-R2).

## Overview

### Pipeline and Structure

### Results on OVON

The results below are on the OVON dataset. Nav-R2 is trained via **ONLY SFT**, receives **ONLY RGB observations** from **ONLY the first-person view**, and achieves the best SR on the val-unseen split.
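Since Nav-R2 is built on Qwen2.5-VL-7B-Instruct and consumes only first-person RGB frames, a navigation query can be packed into the standard Qwen2.5-VL chat-message format. The sketch below is illustrative only: the prompt wording, helper name, and image path are assumptions, not taken from the official repository.

```python
# Illustrative sketch (not the official Nav-R2 code): pack one first-person
# RGB observation and an open-vocabulary goal into the Qwen2.5-VL
# chat-message format expected by the base model's processor.

def build_nav_messages(goal: str, rgb_frame_path: str) -> list[dict]:
    """Return a single-turn chat message with one image and a text query.

    The prompt text is a hypothetical stand-in for Nav-R2's actual
    CoT navigation prompt.
    """
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": rgb_frame_path},
                {
                    "type": "text",
                    "text": (
                        f"You are a navigation agent. Find the {goal}. "
                        "Reason step by step about the target-environment and "
                        "environment-action relations, then output an action."
                    ),
                },
            ],
        }
    ]

messages = build_nav_messages("red fire extinguisher", "obs/frame_000.png")
print(messages[0]["role"])                # user
print(messages[0]["content"][0]["type"])  # image
```

The resulting `messages` list would typically be fed to `AutoProcessor.apply_chat_template(...)` from `transformers`; the actual inference pipeline in the Nav-R2 GitHub repository may differ.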

## Citation

If you find our work helpful or inspiring, please feel free to cite it.

```bibtex
@article{zhou2025navr2,
  title={Nav-R2: Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation},
  author={Authors names and affiliations will be added after review},
  journal={arXiv preprint arXiv:2512.02400},
  year={2025}
}
```