---
license: apache-2.0
pipeline_tag: robotics
library_name: transformers
base_model: Qwen/Qwen2.5-VL-7B-Instruct
datasets:
  - Chrono666/Nav-R2-OVON-CoT-Dataset
tags:
  - robotics
  - navigation
  - object-goal-navigation
  - vision-language-model
  - qwen
---

# Nav-$R^2$: Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation

This repository contains the official implementation of the paper [Nav-$R^2$: Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation](https://arxiv.org/abs/2512.02400).

Object-goal navigation in open-vocabulary settings requires agents to locate novel objects in unseen environments. The paper proposes Nav-$R^2$, a framework that explicitly models target-environment and environment-action relationships through structured Chain-of-Thought (CoT) reasoning and a Similarity-Aware Memory. This design achieves state-of-the-art performance in efficiently localizing unseen objects while maintaining real-time inference.

For more details on the code, installation, training, and evaluation, please refer to the GitHub repository.
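
Since Nav-$R^2$ builds on Qwen/Qwen2.5-VL-7B-Instruct and is tagged for the `transformers` library, a minimal inference sketch is given below. It assumes the released checkpoint keeps the standard Qwen2.5-VL chat interface; the repository ID (`Chrono666/Nav-R2`), the observation image path, and the navigation prompt are illustrative placeholders, and the official prompt format and action decoding are defined in the GitHub repository.

```python
# Minimal inference sketch for a Qwen2.5-VL-based navigation policy.
# Assumptions: the checkpoint keeps the standard Qwen2.5-VL chat interface;
# "Chrono666/Nav-R2" is a placeholder repo ID and the prompt is illustrative.
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "Chrono666/Nav-R2"  # placeholder; replace with the actual checkpoint
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# One first-person RGB observation plus an open-vocabulary goal.
observation = Image.open("observation.png")
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": observation},
        {"type": "text", "text": "Goal: potted plant. Reason about the scene "
                                 "and output the next navigation action."},
    ],
}]

prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(text=[prompt], images=[observation], return_tensors="pt").to(model.device)

# The generated text is expected to contain the structured CoT reasoning
# followed by the chosen action.
output_ids = model.generate(**inputs, max_new_tokens=512)
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```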

## Overview

### Pipeline and Structure

## Results on OVON

On the OVON benchmark, Nav-$R^2$ is trained via **SFT only**, receiving **only RGB** observations from **only the first-person view**, and achieves the best SR on the val-unseen split.

## Citation

If you find our work helpful or inspiring, please feel free to cite it:

```bibtex
@article{zhou2025navr2,
  title={Nav-R2: Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation},
  author={Authors names and affiliations will be added after review},
  journal={arXiv preprint arXiv:2512.02400},
  year={2025}
}
```