---
license: cc-by-nc-sa-4.0
library_name: transformers
pipeline_tag: robotics
---

# VL-LN-Bench Base Model

This repository contains the base model for the paper [VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs](https://huggingface.co/papers/2512.22342).

![License](https://img.shields.io/badge/License-CC_BY--NC--SA_4.0-lightgrey.svg)
![Transformers](https://img.shields.io/badge/%F0%9F%A4%97%20Transformers-9cf?style=flat)
![PyTorch](https://img.shields.io/badge/PyTorch-EE4C2C?logo=pytorch&logoColor=white)

## Model Description

VL-LN Bench is the first benchmark for **Interactive Instance Goal Navigation (IIGN)**, where an embodied agent must locate a specific object instance in a realistic 3D home while engaging in **free-form natural-language dialogue**. It also provides an **automated data-collection pipeline** that generates large-scale training data for learning interactive navigation behaviors. Using this dataset, we train an **IIGN base model** that shares the same architecture as **InternVLA-N1**.

The resulting model demonstrates baseline competence on IIGN: it can search for a specific instance in **previously unseen** environments. During exploration, the agent can either **move** by predicting a pixel-goal waypoint or **ask** a question to reduce ambiguity and improve task success and efficiency.
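
The move/ask loop described above can be sketched as a minimal control flow. This is an illustrative stand-in only: the `dummy_policy` function, the observation format, and the `Action` fields are all hypothetical names invented for this sketch, not the model's real API (see the VL-LN-Bench repository for actual inference code).

```python
# Minimal sketch of the IIGN move/ask decision loop described above.
# All names here (Action, dummy_policy, observation keys) are hypothetical;
# the real policy and interfaces live in the VL-LN-Bench repository.
from dataclasses import dataclass
from typing import Optional, Tuple, List


@dataclass
class Action:
    kind: str                                    # "move" or "ask"
    waypoint: Optional[Tuple[int, int]] = None   # pixel-goal (u, v) when moving
    question: Optional[str] = None               # free-form question when asking


def dummy_policy(observation: dict, step: int) -> Action:
    """Stand-in for the learned policy: ask early to resolve ambiguity,
    then move toward a predicted pixel-goal waypoint."""
    if step == 0 and observation.get("ambiguous", False):
        return Action(kind="ask", question="Which room is the target instance in?")
    return Action(kind="move", waypoint=(128, 96))


def run_episode(observation: dict, max_steps: int = 5) -> List[Action]:
    """Roll out the move/ask loop until the step budget is exhausted."""
    trajectory = []
    for step in range(max_steps):
        action = dummy_policy(observation, step)
        trajectory.append(action)
        if action.kind == "ask":
            # In the benchmark, the dialogue partner's answer would
            # update the agent's state; here we just clear the ambiguity flag.
            observation = {**observation, "ambiguous": False}
    return trajectory
```

The point of the sketch is the two-branch action space: at every step the agent either commits to a pixel-goal waypoint or spends the step on a clarifying question whose answer feeds back into its state.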

### Resources

[![Code](https://img.shields.io/badge/GitHub-VL--LN--Bench-181717?logo=github)](https://github.com/InternRobotics/InternNav)
[![VL-LN Paper — arXiv](https://img.shields.io/badge/arXiv-VL--LN--Bench-B31B1B?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2512.22342)
[![Project Page — VL-LN-Bench](https://img.shields.io/badge/Project_Page-VL--LN--Bench-4285F4?logo=google-chrome&logoColor=white)](https://0309hws.github.io/VL-LN.github.io/)
[![Dataset](https://img.shields.io/badge/Dataset-VL--LN--Bench-FF6F00?logo=huggingface&logoColor=white)](https://huggingface.co/datasets/InternRobotics/VL-LN-Bench)

## Usage

For inference and evaluation, please refer to the [VL-LN-Bench repository](https://github.com/InternRobotics/VL-LN).

## Citation

If you find our work helpful, please cite:

```bibtex
@misc{huang2025vllnbenchlonghorizongoaloriented,
      title={VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs}, 
      author={Wensi Huang and Shaohao Zhu and Meng Wei and Jinming Xu and Xihui Liu and Hanqing Wang and Tai Wang and Feng Zhao and Jiangmiao Pang},
      year={2025},
      eprint={2512.22342},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2512.22342}, 
}
```