---
pipeline_tag: robotics
library_name: transformers
license: mit
---

This repository contains models for the **VLN-PE Benchmark**, presented in the paper [Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities](https://huggingface.co/papers/2507.13019). VLN-PE is a physically realistic Vision-and-Language Navigation platform that supports humanoid, quadruped, and wheeled robots, and systematically evaluates several ego-centric VLN methods in physical robotic settings. For more details, visit the [project page](https://crystalsixone.github.io/vln_pe.github.io/) or the main [GitHub repository](https://github.com/InternRobotics/InternNav).

## VLN-PE Benchmark
<table>
  <thead>
    <tr>
      <th rowspan="2">Model</th>
      <th rowspan="2">Dataset/Benchmark</th>
      <th colspan="7">Val Seen</th>
      <th colspan="7">Val Unseen</th>
      <th rowspan="2">Download</th>
    </tr>
    <tr>
      <th>TL</th><th>NE</th><th>FR</th><th>StR</th><th>OS</th><th>SR</th><th>SPL</th>
      <th>TL</th><th>NE</th><th>FR</th><th>StR</th><th>OS</th><th>SR</th><th>SPL</th>
    </tr>
  </thead>
  <tbody>
    <tr><td colspan="17"><em>Zero-shot transfer evaluation from VLN-CE</em></td></tr>
    <tr><td>Seq2Seq-Full</td><td>R2R VLN-PE</td><td>7.80</td><td>7.62</td><td>20.21</td><td>3.04</td><td>19.3</td><td>15.2</td><td>12.79</td><td>7.73</td><td>7.18</td><td>18.04</td><td>3.04</td><td>22.42</td><td>16.48</td><td>14.11</td><td>model</td></tr>
    <tr><td>CMA-Full</td><td>R2R VLN-PE</td><td>6.62</td><td>7.37</td><td>20.06</td><td>3.95</td><td>18.54</td><td>16.11</td><td>14.61</td><td>6.58</td><td>7.09</td><td>17.07</td><td>3.79</td><td>20.86</td><td>16.93</td><td>15.24</td><td>model</td></tr>
    <tr><td colspan="17"><em>Train on VLN-PE</em></td></tr>
    <tr><td>Seq2Seq</td><td>R2R VLN-PE</td><td>10.61</td><td>7.53</td><td>27.36</td><td>4.26</td><td>32.67</td><td>19.75</td><td>14.68</td><td>10.85</td><td>7.88</td><td>26.8</td><td>5.57</td><td>28.13</td><td>15.14</td><td>10.77</td><td>model</td></tr>
    <tr><td>CMA</td><td>R2R VLN-PE</td><td>11.13</td><td>7.59</td><td>23.71</td><td>3.19</td><td>34.94</td><td>21.58</td><td>16.1</td><td>11.16</td><td>7.98</td><td>22.64</td><td>3.27</td><td>33.11</td><td>19.15</td><td>14.05</td><td>model</td></tr>
    <tr><td>RDP</td><td>R2R VLN-PE</td><td>13.26</td><td>6.76</td><td>27.51</td><td>1.82</td><td>38.6</td><td>25.08</td><td>17.07</td><td>12.7</td><td>6.72</td><td>24.57</td><td>3.11</td><td>36.9</td><td>25.24</td><td>17.73</td><td>model</td></tr>
    <tr><td>Seq2Seq+</td><td>R2R VLN-PE</td><td>10.22</td><td>7.75</td><td>33.43</td><td>3.19</td><td>30.09</td><td>16.86</td><td>12.54</td><td>9.88</td><td>7.85</td><td>26.27</td><td>6.52</td><td>28.79</td><td>16.56</td><td>12.7</td><td>model</td></tr>
    <tr><td>CMA+</td><td>R2R VLN-PE</td><td>8.86</td><td>7.14</td><td>23.56</td><td>3.5</td><td>36.17</td><td>25.84</td><td>21.75</td><td>8.79</td><td>7.26</td><td>21.75</td><td>3.27</td><td>31.4</td><td>22.12</td><td>18.65</td><td>model</td></tr>
  </tbody>
</table>
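Most column headers above are the standard VLN metrics (TL: trajectory length, NE: navigation error, OS: oracle success, SR: success rate, SPL: success weighted by path length); FR and StR are the physical-robot metrics added by VLN-PE (fall rate and stuck rate). As a reference for how SPL is computed from per-episode results, here is a minimal illustrative sketch of the standard SPL definition (Anderson et al., 2018) — it is not code from this repository:

```python
def spl(episodes):
    """Success weighted by Path Length (SPL).

    episodes: list of (success, shortest_path_len, agent_path_len) tuples,
    where lengths are geodesic/traveled distances for one evaluation episode.
    Returns the mean of success * shortest / max(shortest, taken).
    """
    if not episodes:
        return 0.0
    total = 0.0
    for success, shortest, taken in episodes:
        if success:
            # Successful episodes are weighted by path efficiency;
            # failed episodes contribute zero.
            total += shortest / max(shortest, taken)
    return total / len(episodes)
```

An agent that succeeds along the shortest path scores 1.0 for that episode; longer successful paths are penalized proportionally.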
## Citation

If you find our work helpful, please cite:

```bibtex
@inproceedings{vlnpe,
  title={Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities},
  author={Wang, Liuyi and Xia, Xinyuan and Zhao, Hui and Wang, Hanqing and Wang, Tai and Chen, Yilun and Liu, Chengju and Chen, Qijun and Pang, Jiangmiao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}

@misc{internnav2025,
  title = {{InternNav: InternRobotics'} open platform for building generalized navigation foundation models},
  author = {InternNav Contributors},
  howpublished = {\url{https://github.com/InternRobotics/InternNav}},
  year = {2025}
}
```