---
license: cc-by-nc-4.0
---
Pretrained Weights of [NaVid](https://pku-epic.github.io/NaVid/): Video-based VLM Plans the Next Step for Vision-and-Language Navigation (RSS 2024)
The model is trained on samples collected from the training splits of [VLN-CE](https://github.com/jacobkrantz/VLN-CE) R2R and RxR.
Metrics: TL = trajectory length (m), NE = navigation error (m), OS = oracle success rate (%), SR = success rate (%), SPL = success weighted by path length (%).

| Evaluation Benchmark | TL | NE | OS | SR | SPL |
|----------------------|:----:|:----:|:----:|:----:|:----:|
| VLN-CE R2R Val. | 10.7 | 5.65 | 49.2 | 41.9 | 36.5 |
| [VLN-CE R2R Test](https://eval.ai/web/challenges/challenge-page/719/leaderboard/1966) | 11.3 | 5.39 | 52 | 45 | 39 |
| VLN-CE RxR Val. | 15.4 | 5.72 | 55.6 | 45.7 | 38.2 |
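Of these metrics, SPL is the least self-explanatory: it discounts each successful episode by how much longer the agent's path was than the shortest path. As a minimal sketch (the standard definition from the VLN literature, not code from the NaVid repository):

```python
# Success weighted by Path Length (SPL): mean over episodes of
# S_i * l_i / max(p_i, l_i), where S_i is success (0/1), l_i the
# shortest-path length, and p_i the agent's actual path length.
# This is a generic reference implementation, not NaVid's evaluation code.
def spl(successes, shortest_lengths, path_lengths):
    assert len(successes) == len(shortest_lengths) == len(path_lengths)
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        total += s * l / max(p, l)  # penalize paths longer than optimal
    return total / len(successes)

# Example: two episodes; the first succeeds with a slightly longer-than-optimal
# path, the second fails and contributes 0 regardless of its length.
print(spl([1, 0], [10.0, 8.0], [12.5, 20.0]))  # 0.4
```

An agent that succeeds on every episode by following the shortest path scores SPL = SR = 100; wandering before reaching the goal lowers SPL but not SR.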
The related inference code is available [here](https://github.com/jzhzhang/NaVid-VLN-CE).