---
license: cc-by-nc-4.0
---
|
|
Pretrained weights of [NaVid](https://pku-epic.github.io/NaVid/): Video-based VLM Plans the Next Step for Vision-and-Language Navigation (RSS 2024).
|
|
|
|
|
The model is trained on samples collected from the training splits of [VLN-CE](https://github.com/jacobkrantz/VLN-CE) R2R and RxR.
|
|
|
|
|
| Evaluation Benchmark | TL | NE | OS | SR | SPL |
|----------------------|:----:|:----:|:----:|:----:|:----:|
| VLN-CE R2R Val. | 10.7 | 5.65 | 49.2 | 41.9 | 36.5 |
| [VLN-CE R2R Test](https://eval.ai/web/challenges/challenge-page/719/leaderboard/1966) | 11.3 | 5.39 | 52 | 45 | 39 |
| VLN-CE RxR Val. | 15.4 | 5.72 | 55.6 | 45.7 | 38.2 |

TL: trajectory length (m); NE: navigation error (m); OS: oracle success rate (%); SR: success rate (%); SPL: success weighted by path length (%).
|
|
|
|
|
The related inference code can be found [here](https://github.com/jzhzhang/NaVid-VLN-CE).
|
|
|